Best AI tools for< Detect Jailbreaks >
20 - AI tool Sites
Prompt Security
Prompt Security is a platform that secures all uses of Generative AI in the organization: from tools used by your employees to your customer-facing apps.
ZeroGPT
ZeroGPT is a trusted AI detector tool that specializes in detecting AI-generated content like ChatGPT, GPT4, and Gemini. It offers advanced features such as AI summarization, paraphrasing, grammar and spell checking, translation, word counting, and citation generation. The tool is designed to provide highly accurate results and supports multiple languages. ZeroGPT stands out for its highlighted sentences feature, batch file upload capability, high accuracy model, and automatically generated reports. It utilizes DeepAnalyse™ Technology, a multi-stage methodology that optimizes accuracy while minimizing false positives and negatives. Users can unlock premium features and API access to enhance their writing skills and integrate the tool on a large scale.
AI or Not
AI or Not is an AI-powered tool that helps businesses and individuals detect AI-generated images and audio. It uses advanced machine learning algorithms to analyze content and determine the likelihood of AI manipulation. With AI or Not, users can protect themselves from fraud, misinformation, and other malicious activities involving AI-generated content.
GPT-2 Output Detector
The GPT-2 Output Detector is an online tool that helps users identify whether a given text was generated by the GPT-2 language model. The tool is based on the RoBERTa implementation of Transformers, a popular natural language processing library. Users can enter text into the text box, and the tool will predict the probability that the text was generated by GPT-2. The results start to get reliable after around 50 tokens.
GRAIL
GRAIL is a healthcare company innovating to solve medicine’s most important challenges. Our team of leading scientists, engineers and clinicians are on an urgent mission to detect cancer early, when it is more treatable and potentially curable. GRAIL's Galleri® test is a first-of-its-kind multi-cancer early detection (MCED) test that can detect a signal shared by more than 50 cancer types and predict the tissue type or organ associated with the signal to help healthcare providers determine next steps.
AIDetect
AIDetect is a powerful AI content detector tool that allows users to identify AI-generated writing within any text. It offers cutting-edge features and high accuracy, comparable to Turnitin, to help users verify the authenticity of content. With advanced technology, AIDetect ensures that users can distinguish between human and AI-generated content effortlessly.
AI Detector
AI Detector is an online tool that uses advanced algorithms and machine learning to check if your written text is generated by AI or a human writer. It analyzes the writing style, sentence structure, and other linguistic patterns to determine the likelihood of AI authorship. The tool provides a percentage score indicating the probability of AI-generated content, helping users identify potential plagiarism or AI-assisted writing.
AIDP
AIDP is a comprehensive platform that helps you find and remove the fingerprints of AI in documents. It includes automatic and manual tools for revising content that was written by ChatGPT and other AI models. With AIDP, you can: * Detect and wipe the traces of AI instantly. * See what triggers AI detection. * Get suggestions for wording changes and rewrites. * Make AI sound human. * Get a tone analysis to determine how your document sounds. * Find and wipe AI from any document.
BladeRunner
BladeRunner is a browser plug-in that highlights AI-generated text directly on the page. It helps users detect AI-generated content in various contexts such as social media, news, education, e-commerce, and government communications. In the age of AI, where distinguishing between human and AI-generated content is challenging, BladeRunner aims to assist users in identifying AI impostors and maintaining a higher standard of accuracy in digital interactions.
AI Checker
AI Checker is a free tool and plagiarism detector that accurately identifies if a text is generated by AI tools like GPT-3, GPT-4, Gemini, OpenAI, and others. It helps users protect their content by detecting AI-generated text and human-written content. The tool uses advanced algorithms to provide accurate results and percentage analysis of AI-generated content within a text. AI Checker is beneficial for writers, students, educators, content marketers, freelancers, editors, publishers, researchers, and content consumers across different languages and contexts.
GPTKit
GPTKit is a free AI text generation detection tool that utilizes six different AI-based content detection techniques to identify and classify text as either human- or AI-generated. It provides reports on the authenticity and reality of the analyzed content, with an accuracy of approximately 93%. The first 2048 characters in every request are free, and users can register for free to get 2048 characters/request.
Decopy AI Content Detector
Decopy AI Content Detector is an AI tool designed to help users determine if a given text was written by a human or generated by AI. It accurately identifies AI-generated, paraphrased, and human-written content. The tool offers features such as AI content highlighting, superior detection accuracy, user-friendly interface, free AI detection, instant access without sign-up, and guaranteed privacy. Users can utilize the AI Detector for tasks like academic integrity checks, content creation, journalism verification, publishing standards maintenance, SEO content uniqueness, social media reliability checks, legal document originality verification, and corporate training material quality assurance.
AI Scam Detective
AI Scam Detective is an AI tool designed to help users detect and prevent online scams. Users can paste messages or conversations into the tool, which then provides a score from 1-10 on the likelihood of it being a scam. Created by Sam Meehan, this tool aims to empower users to identify and avoid potential scams in their online interactions.
Unholy.ai
Unholy.ai is an AI tool designed to detect any 'unholiness' in the music you listen to. It uses advanced algorithms to analyze audio tracks and identify any elements that may be considered inappropriate or offensive. The tool is currently under construction and being developed by Saud with the help of GPT Bros. Unholy.ai aims to provide users with a unique way to ensure the music they listen to aligns with their preferences and values.
TrueBees
TrueBees is a deepfakes detector application designed to verify the trustworthiness of images shared on social media. It utilizes AI technology to detect AI-generated portraits and prevent their dissemination online. The tool is specifically tailored for media professionals and law firms to ensure the authenticity of images used in news articles, legal cases, and social media posts. TrueBees aims to combat the spread of deepfakes and disinformation by providing a reliable solution for image verification.
AI Content Detector
The AI Content Detector is an online tool that helps users determine the similarity score of AI-generated content and whether it was written by a human or an AI tool. It utilizes advanced algorithms and natural language processing to analyze text, providing a percentage-based authenticity result. Users can input text for analysis and receive accurate results regarding the likelihood of AI authorship. The tool compares syntax, vocabulary, and semantics with AI and human models, offering high accuracy in identifying paraphrased content.
ZeroGPT
ZeroGPT is a comprehensive AI detection tool that helps users identify AI-generated content. It offers a range of features, including sentence highlighting, batch file upload, high accuracy, and support for multiple languages. ZeroGPT's DeepAnalyse™ Technology employs a multi-stage methodology to analyze text and determine its origin. The tool is designed to minimize false positives and negatives, providing users with reliable results. ZeroGPT also offers a user-friendly API for organizations, enabling them to integrate the tool into their systems.
AI Voice Detector
AI Voice Detector is an advanced tool designed to protect individuals and businesses from audio manipulation and AI voice scams. It offers a high model accuracy of 92% in identifying whether an audio is real or AI-generated. The tool can be used to detect AI voices in various platforms like YouTube, WhatsApp, TikTok, Zoom, and Google Meet. With features such as multilanguage support, background noise removal, and browser extension integration, AI Voice Detector stands out as a reliable solution for authenticating audio files.
Quetext
Quetext is a plagiarism checker and AI content detector that helps students, teachers, and professionals identify potential plagiarism and AI in their work. With its deep search technology, contextual analysis, and smart algorithms, Quetext makes checking writing easier and more accurate. Quetext also offers a variety of features such as bulk uploads, source exclusion, enhanced citation generator, grammar & spell check, and Deep Search. With its rich and intuitive feedback, Quetext helps users find plagiarism and AI with less stress.
Deepfake Detector
Deepfake Detector is an AI tool designed to identify deepfake audio and video content with 92% model accuracy. It helps individuals and businesses protect themselves from deepfake scams by analyzing voice messages and calls for authenticity. The tool offers probabilities as a guide for further investigation, ensuring credibility in media reporting and legal proceedings. With features like AI Noise Remover and easy API integration, Deepfake Detector is a market leader in detecting deepfakes and preventing financial losses.
20 - Open Source AI Tools
uptrain
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured evaluations (covering language, code, embedding use cases), perform root cause analysis on failure cases and give insights on how to resolve them.
NeMo-Guardrails
NeMo Guardrails is an open-source toolkit for easily adding _programmable guardrails_ to LLM-based conversational applications. Guardrails (or "rails" for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more.
PurpleLlama
Purple Llama is an umbrella project that aims to provide tools and evaluations to support responsible development and usage of generative AI models. It encompasses components for cybersecurity and input/output safeguards, with plans to expand in the future. The project emphasizes a collaborative approach, borrowing the concept of purple teaming from cybersecurity, to address potential risks and challenges posed by generative AI. Components within Purple Llama are licensed permissively to foster community collaboration and standardize the development of trust and safety tools for generative AI.
awesome-gpt-security
Awesome GPT + Security is a curated list of awesome security tools, experimental case or other interesting things with LLM or GPT. It includes tools for integrated security, auditing, reconnaissance, offensive security, detecting security issues, preventing security breaches, social engineering, reverse engineering, investigating security incidents, fixing security vulnerabilities, assessing security posture, and more. The list also includes experimental cases, academic research, blogs, and fun projects related to GPT security. Additionally, it provides resources on GPT security standards, bypassing security policies, bug bounty programs, cracking GPT APIs, and plugin security.
chatgpt-universe
ChatGPT is a large language model that can generate human-like text, translate languages, write different kinds of creative content, and answer your questions in a conversational way. It is trained on a massive amount of text data, and it is able to understand and respond to a wide range of natural language prompts. Here are 5 jobs suitable for this tool, in lowercase letters: 1. content writer 2. chatbot assistant 3. language translator 4. creative writer 5. researcher
awesome-generative-ai
A curated list of Generative AI projects, tools, artworks, and models
last_layer
last_layer is a security library designed to protect LLM applications from prompt injection attacks, jailbreaks, and exploits. It acts as a robust filtering layer to scrutinize prompts before they are processed by LLMs, ensuring that only safe and appropriate content is allowed through. The tool offers ultra-fast scanning with low latency, privacy-focused operation without tracking or network calls, compatibility with serverless platforms, advanced threat detection mechanisms, and regular updates to adapt to evolving security challenges. It significantly reduces the risk of prompt-based attacks and exploits but cannot guarantee complete protection against all possible threats.
fast-llm-security-guardrails
ZenGuard AI enables AI developers to integrate production-level, low-code LLM (Large Language Model) guardrails into their generative AI applications effortlessly. With ZenGuard AI, ensure your application operates within trusted boundaries, is protected from prompt injections, and maintains user privacy without compromising on performance.
awesome-llm-security
Awesome LLM Security is a curated collection of tools, documents, and projects related to Large Language Model (LLM) security. It covers various aspects of LLM security including white-box, black-box, and backdoor attacks, defense mechanisms, platform security, and surveys. The repository provides resources for researchers and practitioners interested in understanding and safeguarding LLMs against adversarial attacks. It also includes a list of tools specifically designed for testing and enhancing LLM security.
OpenRedTeaming
OpenRedTeaming is a repository focused on red teaming for generative models, specifically large language models (LLMs). The repository provides a comprehensive survey on potential attacks on GenAI and robust safeguards. It covers attack strategies, evaluation metrics, benchmarks, and defensive approaches. The repository also implements over 30 auto red teaming methods. It includes surveys, taxonomies, attack strategies, and risks related to LLMs. The goal is to understand vulnerabilities and develop defenses against adversarial attacks on large language models.
call-center-ai
Call Center AI is an AI-powered call center solution leveraging Azure and OpenAI GPT. It allows for AI agent-initiated phone calls or direct calls to the bot from a configured phone number. The bot is customizable for various industries like insurance, IT support, and customer service, with features such as accessing claim information, conversation history, language change, SMS sending, and more. The project is a proof of concept showcasing the integration of Azure Communication Services, Azure Cognitive Services, and Azure OpenAI for an automated call center solution.
CJA_Comprehensive_Jailbreak_Assessment
This public repository contains the paper 'Comprehensive Assessment of Jailbreak Attacks Against LLMs'. It provides a labeling method to label results using Python and offers the opportunity to submit evaluation results to the leaderboard. Full codes will be released after the paper is accepted.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
rlhf_trojan_competition
This competition is organized by Javier Rando and Florian Tramèr from the ETH AI Center and SPY Lab at ETH Zurich. The goal of the competition is to create a method that can detect universal backdoors in aligned language models. A universal backdoor is a secret suffix that, when appended to any prompt, enables the model to answer harmful instructions. The competition provides a set of poisoned generation models, a reward model that measures how safe a completion is, and a dataset with prompts to run experiments. Participants are encouraged to use novel methods for red-teaming, automated approaches with low human oversight, and interpretability tools to find the trojans. The best submissions will be offered the chance to present their work at an event during the SaTML 2024 conference and may be invited to co-author a publication summarizing the competition results.
ShieldLM
ShieldLM is a bilingual safety detector designed to detect safety issues in LLMs' generations. It aligns with human safety standards, supports customizable detection rules, and provides explanations for decisions. Outperforming strong baselines, ShieldLM is impressive across 4 test sets.
FigStep
FigStep is a black-box jailbreaking algorithm against large vision-language models (VLMs). It feeds harmful instructions through the image channel and uses benign text prompts to induce VLMs to output contents that violate common AI safety policies. The tool highlights the vulnerability of VLMs to jailbreaking attacks, emphasizing the need for safety alignments between visual and textual modalities.
llm-misinformation-survey
The 'llm-misinformation-survey' repository is dedicated to the survey on combating misinformation in the age of Large Language Models (LLMs). It explores the opportunities and challenges of utilizing LLMs to combat misinformation, providing insights into the history of combating misinformation, current efforts, and future outlook. The repository serves as a resource hub for the initiative 'LLMs Meet Misinformation' and welcomes contributions of relevant research papers and resources. The goal is to facilitate interdisciplinary efforts in combating LLM-generated misinformation and promoting the responsible use of LLMs in fighting misinformation.
20 - OpenAI Gpts
FallacyGPT
Detect logical fallacies and lapses in critical thinking to help avoid misinformation in the style of Socrates
AI Detector
AI Detector GPT is powered by Winston AI and created to help identify AI generated content. It is designed to help you detect use of AI Writing Chatbots such as ChatGPT, Claude and Bard and maintain integrity in academia and publishing. Winston AI is the most trusted AI content detector.
Plagiarism Checker
Plagiarism Checker GPT is powered by Winston AI and created to help identify plagiarized content. It is designed to help you detect instances of plagiarism and maintain integrity in academia and publishing. Winston AI is the most trusted AI and Plagiarism Checker.
BS Meter Realtime
Detects and measures information credibility. Provides a "BS Score" (0-100) based on content analysis for misinformation signs, including factual inaccuracies and sensationalist language. Real-time feedback.
Wowza Bias Detective
I analyze cognitive biases in scenarios and thoughts, providing neutral, educational insights.
Defender for Endpoint Guardian
To assist individuals seeking to learn about or work with Microsoft's Defender for Endpoint. I provide detailed explanations, step-by-step guides, troubleshooting advice, cybersecurity best practices, and demonstrations, all specifically tailored to Microsoft Defender for Endpoint.
Prompt Injection Detector
GPT used to classify prompts as valid inputs or injection attempts. Json output.
Blue Team Guide
it is a meticulously crafted arsenal of knowledge, insights, and guidelines that is shaped to empower organizations in crafting, enhancing, and refining their cybersecurity defenses
PBN Detector
A tool to help you decide if a website is part of a PBN or link network, created solely for link building. >> Get in touch with Gareth if you need a Freelance SEO for link building <<
ethicallyHackingspace (eHs)® METEOR™ STORM™
Multiple Environment Threat Evaluation of Resources (METEOR)™ Space Threats and Operational Risks to Mission (STORM)™ non-profit product AI co-pilot
Mónica
CSIRT que lidera un equipo especializado en detectar y responder a incidentes de seguridad, maneja la contención y recuperación, organiza entrenamientos y simulacros, elabora reportes para optimizar estrategias de seguridad y coordina con entidades legales cuando es necesario