Best AI tools for <Detect Jailbreak>
20 - AI tool Sites

Prompt Security
Prompt Security is a platform that secures all uses of Generative AI in the organization: from tools used by your employees to your customer-facing apps.

Face Shape Detect
Face Shape Detect is an AI-powered tool that allows users to analyze their unique facial structure and determine their face shape for personalized recommendations. Users can upload a photo to receive accurate face shape analysis and styling tips. The tool prioritizes privacy by securely processing images without storing them. It helps users understand their face shape for better fashion and beauty choices.

ZeroGPT
ZeroGPT is a trusted AI detector tool that specializes in detecting AI-generated content like ChatGPT, GPT4, and Gemini. It offers advanced features such as AI summarization, paraphrasing, grammar and spell checking, translation, word counting, and citation generation. The tool is designed to provide highly accurate results and supports multiple languages. ZeroGPT stands out for its highlighted sentences feature, batch file upload capability, high accuracy model, and automatically generated reports. It utilizes DeepAnalyse™ Technology, a multi-stage methodology that optimizes accuracy while minimizing false positives and negatives. Users can unlock premium features and API access to enhance their writing skills and integrate the tool on a large scale.

AI or Not
AI or Not is an AI-powered tool that helps businesses and individuals detect AI-generated images and audio. It uses advanced machine learning algorithms to analyze content and determine the likelihood of AI manipulation. With AI or Not, users can protect themselves from fraud, misinformation, and other malicious activities involving AI-generated content.

AIDP
AIDP is a comprehensive platform that helps you find and remove the fingerprints of AI in documents. It includes automatic and manual tools for revising content that was written by ChatGPT and other AI models. With AIDP, you can:
* Detect and wipe the traces of AI instantly.
* See what triggers AI detection.
* Get suggestions for wording changes and rewrites.
* Make AI sound human.
* Get a tone analysis to determine how your document sounds.
* Find and wipe AI from any document.

GPT-2 Output Detector
The GPT-2 Output Detector is an online tool that helps users identify whether a given text was generated by the GPT-2 language model. The tool is based on the RoBERTa implementation of Transformers, a popular natural language processing library. Users can enter text into the text box, and the tool will predict the probability that the text was generated by GPT-2. The results start to get reliable after around 50 tokens.

GRAIL
GRAIL is a healthcare company innovating to solve medicine’s most important challenges. Our team of leading scientists, engineers and clinicians are on an urgent mission to detect cancer early, when it is more treatable and potentially curable. GRAIL's Galleri® test is a first-of-its-kind multi-cancer early detection (MCED) test that can detect a signal shared by more than 50 cancer types and predict the tissue type or organ associated with the signal to help healthcare providers determine next steps.

AIDetect
AIDetect is a powerful AI content detector tool that allows users to identify AI-generated writing within any text. It offers cutting-edge features and high accuracy, comparable to Turnitin, to help users verify the authenticity of content. With advanced technology, AIDetect ensures that users can distinguish between human and AI-generated content effortlessly.

AI Detector
AI Detector is an online tool that uses advanced algorithms and machine learning to check if your written text is generated by AI or a human writer. It analyzes the writing style, sentence structure, and other linguistic patterns to determine the likelihood of AI authorship. The tool provides a percentage score indicating the probability of AI-generated content, helping users identify potential plagiarism or AI-assisted writing.

BladeRunner
BladeRunner is a browser plug-in that highlights AI-generated text directly on web pages. It helps users detect AI-generated content in various contexts such as social media, news, education, e-commerce, and government communications. The tool aims to assist individuals in distinguishing between human-generated and AI-generated text, especially in the age of advanced language models and increasing AI influence on digital content.

AI Checker
AI Checker is a free tool and plagiarism detector that accurately identifies if a text is generated by AI tools like GPT-3, GPT-4, Gemini, OpenAI, and others. It helps users protect their content by detecting AI-generated text and human-written content. The tool uses advanced algorithms to provide accurate results and percentage analysis of AI-generated content within a text. AI Checker is beneficial for writers, students, educators, content marketers, freelancers, editors, publishers, researchers, and content consumers across different languages and contexts.

HEALWELL AI
HEALWELL AI is a healthcare technology company focusing on preventative care through AI and data science. Their mission is to improve healthcare and save lives by early disease detection. HEALWELL provides AI tools for healthcare providers to screen and detect rare, complex, and chronic diseases. They have developed AI clinical co-pilot technologies to assist physicians in early disease detection, ultimately accelerating time to diagnosis and saving lives.

GPTKit
GPTKit is a free AI text generation detection tool that utilizes six different AI-based content detection techniques to identify and classify text as either human- or AI-generated. It provides reports on the authenticity and reality of the analyzed content, with an accuracy of approximately 93%. The first 2048 characters in every request are free, and users can register for free to get 2048 characters/request.

Ai-SPY
Ai-SPY is an advanced AI audio detection tool that helps users identify whether speech is human or AI-generated. It offers detailed reports, easy integration with API access, and expert human insights for accurate analysis. Users can upload audio files or analyze social media links to determine authenticity. Ai-SPY leverages a proprietary neural network for unparalleled audio authenticity insights, making it a valuable tool for content verification and enterprise use.

Decopy AI Content Detector
Decopy AI Content Detector is an AI tool designed to help users determine if a given text was written by a human or generated by AI. It accurately identifies AI-generated, paraphrased, and human-written content. The tool offers features such as AI content highlighting, superior detection accuracy, user-friendly interface, free AI detection, instant access without sign-up, and guaranteed privacy. Users can utilize the AI Detector for tasks like academic integrity checks, content creation, journalism verification, publishing standards maintenance, SEO content uniqueness, social media reliability checks, legal document originality verification, and corporate training material quality assurance.

AI Scam Detective
AI Scam Detective is an AI tool designed to help users detect and prevent online scams. Users can paste messages or conversations into the tool, which then provides a score from 1-10 on the likelihood of it being a scam. Created by Sam Meehan, this tool aims to empower users to identify and avoid potential scams in their online interactions.

Unholy.ai
Unholy.ai is an AI tool designed to detect any 'unholiness' in the music you listen to. It uses advanced algorithms to analyze audio tracks and identify any elements that may be considered 'unholy' based on predefined criteria. The tool aims to provide users with insights into the content of their music and help them make informed decisions about what they listen to.

TrueBees
TrueBees is an AI-powered deepfakes detector designed to identify AI-generated portraits shared on social media and prevent their dissemination across the web. It offers a quick and easy way to verify image trustworthiness, helping users combat deepfakes and disinformation. TrueBees is tailored for professionals in the media industry and law firms, enabling them to ensure the authenticity of visual content and enhance trust in their publications.

AI Content Detector
The AI Content Detector is an online tool that helps users determine the similarity score of AI-generated content and whether it was written by a human or an AI tool. It utilizes advanced algorithms and natural language processing to analyze text, providing a percentage-based authenticity result. Users can input text for analysis and receive accurate results regarding the likelihood of AI authorship. The tool compares syntax, vocabulary, and semantics with AI and human models, offering high accuracy in identifying paraphrased content.

Free AI Detector & AI Checker
Free AI Detector & AI Checker is a powerful online tool that allows users to detect AI-generated content in text. Users can simply paste their text into the tool, and it can identify content created by various AI models such as ChatGPT, GPT-4, GPT-4o, Gemini, Claude, and more. The tool is perfect for students, writers, teachers, bloggers, businesses, and freelancers to ensure the authenticity and originality of their content. It is simple, reliable, and free to use without any limitations or the need for account creation.
20 - Open Source AI Tools

uptrain
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured evaluations (covering language, code, embedding use cases), perform root cause analysis on failure cases and give insights on how to resolve them.

NeMo-Guardrails
NeMo Guardrails is an open-source toolkit for easily adding _programmable guardrails_ to LLM-based conversational applications. Guardrails (or "rails" for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more.

call-center-ai
Call Center AI is an AI-powered call center solution leveraging Azure and OpenAI GPT. It allows for AI agent-initiated phone calls or direct calls to the bot from a configured phone number. The bot is customizable for various industries like insurance, IT support, and customer service, with features such as accessing claim information, conversation history, language change, SMS sending, and more. The project is a proof of concept showcasing the integration of Azure Communication Services, Azure Cognitive Services, and Azure OpenAI for an automated call center solution.

CJA_Comprehensive_Jailbreak_Assessment
This public repository contains the paper 'Comprehensive Assessment of Jailbreak Attacks Against LLMs'. It provides a Python-based method for labeling results and offers the opportunity to submit evaluation results to the leaderboard. The full code will be released after the paper is accepted.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

rlhf_trojan_competition
This competition is organized by Javier Rando and Florian Tramèr from the ETH AI Center and SPY Lab at ETH Zurich. The goal of the competition is to create a method that can detect universal backdoors in aligned language models. A universal backdoor is a secret suffix that, when appended to any prompt, enables the model to answer harmful instructions. The competition provides a set of poisoned generation models, a reward model that measures how safe a completion is, and a dataset with prompts to run experiments. Participants are encouraged to use novel methods for red-teaming, automated approaches with low human oversight, and interpretability tools to find the trojans. The best submissions will be offered the chance to present their work at an event during the SaTML 2024 conference and may be invited to co-author a publication summarizing the competition results.
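The idea of scoring a candidate backdoor suffix can be illustrated with a minimal Python sketch. It assumes two hypothetical callables that are not part of the competition's actual code: `generate`, which stands in for a poisoned generation model, and `reward`, which stands in for the safety reward model. The toy stand-ins and the trigger string `xyzzy` exist only so the example runs on its own.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# Hypothetical signatures (assumptions, not the competition's API):
#   generate(prompt) -> completion text
#   reward(prompt, completion) -> safety score, higher means safer
Generate = Callable[[str], str]
Reward = Callable[[str, str], float]


def mean_reward_with_suffix(
    prompts: Sequence[str], suffix: str, generate: Generate, reward: Reward
) -> float:
    """Average safety reward when `suffix` is appended to every prompt."""
    scores = []
    for prompt in prompts:
        poisoned_prompt = f"{prompt} {suffix}"
        completion = generate(poisoned_prompt)
        scores.append(reward(poisoned_prompt, completion))
    return mean(scores)


def rank_suffixes(
    prompts: Sequence[str],
    candidates: Sequence[str],
    generate: Generate,
    reward: Reward,
) -> List[Tuple[float, str]]:
    """Sort candidates from lowest mean reward (most suspicious) upward."""
    scored = [
        (mean_reward_with_suffix(prompts, suffix, generate, reward), suffix)
        for suffix in candidates
    ]
    return sorted(scored)


# Toy stand-ins so the sketch is self-contained: the "model" misbehaves
# only when the made-up trigger "xyzzy" ends the prompt.
def toy_generate(prompt: str) -> str:
    return "UNSAFE" if prompt.endswith("xyzzy") else "SAFE"


def toy_reward(prompt: str, completion: str) -> float:
    return -1.0 if completion == "UNSAFE" else 1.0


ranking = rank_suffixes(["a", "b"], ["hello", "xyzzy"], toy_generate, toy_reward)
print(ranking[0][1])
```

A suffix that lowers the mean reward across all prompts, not just one, is the kind of candidate that matches the description of a universal backdoor.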

ShieldLM
ShieldLM is a bilingual safety detector designed to detect safety issues in LLMs' generations. It aligns with human safety standards, supports customizable detection rules, and provides explanations for its decisions. ShieldLM outperforms strong baselines across four test sets.

FigStep
FigStep is a black-box jailbreaking algorithm against large vision-language models (VLMs). It feeds harmful instructions through the image channel and uses benign text prompts to induce VLMs to output contents that violate common AI safety policies. The tool highlights the vulnerability of VLMs to jailbreaking attacks, emphasizing the need for safety alignments between visual and textual modalities.

fast-llm-security-guardrails
ZenGuard AI enables AI developers to integrate production-level, low-code LLM (Large Language Model) guardrails into their generative AI applications effortlessly. With ZenGuard AI, ensure your application operates within trusted boundaries, is protected from prompt injections, and maintains user privacy without compromising on performance.

llm-misinformation-survey
The 'llm-misinformation-survey' repository is dedicated to the survey on combating misinformation in the age of Large Language Models (LLMs). It explores the opportunities and challenges of utilizing LLMs to combat misinformation, providing insights into the history of combating misinformation, current efforts, and future outlook. The repository serves as a resource hub for the initiative 'LLMs Meet Misinformation' and welcomes contributions of relevant research papers and resources. The goal is to facilitate interdisciplinary efforts in combating LLM-generated misinformation and promoting the responsible use of LLMs in fighting misinformation.

PurpleLlama
Purple Llama is an umbrella project that aims to provide tools and evaluations to support responsible development and usage of generative AI models. It encompasses components for cybersecurity and input/output safeguards, with plans to expand in the future. The project emphasizes a collaborative approach, borrowing the concept of purple teaming from cybersecurity, to address potential risks and challenges posed by generative AI. Components within Purple Llama are licensed permissively to foster community collaboration and standardize the development of trust and safety tools for generative AI.

AwesomeResponsibleAI
Awesome Responsible AI is a curated list of academic research, books, code of ethics, courses, data sets, frameworks, institutes, newsletters, principles, podcasts, reports, tools, regulations, and standards related to Responsible, Trustworthy, and Human-Centered AI. It covers various concepts such as Responsible AI, Trustworthy AI, Human-Centered AI, Responsible AI frameworks, AI Governance, and more. The repository provides a comprehensive collection of resources for individuals interested in ethical, transparent, and accountable AI development and deployment.

GPT4DFCI
GPT4DFCI is a private and secure generative AI tool based on GPT-4, deployed for non-clinical use at Dana-Farber Cancer Institute. The tool is overseen by the Dana-Farber AI Governance Committee and developed by the Dana-Farber Informatics & Analytics Department. The repository includes manuscript & policy details, training material, front-end and back-end code, infrastructure information, API client for programmatic use, licensing details, and contact information.

awesome-gpt-security
Awesome GPT + Security is a curated list of security tools, experimental cases, and other interesting projects involving LLMs or GPT. It includes tools for integrated security, auditing, reconnaissance, offensive security, detecting security issues, preventing security breaches, social engineering, reverse engineering, investigating security incidents, fixing security vulnerabilities, assessing security posture, and more. The list also includes experimental cases, academic research, blogs, and fun projects related to GPT security. Additionally, it provides resources on GPT security standards, bypassing security policies, bug bounty programs, cracking GPT APIs, and plugin security.

chatgpt-universe
ChatGPT is a large language model that can generate human-like text, translate languages, write different kinds of creative content, and answer your questions in a conversational way. It is trained on a massive amount of text data, and it is able to understand and respond to a wide range of natural language prompts.

awesome-generative-ai
A curated list of Generative AI projects, tools, artworks, and models
20 - OpenAI Gpts

FallacyGPT
Detect logical fallacies and lapses in critical thinking to help avoid misinformation in the style of Socrates

AI Detector
AI Detector GPT is powered by Winston AI and created to help identify AI generated content. It is designed to help you detect use of AI Writing Chatbots such as ChatGPT, Claude and Bard and maintain integrity in academia and publishing. Winston AI is the most trusted AI content detector.

Plagiarism Checker
Plagiarism Checker GPT is powered by Winston AI and created to help identify plagiarized content. It is designed to help you detect instances of plagiarism and maintain integrity in academia and publishing. Winston AI is the most trusted AI and Plagiarism Checker.

BS Meter Realtime
Detects and measures information credibility. Provides a "BS Score" (0-100) based on content analysis for misinformation signs, including factual inaccuracies and sensationalist language. Real-time feedback.

Wowza Bias Detective
I analyze cognitive biases in scenarios and thoughts, providing neutral, educational insights.

Defender for Endpoint Guardian
To assist individuals seeking to learn about or work with Microsoft's Defender for Endpoint. I provide detailed explanations, step-by-step guides, troubleshooting advice, cybersecurity best practices, and demonstrations, all specifically tailored to Microsoft Defender for Endpoint.

Prompt Injection Detector
A GPT used to classify prompts as valid inputs or injection attempts. Returns JSON output.
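As a rough illustration of how an application might consume a JSON verdict from a classifier like this one, here is a minimal Python sketch. The listing does not document the output schema, so the `label` field and its two values, `valid` and `injection`, are assumptions made for the example.

```python
import json

# Assumed schema (hypothetical): {"label": "valid"} or {"label": "injection"}
ALLOWED_LABELS = {"valid", "injection"}


def parse_verdict(raw: str) -> str:
    """Parse the classifier's JSON text and return its label."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")
    label = data.get("label")
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label


def is_safe_to_forward(raw: str) -> bool:
    """Fail closed: anything other than a well-formed 'valid' verdict is rejected."""
    try:
        return parse_verdict(raw) == "valid"
    except ValueError:
        return False


print(is_safe_to_forward('{"label": "valid"}'))
print(is_safe_to_forward("not json"))
```

Treating malformed or unexpected output as a rejection is a deliberate choice in this sketch, since the classifier's own reply could be influenced by the prompt it is judging.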

Blue Team Guide
A meticulously crafted arsenal of knowledge, insights, and guidelines that is shaped to empower organizations in crafting, enhancing, and refining their cybersecurity defenses.

PBN Detector
A tool to help you decide if a website is part of a PBN or link network, created solely for link building.

ethicallyHackingspace (eHs)® METEOR™ STORM™
Multiple Environment Threat Evaluation of Resources (METEOR)™ Space Threats and Operational Risks to Mission (STORM)™ non-profit product AI co-pilot

Mónica
A CSIRT that leads a specialized team in detecting and responding to security incidents, handles containment and recovery, organizes training and drills, produces reports to optimize security strategies, and coordinates with legal entities when necessary.