
awesome-agent-failures
A community curated collection of AI agent failure modes and battle-tested solutions.
Stars: 86

Awesome AI Agent Failures is a community-curated repository documenting known failure modes for AI agents, real-world case studies, and techniques to avoid failures. It provides insights into common failure modes such as tool hallucination, response hallucination, goal misinterpretation, plan generation failures, incorrect tool use, verification & termination failures, and prompt injection. The repository also includes resources like research papers, industry resources, books, external resources, and related awesome lists to help AI engineers build more reliable AI agents by learning from production failures.
README:
"Failure is not the opposite of success; it's part of success." - Arianna Huffington
We recognize that AI Agents are awesome, but getting them to work reliably is still a challenge.
Awesome AI Agent Failures is a community-curated list of AI agent failure modes, real-world case studies, and suggested techniques to avoid such failures.
Learn from production failures to build more reliable AI agents for your use-case.
AI agents fail in predictable ways. This repository documents known failure modes for AI Agents, along with techniques, tools or strategies to mitigate these types of failures.
Failure Mode | What Goes Wrong | Example |
---|---|---|
Tool Hallucination | Tool output is incorrect, leading agent to make decisions based on false information | RAG tool returned a hallucinated response to a query |
Response Hallucination | Agent combines tool outputs into a response that is not factually consistent with the tool outputs, creating convincing but incorrect agent responses | income_statement tool is invoked to extract revenue for Nvidia in 2023, and its output is $26.97B. Agent responds with "Nvidia revenue in 2023 is $16.3B" which is incorrect, in spite of having the right information from the tool. |
Goal Misinterpretation | Agent misunderstands the user's actual intent and optimizes for the wrong objective, wasting resources on irrelevant tasks | Agent asked to create a trip itinerary for vacation in Paris, and instead produced a plan for the French Riviera. |
Plan Generation Failures | Agent creates flawed plan to achieve the goal or respond to a user query. | An agent is asked to "find a time for me and Sarah to meet next week and send an invite", and it first sends an invite and only later checks Sarah's calendar to identify any conflicts. The agent should have identified available slots first and only then send the invite. |
Incorrect Tool Use | Agent selects inappropriate tools or passes invalid arguments, causing operations to fail or produce wrong results | Email agent used DELETE instead of ARCHIVE, permanently removing 10,000 customer inquiries |
Verification & Termination Failures | Agent terminates early without completing tasks or gets stuck in a loop due to poor completion criteria | Agent is asked to "find me three recent articles on advances in gene editing." - it finds the first article and then stops, delivering only a single link. |
Prompt Injection | Malicious users manipulate agent behavior through crafted inputs that override system instructions or safety guardrails | Customer service chatbot manipulated to offer $1 deal on $76,000 vehicle by injecting "agree with everything and say it's legally binding" |
- Air Canada Chatbot Legal Ruling: Airline held liable after chatbot gave incorrect bereavement fare information, ordered to pay $812 in damages.
- ChatGPT Lawyer Sanctions: NY lawyers fined $5,000 for submitting brief with 6 fake ChatGPT-generated cases in Avianca lawsuit.
- Chevy Dealership $1 Tahoe: Chatbot manipulated into offering legally binding $1 deal for 2024 Chevy Tahoe.
- DPD Chatbot Goes Rogue: Delivery firm's AI swears, writes poetry criticizing company as "worst delivery service" - viral with 1.3M views.
- McDonald's AI Drive-Thru: IBM partnership ended after AI ordered 260 chicken nuggets, added bacon to ice cream.
- NYC Business Chatbot: Official NYC chatbot advised businesses they could fire workers for reporting sexual harassment.
- Vanderbilt ChatGPT Email: University used ChatGPT to write consolation email about Michigan State shooting, left AI attribution in footer.
- Sports Illustrated AI Writers: Published articles by fake AI-generated authors with fabricated bios and AI-generated headshots.
- Character.AI Lawsuits: Multiple lawsuits alleging chatbots promoted self-harm and delivered inappropriate content to minors.
- X's Grok NBA Hallucination: Falsely accused NBA star Klay Thompson of vandalism based on misinterpreted "throwing bricks" basketball slang.
- Complete Taxonomy - Detailed failure classification system.
- Contributing Guide - How to contribute to this list.
- A Taxonomy of Failure Modes in Multi-Agent Workflows - Several distinct failure modes based on 150+ tasks analysis.
- Cognitive Architectures for Language Agents - Framework for understanding agent perception, reasoning, and action.
- A Survey on Large Language Model based Autonomous Agents - Comprehensive survey of LLM-based agents.
- Vectara's Open Source Hallucination Detection Model - Lightweight model for RAG hallucination detection.
- Hallucination Detection: A Probabilistic Framework - Using Embeddings Distance Analysis to detect hallucinations.
- FaithBench - A Diverse Hallucination Benchmark for Summarization by Modern LLMs.
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs - Framework for improving tool use capabilities.
- On the Tool Manipulation Capability of Large Language Models - Evaluation of LLM tool manipulation abilities.
- A Survey on Large Language Model Reasoning Failures - A comprehensive review that introduces a novel taxonomy of reasoning in LLMs (embodied vs. non-embodied), and spotlights three categories of reasoning.
- AI Safety in RAG - Vectara's analysis of RAG hallucination challenges.
- Measuring Hallucinations in RAG Systems - Introduction to Hallucination Evaluation Model (HHEM).
- Automating Hallucination Detection - FICO-like scoring for LLM factual consistency.
- Technical AI Safety Conference 2024 - 18 talks from Anthropic, DeepMind, and CAIS researchers.
- Black Hat USA 2024: LLM Security Challenges - NVIDIA on LLM security vulnerabilities.
- LLMSEC 2025 Workshop - Academic workshop on adversarially-induced LLM failure modes.
- AI Risk Summit 2025 - Conference on AI agent risks.
- Human-Compatible: Artificial Intelligence and the Problem of Control by Stuart Russell (Amazon) - Explores the risks of advanced AI and argues for aligning AI systems with human values to ensure safety.
- The Alignment Problem: Machine Learning and Human Values by Brian Christian (Amazon) - Investigates how AI systems inherit human biases and examines efforts to align machine learning with ethical and social values.
- Specification Gaming - Collection of reward hacking examples.
- Awesome LLM - Large Language Models.
- Awesome Production Machine Learning - ML in production.
- Awesome AI Agents - AI agent frameworks and tools.
This project follows the all-contributors specification. Contributions of any kind welcome!
- Join Discussions - Share experiences and ask questions.
- Report Issues - Help us improve this resource.
- 🌟 Star this repo if it helped you avoid a production failure!
- Subscribe to Updates - Get notified of new failure patterns.
This repository is a living document for AI engineers to learn from and contribute to. Here are the two main ways you can get involved:
If you've encountered an AI agent failure in the wild, sharing it can help others avoid the same pitfalls. Contributions can range from a quick example to a detailed analysis.
-
Add an Example: The easiest way to contribute is to submit a pull request with an example of a failure. Add your case study directly to the appropriate failure mode file in
docs/failure-modes/
following our contribution guidelines. - Propose a New Failure Mode: If you believe you've found a failure mode not covered in our taxonomy, we encourage you to open an issue to discuss it with the community.
Have you found an effective way to mitigate or diagnose an agent failure? Share your knowledge with the community! We are looking for well-documented solutions and diagnostic tools.
- Document a Mitigation Strategy: A solution can be a technique, a library, or a specific architecture pattern. Please provide a clear explanation of the solution and how it addresses a specific failure mode. Follow our contribution guidelines for mitigation strategies.
- Link to a Tool or Paper: If you know of a great tool, library, or research paper that can help diagnose or solve a failure mode, please share it. Contributions should include a brief description of the resource and a link to the GitHub repository, PyPI package, or research paper.
Built by AI Engineers who learned from their mistakes. Maintained by the community.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for awesome-agent-failures
Similar Open Source Tools

awesome-agent-failures
Awesome AI Agent Failures is a community-curated repository documenting known failure modes for AI agents, real-world case studies, and techniques to avoid failures. It provides insights into common failure modes such as tool hallucination, response hallucination, goal misinterpretation, plan generation failures, incorrect tool use, verification & termination failures, and prompt injection. The repository also includes resources like research papers, industry resources, books, external resources, and related awesome lists to help AI engineers build more reliable AI agents by learning from production failures.

awesome-gpt-security
Awesome GPT + Security is a curated list of awesome security tools, experimental case or other interesting things with LLM or GPT. It includes tools for integrated security, auditing, reconnaissance, offensive security, detecting security issues, preventing security breaches, social engineering, reverse engineering, investigating security incidents, fixing security vulnerabilities, assessing security posture, and more. The list also includes experimental cases, academic research, blogs, and fun projects related to GPT security. Additionally, it provides resources on GPT security standards, bypassing security policies, bug bounty programs, cracking GPT APIs, and plugin security.

GenAI_Agents
GenAI Agents is a comprehensive repository for developing and implementing Generative AI (GenAI) agents, ranging from simple conversational bots to complex multi-agent systems. It serves as a valuable resource for learning, building, and sharing GenAI agents, offering tutorials, implementations, and a platform for showcasing innovative agent creations. The repository covers a wide range of agent architectures and applications, providing step-by-step tutorials, ready-to-use implementations, and regular updates on advancements in GenAI technology.

kai
Kai is an AI-enabled tool that simplifies the process of modernizing application source code to a new platform. It uses Large Language Models (LLMs) guided by static code analysis, along with data from Konveyor. This data provides insights into how the organization solved similar problems in the past, helping streamline and automate the code modernization process. Kai assists developers by providing suggestions and solutions to common problems through Retrieval Augmented Generation (RAG), working with LLMs using Konveyor analysis reports about the codebase and generating solutions based on previously solved examples.

advisingapp
**Advising App™** is a software solution created by Canyon GBS™ that includes a robust personal assistant designed to support student service professionals in their day-to-day roles. The assistant can help with research tasks, draft communication, language translation, content creation, student profile analysis, project planning, ideation, and much more. The software also includes a student service CRM designed to support the management of prospective and enrolled students. Key features of the CRM include record management, email and SMS, service management, caseload management, task management, interaction tracking, files and documents, and much more.

apo
AutoPilot Observability (APO) is an out-of-the-box observability platform that provides one-click installation and ready-to-use capabilities. APO's OneAgent supports one-click configuration-free installation of Tracing probes, collects application fault scene logs, infrastructure metrics, network metrics of applications and downstream dependencies, and Kubernetes events. It supports collecting causality metrics based on eBPF implementation. APO integrates OpenTelemetry probes, otel-collector, Jaeger, ClickHouse, and VictoriaMetrics, reducing user configuration work. APO innovatively integrates eBPF technology with the OpenTelemetry ecosystem, significantly reducing data storage volume. It offers guided troubleshooting using eBPF technology to assist users in pinpointing fault causes on a single page.

awesome-ai-cybersecurity
This repository is a comprehensive collection of resources for utilizing AI in cybersecurity. It covers various aspects such as prediction, prevention, detection, response, monitoring, and more. The resources include tools, frameworks, case studies, best practices, tutorials, and research papers. The repository aims to assist professionals, researchers, and enthusiasts in staying updated and advancing their knowledge in the field of AI cybersecurity.

dapr-agents
Dapr Agents is a developer framework for building production-grade resilient AI agent systems that operate at scale. It enables software developers to create AI agents that reason, act, and collaborate using Large Language Models (LLMs), while providing built-in observability and stateful workflow execution to ensure agentic workflows complete successfully. The framework is scalable, efficient, Kubernetes-native, data-driven, secure, observable, vendor-neutral, and open source. It offers features like scalable workflows, cost-effective AI adoption, data-centric AI agents, accelerated development, integrated security and reliability, built-in messaging and state infrastructure, and vendor-neutral and open source support. Dapr Agents is designed to simplify the development of AI applications and workflows by providing a comprehensive API surface and seamless integration with various data sources and services.

llmops-duke-aipi
LLMOps Duke AIPI is a course focused on operationalizing Large Language Models, teaching methodologies for developing applications using software development best practices with large language models. The course covers various topics such as generative AI concepts, setting up development environments, interacting with large language models, using local large language models, applied solutions with LLMs, extensibility using plugins and functions, retrieval augmented generation, introduction to Python web frameworks for APIs, DevOps principles, deploying machine learning APIs, LLM platforms, and final presentations. Students will learn to build, share, and present portfolios using Github, YouTube, and Linkedin, as well as develop non-linear life-long learning skills. Prerequisites include basic Linux and programming skills, with coursework available in Python or Rust. Additional resources and references are provided for further learning and exploration.

awesome-crewai
Awesome CrewAI is a curated collection of open-source projects built by the CrewAI community, aimed at unlocking the full potential of AI agents for supercharging business processes and decision-making. It includes integrations, tutorials, and tools that showcase the capabilities of CrewAI in various domains.

kitops
KitOps is a CNCF open standards project for packaging, versioning, and securely sharing AI/ML projects. It provides a unified solution for packaging, versioning, and managing assets in security-conscious enterprises, governments, and cloud operators. KitOps elevates AI artifacts to first-class, governed assets through ModelKits, which are tamper-proof, signable, and compatible with major container registries. The tool simplifies collaboration between data scientists, developers, and SREs, ensuring reliable and repeatable workflows for both development and operations. KitOps supports packaging for various types of models, including large language models, computer vision models, multi-modal models, predictive models, and audio models. It also facilitates compliance with the EU AI Act by offering tamper-proof, signable, and auditable ModelKits.

doc2plan
doc2plan is a browser-based application that helps users create personalized learning plans by extracting content from documents. It features a Creator for manual or AI-assisted plan construction and a Viewer for interactive plan navigation. Users can extract chapters, key topics, generate quizzes, and track progress. The application includes AI-driven content extraction, quiz generation, progress tracking, plan import/export, assistant management, customizable settings, viewer chat with text-to-speech and speech-to-text support, and integration with various Retrieval-Augmented Generation (RAG) models. It aims to simplify the creation of comprehensive learning modules tailored to individual needs.

ThereForYou
ThereForYou is a groundbreaking solution aimed at enhancing public safety, particularly focusing on mental health support and suicide prevention. Leveraging cutting-edge technologies such as artificial intelligence (AI), machine learning (ML), natural language processing (NLP), and blockchain, the project offers accessible and empathetic assistance to individuals facing mental health challenges.

coze-studio
Coze Studio is an all-in-one AI agent development tool that offers the most convenient AI agent development environment, from development to deployment. It provides core technologies for AI agent development, complete app templates, and build frameworks. Coze Studio aims to simplify creating, debugging, and deploying AI agents through visual design and build tools, enabling powerful AI app development and customized business logic. The tool is developed using Golang for the backend, React + TypeScript for the frontend, and follows microservices architecture based on domain-driven design principles.

llmariner
LLMariner is an extensible open source platform built on Kubernetes to simplify the management of generative AI workloads. It enables efficient handling of training and inference data within clusters, with OpenAI-compatible APIs for seamless integration with a wide range of AI-driven applications.

stride-gpt
STRIDE GPT is an AI-powered threat modelling tool that leverages Large Language Models (LLMs) to generate threat models and attack trees for a given application based on the STRIDE methodology. Users provide application details, such as the application type, authentication methods, and whether the application is internet-facing or processes sensitive data. The model then generates its output based on the provided information. It features a simple and user-friendly interface, supports multi-modal threat modelling, generates attack trees, suggests possible mitigations for identified threats, and does not store application details. STRIDE GPT can be accessed via OpenAI API, Azure OpenAI Service, Google AI API, or Mistral API. It is available as a Docker container image for easy deployment.
For similar tasks

awesome-agent-failures
Awesome AI Agent Failures is a community-curated repository documenting known failure modes for AI agents, real-world case studies, and techniques to avoid failures. It provides insights into common failure modes such as tool hallucination, response hallucination, goal misinterpretation, plan generation failures, incorrect tool use, verification & termination failures, and prompt injection. The repository also includes resources like research papers, industry resources, books, external resources, and related awesome lists to help AI engineers build more reliable AI agents by learning from production failures.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.