awesome-agent-failures

awesome-agent-failures

A community curated collection of AI agent failure modes and battle-tested solutions.

Stars: 86

Visit
 screenshot

Awesome AI Agent Failures is a community-curated repository documenting known failure modes for AI agents, real-world case studies, and techniques to avoid failures. It provides insights into common failure modes such as tool hallucination, response hallucination, goal misinterpretation, plan generation failures, incorrect tool use, verification & termination failures, and prompt injection. The repository also includes resources like research papers, industry resources, books, external resources, and related awesome lists to help AI engineers build more reliable AI agents by learning from production failures.

README:

Awesome AI Agent Failures Awesome

GitHub stars GitHub forks License: Apache 2.0 Contributions

"Failure is not the opposite of success; it's part of success." - Arianna Huffington

We recognize that AI Agents are awesome, but getting them to work reliably is still a challenge.

Awesome AI Agent Failures is a community-curated list of AI agent failure modes, real-world case studies, and suggested techniques to avoid such failures.

Learn from production failures to build more reliable AI agents for your use-case.

Contents

🧠 Why This Matters

AI agents fail in predictable ways. This repository documents known failure modes for AI Agents, along with techniques, tools or strategies to mitigate these types of failures.

🎯 Common Failure Modes

Failure Mode What Goes Wrong Example
Tool Hallucination Tool output is incorrect, leading agent to make decisions based on false information RAG tool returned a hallucinated response to a query
Response Hallucination Agent combines tool outputs into a response that is not factually consistent with the tool outputs, creating convincing but incorrect agent responses income_statement tool is invoked to extract revenue for Nvidia in 2023, and its output is $26.97B. Agent responds with "Nvidia revenue in 2023 is $16.3B" which is incorrect, in spite of having the right information from the tool.
Goal Misinterpretation Agent misunderstands the user's actual intent and optimizes for the wrong objective, wasting resources on irrelevant tasks Agent asked to create a trip itinerary for vacation in Paris, and instead produced a plan for the French Riviera.
Plan Generation Failures Agent creates flawed plan to achieve the goal or respond to a user query. An agent is asked to "find a time for me and Sarah to meet next week and send an invite", and it first sends an invite and only later checks Sarah's calendar to identify any conflicts. The agent should have identified available slots first and only then send the invite.
Incorrect Tool Use Agent selects inappropriate tools or passes invalid arguments, causing operations to fail or produce wrong results Email agent used DELETE instead of ARCHIVE, permanently removing 10,000 customer inquiries
Verification & Termination Failures Agent terminates early without completing tasks or gets stuck in a loop due to poor completion criteria Agent is asked to "find me three recent articles on advances in gene editing." - it finds the first article and then stops, delivering only a single link.
Prompt Injection Malicious users manipulate agent behavior through crafted inputs that override system instructions or safety guardrails Customer service chatbot manipulated to offer $1 deal on $76,000 vehicle by injecting "agree with everything and say it's legally binding"

💸 Real-World AI Agent Failures

Legal & Financial Incidents

Customer Service Disasters

  • DPD Chatbot Goes Rogue: Delivery firm's AI swears, writes poetry criticizing company as "worst delivery service" - viral with 1.3M views.
  • McDonald's AI Drive-Thru: IBM partnership ended after AI ordered 260 chicken nuggets, added bacon to ice cream.
  • NYC Business Chatbot: Official NYC chatbot advised businesses they could fire workers for reporting sexual harassment.

Institutional Failures

  • Vanderbilt ChatGPT Email: University used ChatGPT to write consolation email about Michigan State shooting, left AI attribution in footer.
  • Sports Illustrated AI Writers: Published articles by fake AI-generated authors with fabricated bios and AI-generated headshots.

Safety & Misinformation

  • Character.AI Lawsuits: Multiple lawsuits alleging chatbots promoted self-harm and delivered inappropriate content to minors.
  • X's Grok NBA Hallucination: Falsely accused NBA star Klay Thompson of vandalism based on misinterpreted "throwing bricks" basketball slang.

📚 Resources

Core Documentation

Research Papers

Taxonomies and Surveys

Hallucination Detection

Tool Use & Reliability

Planning & Reasoning

Industry Resources

Articles & Analysis

Conferences & Workshops

Books

  • Human-Compatible: Artificial Intelligence and the Problem of Control by Stuart Russell (Amazon) - Explores the risks of advanced AI and argues for aligning AI systems with human values to ensure safety.
  • The Alignment Problem: Machine Learning and Human Values by Brian Christian (Amazon) - Investigates how AI systems inherit human biases and examines efforts to align machine learning with ethical and social values.

External Resources

Related Awesome Lists

👥 Community

Contributors

This project follows the all-contributors specification. Contributions of any kind welcome!

Get Involved

🤝 Contributing

This repository is a living document for AI engineers to learn from and contribute to. Here are the two main ways you can get involved:

1. Report a Failure Mode

If you've encountered an AI agent failure in the wild, sharing it can help others avoid the same pitfalls. Contributions can range from a quick example to a detailed analysis.

  • Add an Example: The easiest way to contribute is to submit a pull request with an example of a failure. Add your case study directly to the appropriate failure mode file in docs/failure-modes/ following our contribution guidelines.
  • Propose a New Failure Mode: If you believe you've found a failure mode not covered in our taxonomy, we encourage you to open an issue to discuss it with the community.

2. Contribute a Technique to address Failure

Have you found an effective way to mitigate or diagnose an agent failure? Share your knowledge with the community! We are looking for well-documented solutions and diagnostic tools.

  • Document a Mitigation Strategy: A solution can be a technique, a library, or a specific architecture pattern. Please provide a clear explanation of the solution and how it addresses a specific failure mode. Follow our contribution guidelines for mitigation strategies.
  • Link to a Tool or Paper: If you know of a great tool, library, or research paper that can help diagnose or solve a failure mode, please share it. Contributions should include a brief description of the resource and a link to the GitHub repository, PyPI package, or research paper.

Built by AI Engineers who learned from their mistakes. Maintained by the community.

For Tasks:

Click tags to check more tools for each tasks

For Jobs:

Alternative AI tools for awesome-agent-failures

Similar Open Source Tools

For similar tasks

For similar jobs