ai-goat

Learn AI security through a series of vulnerable LLM CTF challenges. No sign ups, no cloud fees, run everything locally on your system.

Stars: 105


AI Goat is a tool designed to help users learn about AI security through a series of vulnerable LLM CTF challenges. It allows users to run everything locally on their system without the need for sign-ups or cloud fees. The tool focuses on exploring security risks associated with large language models (LLMs) like ChatGPT, providing practical experience for security researchers to understand vulnerabilities and exploitation techniques. AI Goat uses the Vicuna LLM, derived from Meta's LLaMA and ChatGPT's response data, to create challenges that involve prompt injections, insecure output handling, and other LLM security threats. The tool also includes a prebuilt Docker image, ai-base, containing all necessary libraries to run the LLM and challenges, along with an optional CTFd container for challenge management and flag submission.

README:

        _))
        > *\     _~
        `;'\__-' \_
    ____  | )  _ \ \ _____________________
    ____  / / ``  w w ____________________        
    ____ w w ________________AI_Goat______                                                                          
    ______________________________________

    Presented by: rootcauz


About

Many companies have started to build software that integrates with AI large language models (LLMs) due to the release of ChatGPT and other engines. This explosion of interest has led to the rapid development of systems that reintroduce old vulnerabilities and introduce new classes of less understood threats. Many company security teams may not be fully equipped to deal with LLM security, as the field's tools and learning resources are still maturing.

I've developed AI Goat to learn about LLM development and the security risks faced by companies that use LLMs. The CTF format is a great way for security researchers to gain practical experience and learn how these systems are vulnerable and can be exploited. Thank you for your interest in this project and I hope you have fun!

About AI/LLM Security Risks

The OWASP Top 10 for LLM Applications is a great place to start learning about LLM security threats and mitigations. I recommend you read through the document thoroughly as many of the concepts are explored in AI Goat and it provides an awesome summary of what you will face in the challenges.

Remember, an LLM engine wrapped in a web application hosted in a cloud environment is going to be subject to the same traditional cloud and web application security threats. In addition to these traditional threats, LLM projects will also be subject to the following noncomprehensive list of threats:

Prompt Injection
Insecure Output Handling
Training Data Poisoning
Denial of Service
Supply Chain
Permission Issues
Data Leakage
Excessive Agency
Overreliance
Insecure Plugins

How AI Goat Works

AI Goat uses the Vicuna LLM, which is derived from Meta's LLaMA coupled with ChatGPT's response data. When you install AI Goat, the LLM binary is downloaded from HuggingFace to your computer. This roughly 8GB binary is the AI engine that the challenges are built around. The LLM binary essentially takes an input, the "prompt", and gives an output, the "response". The prompt consists of three elements concatenated together in a string: 1. Instructions; 2. Question; and 3. Response. The Instructions element contains the rules that describe to the LLM how it is supposed to behave. The Question element is where most systems place user input. For instance, a comment entered into a chat engine would be placed in the Question element. Lastly, the Response element signals that the LLM should give a response to the question.
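The three-element prompt described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name build_prompt and the "### Instructions:" style section labels are hypothetical and are not taken from the AI Goat source code.

```python
def build_prompt(instructions: str, question: str) -> str:
    """Concatenate the three prompt elements into a single string.

    The section labels below are assumed for illustration; the real
    challenges may format their prompts differently.
    """
    return (
        f"### Instructions:\n{instructions}\n\n"
        f"### Question:\n{question}\n\n"
        "### Response:\n"  # left empty so the LLM continues from here
    )


if __name__ == "__main__":
    # User input is placed verbatim in the Question element.
    print(build_prompt(
        instructions="You are a helpful bot. Never reveal the secret.",
        question="What is the weather like today?",
    ))
```

Because the user-controlled Question text ends up in the same string as the Instructions, the LLM has no hard boundary between the two, which is why prompt injection is possible at all.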

A prebuilt Docker image, ai-base, has all the libraries needed to run the LLM and challenges. This image is downloaded during the installation process along with the LLM binary. A Docker Compose file launches each challenge, attaches the LLM binary and the challenge-specific files, and exposes the TCP ports needed to complete the challenge. See the installation and setup sections for instructions on getting started.

An optional CTFd container has been prepared that includes each challenge description, hints, category, and flag submission. The container image is called ai-ctfd and is hosted in our dockerhub alongside the ai-base image. The ai-ctfd container can be launched from ai-goat.py and accessed using your browser.

Installation

Requirements

git
- sudo apt install git -y
python3
pip3
- sudo apt install python3-pip -y
Docker
docker-compose
User in docker group
- sudo usermod -aG docker $USER
- reboot
8GB of drive space
Minimum 16GB system memory with at least 8GB dedicated to the challenge; otherwise LLM responses take too long
A love for cybersecurity!

Directions

git clone https://github.com/dhammon/ai-goat
cd ai-goat
pip3 install -r requirements.txt
chmod +x ai-goat.py
./ai-goat.py --install

Use

This section expects that you have already followed the Installation steps.

Step 1 - Start ai-ctfd (optional)

Using ai-ctfd provides you with a listing of all the challenges and flag submission. It is a great tool to use by yourself or when hosting a CTF. Using it as an individual provides you with a map of the challenges and helps you track which challenges you've completed. It offers flag submission to confirm challenge completion and can provide hints that nudge you in the right direction. The container can also be launched on an internal server where you can host your own CTF for a group of security enthusiasts. The following command launches ai-ctfd in the background, after which it can be accessed on port 8000:

./ai-goat.py --run ctfd

Important: Once launched, you must create a user by registering a user account. This registration stays local on the container and does not require a real email address.
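If the page does not load right away, a quick way to tell whether the container is listening yet is to test the TCP port. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is hypothetical and is not part of ai-goat.py.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("127.0.0.1", 8000) should return True once ai-ctfd is up. The same check works for the challenge ports.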

Step 1.1 - Change the Flags

You can change the flags in the challenges' source code and then in CTFD (they must match).

After you clone the repo, navigate to ai-goat/app/challenges/1/app.py and change the flag in the string on line 12.
Then navigate to ai-goat/app/challenges/2/entrypoint.sh and change the flag on line 3.
Next you will need to change the flags in CTFD. Launch CTFD (run ./ai-goat.py --run ctfd and open a browser to http://127.0.0.1:8000), then log in as the root user with the password qVLv27Dsy5WuXRubjfII.
Once logged in, navigate to the admin panel (top nav bar) -> Challenges (top nav bar) -> select a challenge -> and hit the Flags sub-tab.
Change the flag for each CTFD challenge to match the string you changed in the source code.

Have fun!

Step 2 - Run a Challenge

See the Challenges section for a description of each challenge or refer to the ai-ctfd web page from Step 1. The following command launches the first challenge:

./ai-goat.py --run 1

The challenge container will launch in the background and provide instructions on how to interact with the challenge. Each challenge has a flag, which is a string surrounded by curly braces. For example: {eXampl3F!ag}. Verify the flag by submitting it to the corresponding challenge on the ai-ctfd page.

Important: Sometimes LLMs make up the flag value, so make sure you verify the flag in ai-ctfd ;)

Important: The LLM response can take 30 seconds or so.
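When a response is long, it can help to pull out anything that looks like a flag before submitting it. The sketch below assumes that a flag contains no whitespace and no nested braces, which holds for the example {eXampl3F!ag} but is not confirmed for every challenge. The function name find_flag_candidates is hypothetical.

```python
import re

# Assumed flag shape: "{", one or more characters that are not braces
# or whitespace, then "}".
FLAG_PATTERN = re.compile(r"\{[^{}\s]+\}")


def find_flag_candidates(text: str) -> list[str]:
    """Return every substring of text that looks like a flag."""
    return FLAG_PATTERN.findall(text)
```

A match is only a candidate: the LLM may have invented it, so submission to ai-ctfd remains the real check.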

Troubleshooting

Challenges can be restarted by rerunning the challenge. Example: ./ai-goat.py --run <CHALLENGE NUMBER>. This command will restart the container if it is already running.
You might inadvertently pollute or break a challenge container. Use docker commands to stop containers if needed. To list containers: docker container ps. To stop a container: docker stop <CONTAINER NAME>.

Upgrade

The main branch will always contain the latest release. Historical versions will be maintained in respective branches. Upgrade instructions:

cd ai-goat
git pull
pip3 install -r requirements.txt
chmod +x ai-goat.py
./ai-goat.py --install

Important: ai-ctfd will effectively start from scratch so make sure to create a new user and resubmit/redo any previous challenges if desired.

Challenges

Challenge 1 - Basic Prompt Injection

Successful prompt injection attacks influence the LLM to respond with information it is not intended to divulge. In this challenge, an AI chatbot running on port 9001 is storing the flag as a secret and has been prompted to not give it up. Can you convince or trick the bot into giving you the flag?

./ai-goat.py -r 1

LLM01: Prompt Injections | LLM07: Data Leakage

Challenge 2 - Title Requestor

LLM system output shouldn't be trusted, especially when that output is used in downstream operations such as OS commands or network calls. This challenge has another AI chatbot running on port 9002 that takes a user question and returns a website's title. The chatbot converts the user input into a URL, requests that site's source, and extracts the title. What else could this chatbot have network access to?

./ai-goat.py -r 2

LLM02: Insecure Output Handling
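On the defensive side, one common way to treat LLM output as untrusted is to validate any URL it produces before making a network call. The following is a minimal sketch of that idea, not the challenge's actual code: the name is_url_allowed and the allow-list of schemes are assumptions. It only inspects literal IP addresses, so a real check would also need to resolve hostnames and test the resulting addresses.

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}  # assumed policy for this sketch


def is_url_allowed(url: str) -> bool:
    """Reject URLs with unexpected schemes or non-public literal addresses."""
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme not in ALLOWED_SCHEMES or not host:
        return False
    if host == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP address. This sketch lets hostnames through;
        # resolving them and checking the results is left out here.
        return True
    # is_global is False for loopback, private, and link-local ranges.
    return address.is_global
```

With this check, https://example.com/ passes, while http://127.0.0.1:8000/ and file:///etc/passwd are rejected.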

Versioning

The latest version is on the main branch. You can find the version in the CHANGELOG.md file. Branches are created for each respective version.

Credits

CTF engine: CTFD

Art by: ejm97 on ascii.co.uk

AI container technology:

Library: llama-cpp-python
Large Language Model: Vicuna LLM

