shannon

Fully autonomous AI hacker to find actual exploits in your web apps. Shannon has achieved a 96.15% success rate on the hint-free, source-aware XBOW Benchmark.

Stars: 20466

Shannon is an AI pentester that delivers actual exploits, not just alerts. It autonomously hunts for attack vectors in your code, then uses its built-in browser to execute real exploits, such as injection attacks and auth bypass, to prove that a vulnerability is actually exploitable. Shannon closes the security gap by acting as your on-demand whitebox pentester, providing concrete proof of vulnerabilities so you can ship with confidence. It is a core component of the Keygraph Security and Compliance Platform, which automates penetration testing and the compliance journey. Shannon Lite achieves a 96.15% success rate on a hint-free, source-aware XBOW benchmark.

README:

[!NOTE] Shannon Lite achieves a 96.15% success rate on a hint-free, source-aware XBOW benchmark.

Shannon is your fully autonomous AI pentester.

Shannon’s job is simple: break your web app before anyone else does.
The Red Team to your vibe-coding Blue team.
Every Claude (coder) deserves their Shannon.

Website • Discord

🎯 What is Shannon?

Shannon is an AI pentester that delivers actual exploits, not just alerts.

Shannon's goal is to break your web app before someone else does. It autonomously hunts for attack vectors in your code, then uses its built-in browser to execute real exploits, such as injection attacks and auth bypass, to prove that a vulnerability is actually exploitable.

What Problem Does Shannon Solve?

Thanks to tools like Claude Code and Cursor, your team ships code non-stop. But your penetration test? That happens once a year. This creates a massive security gap. For the other 364 days, you could be unknowingly shipping vulnerabilities to production.

Shannon closes this gap by acting as your on-demand whitebox pentester. It doesn't just find potential issues. It executes real exploits, providing concrete proof of vulnerabilities. This lets you ship with confidence, knowing every build can be secured.

[!NOTE] From Autonomous Pentesting to Automated Compliance

Shannon is a core component of the Keygraph Security and Compliance Platform.

While Shannon automates the critical task of penetration testing for your application, our broader platform automates your entire compliance journey—from evidence collection to audit readiness. We're building the "Rippling for Cybersecurity," a single platform to manage your security posture and streamline compliance frameworks like SOC 2 and HIPAA.

➡️ Learn more about the Keygraph Platform

🎬 See Shannon in Action

Real Results: Shannon discovered 20+ critical vulnerabilities in OWASP Juice Shop, including complete auth bypass and database exfiltration. See full report →

✨ Features

Fully Autonomous Operation: Launch the pentest with a single command. The AI handles everything from advanced 2FA/TOTP logins (including sign in with Google) and browser navigation to the final report with zero intervention.
Pentester-Grade Reports with Reproducible Exploits: Delivers a final report focused on proven, exploitable findings, complete with copy-and-paste Proof-of-Concepts to eliminate false positives and provide actionable results.
Critical OWASP Vulnerability Coverage: Currently identifies and validates the following critical vulnerabilities: Injection, XSS, SSRF, and Broken Authentication/Authorization, with more types in development.
Code-Aware Dynamic Testing: Analyzes your source code to intelligently guide its attack strategy, then performs live browser-based and command-line exploits against the running application to confirm real-world risk.
Powered by Integrated Security Tools: Enhances its discovery phase by leveraging leading reconnaissance and testing tools—including Nmap, Subfinder, WhatWeb, and Schemathesis—for deep analysis of the target environment.
Parallel Processing for Faster Results: Get your report faster. The system parallelizes the most time-intensive phases, running analysis and exploitation for all vulnerability types concurrently.

📦 Product Line

Shannon is available in two editions:

| Edition | License | Best For |
| --- | --- | --- |
| Shannon Lite | AGPL-3.0 | Security teams, independent researchers, testing your own applications |
| Shannon Pro | Commercial | Enterprises requiring advanced features, CI/CD integration, and dedicated support |

This repository contains Shannon Lite, which utilizes our core autonomous AI pentesting framework. Shannon Pro enhances this foundation with an advanced, LLM-powered data flow analysis engine (inspired by the LLMDFA paper) for enterprise-grade code analysis and deeper vulnerability detection.

[!IMPORTANT] White-box only. Shannon Lite is designed for white-box (source-available) application security testing.
It expects access to your application's source code and repository layout.

See feature comparison

🚀 Setup & Usage Instructions

Prerequisites

Docker - Container runtime (Install Docker)
AI Provider Credentials (choose one):
- Anthropic API key (recommended) - Get from Anthropic Console
- Claude Code OAuth token
- [EXPERIMENTAL - UNSUPPORTED] Alternative providers via Router Mode - OpenAI or Google Gemini via OpenRouter (see Router Mode)

Quick Start

# 1. Clone Shannon
git clone https://github.com/KeygraphHQ/shannon.git
cd shannon

# 2. Configure credentials (choose one method)

# Option A: Export environment variables
export ANTHROPIC_API_KEY="your-api-key"              # or CLAUDE_CODE_OAUTH_TOKEN

# Option B: Create a .env file
cat > .env << 'EOF'
ANTHROPIC_API_KEY=your-api-key
EOF

# 3. Run a pentest
./shannon start URL=https://your-app.com REPO=your-repo

Shannon will build the containers, start the workflow, and return a workflow ID. The pentest runs in the background.

Monitoring Progress

# View real-time worker logs
./shannon logs

# Query a specific workflow's progress
./shannon query ID=shannon-1234567890

# Open the Temporal Web UI for detailed monitoring (macOS; on Linux, use xdg-open)
open http://localhost:8233

Stopping Shannon

# Stop all containers (preserves workflow data)
./shannon stop

# Full cleanup (removes all data)
./shannon stop CLEAN=true

Usage Examples

# Basic pentest
./shannon start URL=https://example.com REPO=repo-name

# With a configuration file
./shannon start URL=https://example.com REPO=repo-name CONFIG=./configs/my-config.yaml

# Custom output directory
./shannon start URL=https://example.com REPO=repo-name OUTPUT=./my-reports

Prepare Your Repository

Shannon expects target repositories to be placed under the ./repos/ directory at the project root. The REPO flag refers to a folder name inside ./repos/. Copy the repository you want to scan into ./repos/, or clone it directly there:

git clone https://github.com/your-org/your-repo.git ./repos/your-repo

For monorepos:

git clone https://github.com/your-org/your-monorepo.git ./repos/your-monorepo

For multi-repository applications (e.g., separate frontend/backend):

mkdir ./repos/your-app
cd ./repos/your-app
git clone https://github.com/your-org/frontend.git
git clone https://github.com/your-org/backend.git
git clone https://github.com/your-org/api.git
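
Because REPO is resolved as a folder name under ./repos/, a quick pre-flight check can catch a misplaced or misspelled repository before a run starts. The sketch below is only an illustration: `check_repo` is a hypothetical helper, not part of the Shannon CLI, and its optional second argument (the repos root) is an assumption added so the function can be tried outside a Shannon checkout.

```shell
# Hypothetical pre-flight helper (not part of Shannon).
# Usage: check_repo <repo-folder-name> [repos-root]
check_repo() {
  if [ -z "${1:-}" ]; then
    echo "usage: check_repo <repo-folder-name> [repos-root]" >&2
    return 2
  fi
  repo_dir="${2:-./repos}/$1"
  if [ -d "$repo_dir" ]; then
    echo "ok: $repo_dir"
  else
    echo "missing: $repo_dir (clone or copy the repository there first)" >&2
    return 1
  fi
}

# Example (left as a comment so the snippet has no side effects):
# check_repo your-repo && ./shannon start URL=https://your-app.com REPO=your-repo
```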

Platform-Specific Instructions

For Linux (Native Docker):

You may need to run commands with sudo depending on your Docker setup. If you encounter permission issues with output files, ensure your user has access to the Docker socket.

For macOS:

Works out of the box with Docker Desktop installed.

Testing Local Applications:

Inside a Docker container, localhost refers to the container itself, so Shannon cannot reach an application listening on your host machine's localhost. Use host.docker.internal in place of localhost:

./shannon start URL=http://host.docker.internal:3000 REPO=repo-name
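
If you often switch between local and remote targets, the substitution can be scripted. The following is a minimal sketch, assuming the target is given as an http or https URL; `to_container_url` is a hypothetical helper name, not something Shannon provides. It rewrites only a leading localhost or 127.0.0.1 host and leaves every other URL untouched.

```shell
# Hypothetical helper (not part of Shannon): rewrite a localhost URL so that
# a process inside a container resolves it to the host machine.
to_container_url() {
  printf '%s\n' "$1" |
    sed -E 's#^(https?://)(localhost|127\.0\.0\.1)([:/]|$)#\1host.docker.internal\3#'
}

# Example (left as a comment so the snippet has no side effects):
# ./shannon start URL="$(to_container_url http://localhost:3000)" REPO=repo-name
```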

Configuration (Optional)

While you can run without a config file, creating one enables authenticated testing and customized analysis. Place your configuration files inside the ./configs/ directory — this folder is mounted into the Docker container automatically.

Create Configuration File

Copy and modify the example configuration:

cp configs/example-config.yaml configs/my-app-config.yaml

Basic Configuration Structure

authentication:
  login_type: form
  login_url: "https://your-app.com/login"
  credentials:
    username: "[email protected]"
    password: "yourpassword"
    totp_secret: "LB2E2RX7XFHSTGCK"  # Optional for 2FA

  login_flow:
    - "Type $username into the email field"
    - "Type $password into the password field"
    - "Click the 'Sign In' button"

  success_condition:
    type: url_contains
    value: "/dashboard"

rules:
  avoid:
    - description: "AI should avoid testing logout functionality"
      type: path
      url_path: "/logout"

  focus:
    - description: "AI should emphasize testing API endpoints"
      type: path
      url_path: "/api"

TOTP Setup for 2FA

If your application uses two-factor authentication, simply add the TOTP secret to your config file. The AI will automatically generate the required codes during testing.

[EXPERIMENTAL - UNSUPPORTED] Router Mode (Alternative Providers)

Shannon can experimentally route requests through alternative AI providers using claude-code-router. This mode is not officially supported and is intended primarily for:

Model experimentation — try Shannon with GPT-5.2 or Gemini 3–family models

Quick Setup

Add your provider API key to .env:

# Choose one provider:
OPENAI_API_KEY=sk-...
# OR
OPENROUTER_API_KEY=sk-or-...

# Set default model:
ROUTER_DEFAULT=openai,gpt-5.2  # provider,model format

Run with ROUTER=true:

./shannon start URL=https://example.com REPO=repo-name ROUTER=true
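
ROUTER_DEFAULT uses a provider,model format, so a missing comma is an easy mistake to make. The sketch below shows one way to sanity-check the value before starting a run. `split_router_default` is a hypothetical helper, and how Shannon itself parses the variable is not documented here, so treat this as an illustration of the format only.

```shell
# Hypothetical check (not part of Shannon): split "provider,model" on the
# first comma and reject values where either half is missing.
split_router_default() {
  case "${1:-}" in
    *,*) ;;
    *) echo "expected provider,model" >&2; return 1 ;;
  esac
  provider=${1%%,*}   # text before the first comma
  model=${1#*,}       # text after the first comma
  if [ -z "$provider" ] || [ -z "$model" ]; then
    echo "expected provider,model" >&2
    return 1
  fi
  printf 'provider=%s model=%s\n' "$provider" "$model"
}

split_router_default "openai,gpt-5.2"
```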

Experimental Models

| Provider | Models |
| --- | --- |
| OpenAI | gpt-5.2, gpt-5-mini |
| OpenRouter | google/gemini-3-flash-preview |

Disclaimer

This feature is experimental and unsupported. Output quality depends heavily on the model. Shannon is built on Anthropic's Claude Agent SDK and is optimized for, and primarily tested with, Anthropic Claude models. Alternative providers may produce inconsistent results (including failures in early phases such as Recon), depending on the model and routing setup.

Output and Results

All results are saved to ./audit-logs/{hostname}_{sessionId}/ by default. To write to a custom directory, pass OUTPUT=<path> to ./shannon start, as in the Usage Examples section.

Output structure:

audit-logs/{hostname}_{sessionId}/
├── session.json          # Metrics and session data
├── agents/               # Per-agent execution logs
├── prompts/              # Prompt snapshots for reproducibility
└── deliverables/
    └── comprehensive_security_assessment_report.md   # Final comprehensive security report
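
Given that layout, the final reports from all sessions can be collected with a simple glob. This is a sketch under the assumption that the directory structure is exactly as shown above; `find_reports` is a hypothetical helper name, and its argument defaults to ./audit-logs.

```shell
# Hypothetical helper (not part of Shannon): print the path of every final
# report found under the audit-logs directory, one per line.
find_reports() {
  base=${1:-./audit-logs}
  for report in "$base"/*/deliverables/comprehensive_security_assessment_report.md; do
    # When nothing matches, the glob stays literal and this test is false.
    if [ -f "$report" ]; then
      printf '%s\n' "$report"
    fi
  done
  return 0
}

find_reports ./audit-logs
```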

📊 Sample Reports

Looking for quantitative benchmarks? See full benchmark methodology and results →

See Shannon's capabilities in action with penetration test results from industry-standard vulnerable applications:

🧃 OWASP Juice Shop • GitHub

A notoriously insecure web application maintained by OWASP, designed to test a tool's ability to uncover a wide range of modern vulnerabilities.

Performance: Identified over 20 high-impact vulnerabilities across targeted OWASP categories in a single automated run.

Key Accomplishments:

Achieved complete authentication bypass and exfiltrated the entire user database via an Injection attack
Executed a full privilege escalation by creating a new administrator account through a registration workflow bypass
Identified and exploited systemic authorization flaws (IDOR) to access and modify any user's private data and shopping cart
Discovered a Server-Side Request Forgery (SSRF) vulnerability, enabling internal network reconnaissance

📄 View Complete Report →

🔗 c{api}tal API • GitHub

An intentionally vulnerable API from Checkmarx, designed to test a tool's ability to uncover the OWASP API Security Top 10.

Performance: Identified nearly 15 critical and high-severity vulnerabilities, leading to full application compromise.

Key Accomplishments:

Executed a root-level Injection attack by bypassing a denylist via command chaining in a hidden debug endpoint
Achieved complete authentication bypass by discovering and targeting a legacy, unpatched v1 API endpoint
Escalated a regular user to full administrator privileges by exploiting a Mass Assignment vulnerability in the user profile update function
Demonstrated high accuracy by correctly confirming the application's robust XSS defenses, reporting zero false positives

📄 View Complete Report →

🚗 OWASP crAPI • GitHub

A modern, intentionally vulnerable API from OWASP, designed to benchmark a tool's effectiveness against the OWASP API Security Top 10.

Performance: Identified over 15 critical and high-severity vulnerabilities, achieving full application compromise.

Key Accomplishments:

Bypassed authentication using multiple advanced JWT attacks, including Algorithm Confusion, alg:none, and weak key (kid) injection
Achieved full database compromise via Injection attacks, exfiltrating user credentials from the PostgreSQL database
Executed a critical Server-Side Request Forgery (SSRF) attack that successfully forwarded internal authentication tokens to an external service
Demonstrated high accuracy by correctly identifying the application's robust XSS defenses, reporting zero false positives

📄 View Complete Report →

These results demonstrate Shannon's ability to move beyond simple scanning, performing deep contextual exploitation with minimal false positives and actionable proof-of-concepts.

🏗️ Architecture

Shannon emulates a human penetration tester's methodology using a sophisticated multi-agent architecture. It combines white-box source code analysis with black-box dynamic exploitation across four distinct phases:

                    ┌──────────────────────┐
                    │    Reconnaissance    │
                    └──────────┬───────────┘
                               │
                               ▼
                    ┌──────────┴───────────┐
                    │          │           │
                    ▼          ▼           ▼
        ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
        │ Vuln Analysis   │ │ Vuln Analysis   │ │      ...        │
        │  (Injection)    │ │     (XSS)       │ │                 │
        └─────────┬───────┘ └─────────┬───────┘ └─────────┬───────┘
                  │                   │                   │
                  ▼                   ▼                   ▼
        ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
        │  Exploitation   │ │  Exploitation   │ │      ...        │
        │  (Injection)    │ │     (XSS)       │ │                 │
        └─────────┬───────┘ └─────────┬───────┘ └─────────┬───────┘
                  │                   │                   │
                  └─────────┬─────────┴───────────────────┘
                            │
                            ▼
                    ┌──────────────────────┐
                    │      Reporting       │
                    └──────────────────────┘

Architectural Overview

Shannon is engineered to emulate the methodology of a human penetration tester. It leverages Anthropic's Claude Agent SDK as its core reasoning engine, but its true strength lies in the sophisticated multi-agent architecture built around it. This architecture combines the deep context of white-box source code analysis with the real-world validation of black-box dynamic exploitation. An orchestrator manages the work across four distinct phases, with a focus on minimizing false positives and managing context intelligently.

Phase 1: Reconnaissance

The first phase builds a comprehensive map of the application's attack surface. Shannon analyzes the source code and integrates with tools like Nmap and Subfinder to understand the tech stack and infrastructure. Simultaneously, it performs live application exploration via browser automation to correlate code-level insights with real-world behavior, producing a detailed map of all entry points, API endpoints, and authentication mechanisms for the next phase.

Phase 2: Vulnerability Analysis

To maximize efficiency, this phase runs in parallel: using the reconnaissance data, a specialized agent for each OWASP category hunts for potential flaws. For vulnerabilities like Injection and SSRF, agents perform a structured data flow analysis, tracing user input to dangerous sinks. This phase produces a key deliverable: a list of hypothesized exploitable paths that are passed on for validation.
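
To make "tracing user input to dangerous sinks" concrete, here is a deliberately tiny taint-tracking sketch. It is not Shannon's analysis, which is LLM-driven and works on real source code. The three-token input format, the `request_param` source, and the `sql_query` sink are all assumptions invented for this illustration.

```shell
# Toy taint tracker (illustration only, not Shannon's implementation).
# Input lines are either "dst = src" assignments or "sql_query var" sink calls.
trace_taint() {
  awk '
    # Source: "var = request_param" marks var as attacker-controlled.
    $2 == "=" && $3 == "request_param" { tainted[$1] = 1; next }
    # Propagation: "dst = src" copies the taint status of src to dst.
    $2 == "=" {
      if ($3 in tainted) { tainted[$1] = 1 } else { delete tainted[$1] }
      next
    }
    # Sink: "sql_query var" is reported only when var is tainted.
    $1 == "sql_query" && ($2 in tainted) {
      print "tainted flow into sql_query via " $2
    }
  '
}

trace_taint <<'EOF'
user_id = request_param
query = user_id
limit = constant
sql_query query
sql_query limit
EOF
# prints: tainted flow into sql_query via query
```

Only the first sink call is reported, because `limit` never receives user input. Each reported flow corresponds to one hypothesized path of the kind this phase hands on for validation.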

Phase 3: Exploitation

Continuing the parallel workflow to maintain speed, this phase is dedicated entirely to turning hypotheses into proof. Dedicated exploit agents receive the hypothesized paths and attempt to execute real-world attacks using browser automation, command-line tools, and custom scripts. This phase enforces a strict "No Exploit, No Report" policy: if a hypothesis cannot be successfully exploited to demonstrate impact, it is discarded as a false positive.
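
The "No Exploit, No Report" rule amounts to a filter over the hypotheses from the previous phase. The sketch below assumes a made-up record format of `id|status`, one hypothesis per line, with the status `exploited` marking a successful attack; Shannon's internal representation is not described in this README.

```shell
# Illustration of the policy (the "id|status" format is an assumption):
# keep only hypotheses whose exploit attempt succeeded, discard the rest.
filter_proven() {
  awk -F'|' '$2 == "exploited" { print $1 }'
}

printf 'INJ-01|exploited\nXSS-02|failed\nSSRF-03|exploited\n' | filter_proven
# prints INJ-01 and SSRF-03, one per line
```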

Phase 4: Reporting

The final phase compiles all validated findings into a professional, actionable report. An agent consolidates the reconnaissance data and the successful exploit evidence, cleaning up any noise or hallucinated artifacts. Only verified vulnerabilities are included, complete with reproducible, copy-and-paste Proof-of-Concepts, delivering a final pentest-grade report focused exclusively on proven risks.
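
The control flow of the four phases (one reconnaissance step, per-category analysis and exploitation running concurrently, then a single report) can be sketched with shell background jobs. This is a structural illustration only: Shannon orchestrates real agents, whereas the `echo` calls below are stand-ins, and the category labels are shorthand for the vulnerability classes listed earlier.

```shell
# Structural sketch of the phase ordering (placeholder steps, not real agents).
run_pipeline() {
  echo "recon"                      # Phase 1 runs once, before everything else
  for category in injection xss ssrf auth; do
    (
      echo "analysis:$category"     # Phase 2 for this category
      echo "exploit:$category"      # Phase 3 follows its own analysis
    ) &                             # categories run concurrently
  done
  wait                              # block until every category has finished
  echo "report"                     # Phase 4 runs once, after all the others
}

run_pipeline
```

The order of the per-category lines varies between runs, but "recon" is always first and "report" is always last, which mirrors the fan-out and fan-in shape of the diagram above.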

📋 Coverage and Roadmap

For detailed information about Shannon's security testing coverage and development roadmap, see our Coverage and Roadmap documentation.

⚠️ Disclaimers

Important Usage Guidelines & Disclaimers

Please review the following guidelines carefully before using Shannon (Lite). As a user, you are responsible for your actions and assume all liability.

1. Potential for Mutative Effects & Environment Selection

This is not a passive scanner. The exploitation agents are designed to actively execute attacks to confirm vulnerabilities. This process can have mutative effects on the target application and its data.

[!WARNING] ⚠️ DO NOT run Shannon on production environments.

It is intended exclusively for use on sandboxed, staging, or local development environments where data integrity is not a concern.

Potential mutative effects include, but are not limited to: creating new users, modifying or deleting data, compromising test accounts, and triggering unintended side effects from injection attacks.

2. Legal & Ethical Use

Shannon is designed for legitimate security auditing purposes only.

[!CAUTION] You must have explicit, written authorization from the owner of the target system before running Shannon.

Unauthorized scanning and exploitation of systems you do not own is illegal and can be prosecuted under laws such as the Computer Fraud and Abuse Act (CFAA). Keygraph is not responsible for any misuse of Shannon.

3. LLM & Automation Caveats

Verification is Required: While significant engineering has gone into our "proof-by-exploitation" methodology to eliminate false positives, the underlying LLMs can still generate hallucinated or weakly-supported content in the final report. Human oversight is essential to validate the legitimacy and severity of all reported findings.
Comprehensiveness: The analysis in Shannon Lite may not be exhaustive due to the inherent limitations of LLM context windows. For a more comprehensive, graph-based analysis of your entire codebase, Shannon Pro leverages its advanced data flow analysis engine to ensure deeper and more thorough coverage.

4. Scope of Analysis

Targeted Vulnerabilities: The current version of Shannon Lite specifically targets the following classes of exploitable vulnerabilities:
- Broken Authentication & Authorization
- Injection
- Cross-Site Scripting (XSS)
- Server-Side Request Forgery (SSRF)
What Shannon Lite Does Not Cover: This list is not exhaustive of all potential security risks. Shannon Lite's "proof-by-exploitation" model means it will not report on issues it cannot actively exploit, such as vulnerable third-party libraries or insecure configurations. These types of deep static-analysis findings are a core focus of the advanced analysis engine in Shannon Pro.

5. Cost & Performance

Time: As of the current version, a full test run typically takes 1 to 1.5 hours to complete.
Cost: Running the full test using Anthropic's Claude Sonnet 4.5 model may incur costs of approximately $50 USD. Costs vary based on model pricing and application complexity.

6. Windows Antivirus False Positives

Windows Defender may flag files in xben-benchmark-results/ or deliverables/ as malware. These are false positives caused by exploit code in the reports. Add an exclusion for the Shannon directory in Windows Defender, or use Docker/WSL2.

📜 License

Shannon Lite is released under the GNU Affero General Public License v3.0 (AGPL-3.0).

Shannon is open source (AGPL v3). This license allows you to:

Use it freely for all internal security testing.
Modify the code privately for internal use without sharing your changes.

The AGPL's sharing requirements primarily apply to organizations offering Shannon as a public or managed service (such as a SaaS platform). In those specific cases, any modifications made to the core software must be open-sourced.

👥 Community & Support

Community Resources

Contributing: At this time, we’re not accepting external code contributions (PRs).
Issues are welcome for bug reports and feature requests.

🐛 Report bugs via GitHub Issues
💡 Suggest features in Discussions
💬 Join our Discord for real-time community support

Stay Connected

🐦 Twitter: @KeygraphHQ
💼 LinkedIn: Keygraph
🌐 Website: keygraph.io

💬 Get in Touch

Interested in Shannon Pro?

Shannon Pro is designed for organizations serious about application security. It offers enterprise-grade features, dedicated support, and seamless CI/CD integration, all powered by our most advanced LLM-based analysis engine. Find and fix complex vulnerabilities deep in your codebase before they ever reach production.

For a detailed breakdown of features, technical differences, and enterprise use cases, see our complete comparison guide.

Or contact us directly:

📧 Email: [email protected]

Built with ❤️ by the Keygraph team
Making application security accessible to everyone

Alternative AI tools for shannon

Similar Open Source Tools

shannon

github

: 20.5k

OpenViking

OpenViking is an open-source Context Database designed specifically for AI Agents. It aims to solve challenges in agent development by unifying memories, resources, and skills in a filesystem management paradigm. The tool offers tiered context loading, directory recursive retrieval, visualized retrieval trajectory, and automatic session management. Developers can interact with OpenViking like managing local files, enabling precise context manipulation and intuitive traceable operations. The tool supports various model services like OpenAI and Volcengine, enhancing semantic retrieval and context understanding for AI Agents.

github

: 1.1k

tambourine-voice

Tambourine is a personal voice interface tool that allows users to speak naturally and have their words appear wherever the cursor is. It is powered by customizable AI voice dictation, providing a universal voice-to-text interface for emails, messages, documents, code editors, and terminals. Users can capture ideas quickly, type at the speed of thought, and benefit from AI formatting that cleans up speech, adds punctuation, and applies personal dictionaries. Tambourine offers full control and transparency, with the ability to customize AI providers, formatting, and extensions. The tool supports dual-mode recording, real-time speech-to-text, LLM text formatting, context-aware formatting, customizable prompts, and more, making it a versatile solution for dictation and transcription tasks.

github

: 258

cai

github

: 4.2k

BioAgents

BioAgents AgentKit is an advanced AI agent framework tailored for biological and scientific research. It offers powerful conversational AI capabilities with specialized knowledge in biology, life sciences, and scientific research methodologies. The framework includes state-of-the-art analysis agents, configurable research agents, and a variety of specialized agents for tasks such as file parsing, research planning, literature search, data analysis, hypothesis generation, research reflection, and user-facing responses. BioAgents also provides support for LLM libraries, multiple search backends for literature agents, and two backends for data analysis. The project structure includes backend source code, services for chat, job queue system, real-time notifications, and JWT authentication, as well as a frontend UI built with Preact.

github

: 80

marvin

Marvin is a lightweight AI toolkit for building natural language interfaces that are reliable, scalable, and easy to trust. Each of Marvin's tools is simple and self-documenting, using AI to solve common but complex challenges like entity extraction, classification, and generating synthetic data. Each tool is independent and incrementally adoptable, so you can use them on their own or in combination with any other library. Marvin is also multi-modal, supporting both image and audio generation as well using images as inputs for extraction and classification. Marvin is for developers who care more about _using_ AI than _building_ AI, and we are focused on creating an exceptional developer experience. Marvin users should feel empowered to bring tightly-scoped "AI magic" into any traditional software project with just a few extra lines of code. Marvin aims to merge the best practices for building dependable, observable software with the best practices for building with generative AI into a single, easy-to-use library. It's a serious tool, but we hope you have fun with it. Marvin is open-source, free to use, and made with 💙 by the team at Prefect.

github

: 5.9k

llmos

LLMos is an operating system designed for physical AI agents, providing a hybrid runtime environment where AI agents can perceive, reason, act on hardware, and evolve over time locally without cloud dependency. It allows natural language programming, dual-brain architecture for fast instinct and deep planner brains, markdown-as-code for defining agents and skills, and supports swarm intelligence and cognitive world models. The tool is built on a tech stack including Next.js, Electron, Python, and WebAssembly, and is structured around a dual-brain cognitive architecture, volume system, HAL for hardware abstraction, applet system for dynamic UI, and dreaming & evolution for robot improvement. The project is in Phase 1 (Foundation) and aims to move into Phase 2 (Dual-Brain & Local Intelligence), with contributions welcomed under the Apache 2.0 license by Evolving Agents Labs.

github

: 68

mcp-gateway-registry

The MCP Gateway & Registry is a unified, enterprise-ready platform that centralizes access to both MCP Servers and AI Agents using the Model Context Protocol (MCP). It serves as a Unified MCP Server Gateway, MCP Servers Registry, and Agent Registry & A2A Communication Hub. The platform integrates with external registries, providing a single control plane for tool access, agent orchestration, and communication patterns. It transforms the chaos of managing individual MCP server configurations into an organized approach with secure, governed access to curated servers and registered agents. The platform supports dynamic tool discovery, autonomous agent communication, and unified policies for server and agent access.

github

: 426

OpenManus

OpenManus is an open-source project aiming to replicate the capabilities of the Manus AI agent, known for autonomously executing complex tasks like travel planning and stock analysis. The project provides a modular, containerized framework using Docker, Python, and JavaScript, allowing developers to build, deploy, and experiment with a multi-agent AI system. Features include collaborative AI agents, Dockerized environment, task execution support, tool integration, modular design, and community-driven development. Users can interact with OpenManus via CLI, API, or web UI, and the project welcomes contributions to enhance its capabilities.

github

: 228

ramparts

Ramparts is a fast, lightweight security scanner designed for the Model Context Protocol (MCP) ecosystem. It scans MCP servers to identify vulnerabilities and provides security features such as discovering capabilities, multi-transport support, session management, static analysis, cross-origin analysis, LLM-powered analysis, and risk assessment. The tool is suitable for developers, MCP users, and MCP developers to ensure the security of their connections. It can be used for security audits, development testing, CI/CD integration, and compliance with security requirements for AI agent deployments.

github

: 58

probe

Probe is an AI-friendly, fully local, semantic code search tool designed to power the next generation of AI coding assistants. It combines the speed of ripgrep with the code-aware parsing of tree-sitter to deliver precise results with complete code blocks, making it perfect for large codebases and AI-driven development workflows. Probe is fully local, keeping code on the user's machine without relying on external APIs. It supports multiple languages, offers various search options, and can be used in CLI mode, MCP server mode, AI chat mode, and web interface. The tool is designed to be flexible, fast, and accurate, providing developers and AI models with full context and relevant code blocks for efficient code exploration and understanding.

github

: 110

ccprompts

ccprompts is a collection of ~70 Claude Code commands for software development workflows with agent generation capabilities. It includes safety validation and can be used directly with Claude Code or adapted for specific needs. The agent template system provides a wizard for creating specialized sub-agents (e.g., security auditors, systems architects) with standardized formatting and proper tool access. The repository is under active development, so caution is advised when using it in production environments.

github

: 51

paelladoc

PAELLADOC is an intelligent documentation system that uses AI to analyze code repositories and generate comprehensive technical documentation. It offers a modular architecture with MECE principles, interactive documentation process, key features like Orchestrator and Commands, and a focus on context for successful AI programming. The tool aims to streamline documentation creation, code generation, and product management tasks for software development teams, providing a definitive standard for AI-assisted development documentation.

github

: 221

c4-genai-suite

C4-GenAI-Suite is a comprehensive AI tool for generating code snippets and automating software development tasks. It leverages advanced machine learning models to assist developers in writing efficient and error-free code. The suite includes features such as code completion, refactoring suggestions, and automated testing, making it a valuable asset for enhancing productivity and code quality in software development projects.

github

: 164

llmxcpg

LLMxCPG is a framework for vulnerability detection using Code Property Graphs (CPG) and Large Language Models (LLM). It involves a two-phase process: Slice Construction where an LLM generates queries for a CPG to extract a code slice, and Vulnerability Detection where another LLM classifies the code slice as vulnerable or safe. The repository includes implementations of baseline models, information on datasets, scripts for running models, prompt templates, query generation examples, and configurations for fine-tuning models.

github

: 111

trendFinder

Trend Finder is a tool designed to help users stay updated on trending topics on social media by collecting and analyzing posts from key influencers. It sends Slack notifications when new trends or product launches are detected, saving time, keeping users informed, and enabling quick responses to emerging opportunities. The tool features AI-powered trend analysis, social media and website monitoring, instant Slack notifications, and scheduled monitoring using cron jobs. Built with Node.js and Express.js, Trend Finder integrates with Together AI, Twitter/X API, Firecrawl, and Slack Webhooks for notifications.

github

: 2.2k

For similar tasks

shannon

github

: 20.5k

PromptFuzz

**Description:** PromptFuzz is an automated tool that generates high-quality fuzz drivers for libraries via a fuzz loop built on mutating LLM prompts. The fuzz loop guides prompt mutation toward programs that cover more reachable code and explore complex API interrelationships, both of which make fuzzing more effective.

**Features:**

* **Multiple LLM support**: Supports general-purpose LLMs: Codex, Incoder, ChatGPT, and GPT4 (currently tested on ChatGPT).
* **Context-based Prompt**: Constructs LLM prompts from automatically extracted library context.
* **Powerful Sanitization**: Analyzes each program's syntax, semantics, behavior, and coverage to filter out problematic programs.
* **Prioritized Mutation**: Prioritizes mutating the library API combinations within LLM prompts to explore complex interrelationships, guided by code coverage.
* **Fuzz Driver Exploitation**: Infers API constraints using statistics and extends fixed API arguments to receive random bytes from fuzzers.
* **Fuzz engine integration**: Integrates with the grey-box fuzz engine LibFuzzer.

**Benefits:**

* **High branch coverage:** The fuzz drivers generated by PromptFuzz achieved a branch coverage of 40.12% on the tested libraries, which is 1.61x greater than _OSS-Fuzz_ and 1.67x greater than _Hopper_.
* **Bug detection:** PromptFuzz detected 33 valid security bugs from 49 unique crashes.
* **Wide range of bugs:** The fuzz drivers generated by PromptFuzz can detect a wide range of bugs, most of which are security bugs.
* **Unique bugs:** PromptFuzz detects uniquely interesting bugs that other fuzzers may miss.

**Usage:**

1. Build the library using the provided build scripts.
2. Export the LLM API key if using ChatGPT or GPT4.
3. Generate fuzz drivers using the `fuzzer` command.
4. Run the fuzz drivers using the `harness` command.
5. Deduplicate and analyze the reported crashes.

**Future Work:**

* **Custom LLM support:** Support custom LLMs.
* **Closed-source libraries:** Apply PromptFuzz to closed-source libraries by fine-tuning LLMs on a private code corpus.
* **Performance**: Reduce the large time cost of eliminating erroneous programs.

github

: 230

awesome-gpt-security

Awesome GPT + Security is a curated list of security tools, experimental cases, and other interesting projects involving LLMs or GPT. It includes tools for integrated security, auditing, reconnaissance, offensive security, detecting security issues, preventing security breaches, social engineering, reverse engineering, investigating security incidents, fixing security vulnerabilities, assessing security posture, and more. The list also includes experimental cases, academic research, blogs, and fun projects related to GPT security. Additionally, it provides resources on GPT security standards, bypassing security policies, bug bounty programs, cracking GPT APIs, and plugin security.

github

: 459

SWE-agent

SWE-agent is a tool that allows language models to autonomously fix issues in GitHub repositories, perform tasks on the web, find cybersecurity vulnerabilities, and handle custom tasks. It uses configurable agent-computer interfaces (ACIs) to interact with isolated computer environments. The tool is built and maintained by researchers from Princeton University and Stanford University.

github

: 17.4k

jadx-ai-mcp

JADX-AI-MCP is a plugin for the JADX decompiler that integrates with Model Context Protocol (MCP) to provide live reverse engineering support with LLMs like Claude. It allows for quick analysis, vulnerability detection, and AI code modification, all in real time. The tool combines JADX-AI-MCP and JADX MCP SERVER to analyze Android APKs. It offers prompts for code understanding, vulnerability detection, reverse engineering helpers, static analysis, AI code modification, and documentation. The tool is part of the Zin MCP Suite, which aims to connect all Android reverse engineering and APK modification tools through a single MCP server.

github

: 493

For similar jobs

hackingBuddyGPT

hackingBuddyGPT is a framework for evaluating LLM-based agents in security testing. It aims to establish a common ground truth by building shared security testbeds and benchmarks, evaluating multiple LLMs and techniques against them, and publishing prototypes and findings as open-source/open-access reports. The initial focus is on evaluating the efficiency of LLMs for Linux privilege escalation attacks, but the framework is being expanded to cover web penetration testing and web API testing. hackingBuddyGPT is released as open source to level the playing field for blue teams against APTs that have access to more sophisticated resources.

github

: 374

aircrackauto

AirCrackAuto is a tool that automates the aircrack-ng process for Wi-Fi hacking. It is designed to make it easier for users to crack Wi-Fi passwords by automating the process of capturing packets, generating wordlists, and launching attacks. AirCrackAuto is a powerful tool that can be used to crack Wi-Fi passwords in a matter of minutes.

github

: 79

AIMr

AIMr is an AI aimbot tool written in Python that leverages modern technologies to achieve an undetected system with a pleasing appearance. It works on any game that uses human-shaped models. To optimize its performance, users should build OpenCV with CUDA. For Valorant, additional perks in the Discord and an Arduino Leonardo R3 are required.

github

: 229

aircrack-ng

Aircrack-ng is a comprehensive suite of tools designed to evaluate the security of WiFi networks. It covers various aspects of WiFi security, including monitoring, attacking (replay attacks, deauthentication, fake access points), testing WiFi cards and driver capabilities, and cracking WEP and WPA PSK. The tools are command line-based, allowing for extensive scripting and have been utilized by many GUIs. Aircrack-ng primarily works on Linux but also supports Windows, macOS, FreeBSD, OpenBSD, NetBSD, Solaris, and eComStation 2.

github

: 5.2k

Awesome_GPT_Super_Prompting

Awesome_GPT_Super_Prompting is a repository that provides resources related to Jailbreaks, Leaks, Injections, Libraries, Attack, Defense, and Prompt Engineering. It includes information on ChatGPT Jailbreaks, GPT Assistants Prompt Leaks, GPTs Prompt Injection, LLM Prompt Security, Super Prompts, Prompt Hack, Prompt Security, AI Prompt Engineering, and Adversarial Machine Learning. The repository contains curated lists of repositories, tools, and resources related to GPTs, prompt engineering, prompt libraries, and secure prompting. It also offers insights into Cyber-Albsecop GPT Agents and Super Prompts for custom GPT usage.

github

: 2.0k

ai-exploits

AI Exploits is a repository that showcases practical attacks against AI/Machine Learning infrastructure, aiming to raise awareness about vulnerabilities in the AI/ML ecosystem. It contains exploits and scanning templates for responsibly disclosed vulnerabilities affecting machine learning tools, including Metasploit modules, Nuclei templates, and CSRF templates. Users can use the provided Docker image to easily run the modules and templates. The repository also provides guidelines for using Metasploit modules, Nuclei templates, and CSRF templates to exploit vulnerabilities in machine learning tools.

github

: 1.3k

airgeddon

Airgeddon is a versatile bash script designed for Linux systems to conduct wireless network audits. It provides a comprehensive set of features and tools for auditing and securing wireless networks. The script is user-friendly and offers functionalities such as scanning, capturing handshakes, deauth attacks, and more. Airgeddon is regularly updated and supported, making it a valuable tool for both security professionals and enthusiasts.

github

: 6.8k

PentestGPT

PentestGPT is a penetration testing tool empowered by ChatGPT, designed to automate the penetration testing process. It operates interactively to guide penetration testers in overall progress and specific operations. The tool supports solving easy to medium HackTheBox machines and other CTF challenges. Users can use PentestGPT to perform tasks like testing connections, using different reasoning models, discussing with the tool, searching on Google, and generating reports. It also supports local LLMs with custom parsers for advanced users.

github

: 8.0k