pentest-agent

PentestAgent is a novel LLM-driven penetration testing framework to automate intelligence gathering, vulnerability analysis, and exploitation stages, reducing manual intervention. For more information, read our paper at https://dl.acm.org/doi/10.1145/3708821.3733882

Stars: 71


Pentest Agent is a lightweight and versatile tool designed for conducting penetration testing on network systems. It provides a user-friendly interface for scanning, identifying vulnerabilities, and generating detailed reports. The tool is highly customizable, allowing users to define specific targets and parameters for testing. Pentest Agent is suitable for security professionals and ethical hackers looking to assess the security posture of their systems and networks.

README:

PentestAgent

Overview

PentestAgent is a novel LLM-driven penetration testing framework to automate intelligence gathering, vulnerability analysis, and exploitation stages, reducing manual intervention.

The framework is modular and consists of the following components:

  • Reconnaissance Agent: Gathers intelligence about the target system.
  • Planning Agent: Identifies and prioritizes vulnerabilities and potential exploits.
  • Execution Agent: Attempts to execute selected exploits in a controlled environment.

For further information, please refer to our paper.


🔧 Installation & Setup

Note: We recommend deploying this project on a Kali Linux environment for better compatibility with penetration testing tools and workflows.

1. Clone the Repository

git clone https://github.com/nbshenxm/pentest-agent.git
cd pentest-agent

2. Set Environment Variables

Several environment variables must be set before running the agents. If you are not familiar with environment variables, define them in the .env file instead.

Required:

  • PDCP_API_KEY: ProjectDiscovery API key for accessing CVE data and vulnerability information.
  • GITLAB_TOKEN: GitLab token for ExploitDB access.
  • GITHUB_KEY: GitHub token for searching repositories and issues.
  • INDEX_STORAGE_DIR: Directory to store vector indexes for RAG.
  • PLANNING_OUTPUT_DIR: Directory to save planning results.
  • LOG_DIR: Directory to store logs.

Optional:

  • http_proxy, https_proxy: If using a proxy or VPN.
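Before starting an agent, it can help to confirm that every required variable is present. The following is a minimal sketch, not part of the project: the helper name missing_env_vars is hypothetical, and only the variable names are taken from the list above.

```python
import os

# Names copied from the "Required" list above.
REQUIRED_VARS = [
    "PDCP_API_KEY",
    "GITLAB_TOKEN",
    "GITHUB_KEY",
    "INDEX_STORAGE_DIR",
    "PLANNING_OUTPUT_DIR",
    "LOG_DIR",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

An empty value counts as missing here, since an empty API key would fail later anyway.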

3. Install Python Dependencies

Python version: 3.12

Use a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt

or with Conda:

conda create -n pentest python=3.12
conda activate pentest
python -m pip install -r requirements.txt

4. Install CVEMAP

CVEMAP is needed to fetch CVE-related information. Follow their installation instructions.
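A quick way to confirm the installation is to check that the executable is on PATH. This sketch assumes the binary is named cvemap; if your installation uses a different name, adjust the argument. The helper tool_available is hypothetical.

```python
import shutil


def tool_available(name):
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


# "cvemap" is an assumed binary name; change it if your install differs.
if tool_available("cvemap"):
    print("cvemap found on PATH")
else:
    print("cvemap not found; check the installation instructions")
```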


⚙️ Configuration

File: pentest_agent/configs/config.yaml

(1) models

Specify the LLM provider, model name, temperature, and API key.

(2) cve

Set the model used for parsing CVE entries and its generation temperature.

(3) cve_scoring

Scoring criteria for evaluating CVEs:

  • Vulnerability type
  • Exploit maturity
  • Remote exploitability
  • Attack complexity
  • Source weighting (ExploitDB, GitHub, Google)
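To illustrate how criteria like these could combine into a single ranking value, here is a minimal weighted-sum sketch. It is not the project's scoring code: the weights, the 0 to 1 scales, the CveCandidate class, and the placeholder CVE IDs are all assumptions made for the example. The real values are whatever you set under cve_scoring in config.yaml.

```python
from dataclasses import dataclass

# Hypothetical weights per criterion (they sum to 1.0).
CRITERIA_WEIGHTS = {
    "vuln_type": 0.25,
    "exploit_maturity": 0.25,
    "remote_exploitability": 0.20,
    "attack_complexity": 0.15,
    "source": 0.15,
}
# Hypothetical trust level per exploit source.
SOURCE_WEIGHTS = {"exploitdb": 1.0, "github": 0.8, "google": 0.5}


@dataclass
class CveCandidate:
    cve_id: str
    vuln_type: float              # 0..1, how well the type matches the focus
    exploit_maturity: float       # 0..1, higher means a more mature exploit
    remote_exploitability: float  # 1.0 if remotely exploitable, else 0.0
    attack_complexity: float      # 0..1, higher means easier to exploit
    source: str                   # "exploitdb", "github", or "google"


def score(candidate):
    """Weighted sum of the criteria; unknown sources contribute 0."""
    parts = {
        "vuln_type": candidate.vuln_type,
        "exploit_maturity": candidate.exploit_maturity,
        "remote_exploitability": candidate.remote_exploitability,
        "attack_complexity": candidate.attack_complexity,
        "source": SOURCE_WEIGHTS.get(candidate.source, 0.0),
    }
    return sum(CRITERIA_WEIGHTS[name] * value for name, value in parts.items())


# Placeholder IDs, not real CVEs.
candidates = [
    CveCandidate("CVE-0000-0002", 0.5, 0.2, 0.0, 0.3, "google"),
    CveCandidate("CVE-0000-0001", 1.0, 1.0, 1.0, 1.0, "exploitdb"),
]
ranked = sorted(candidates, key=score, reverse=True)
print([c.cve_id for c in ranked])
```

A weighted sum keeps each criterion's influence easy to tune from a config file, which is why it is used for this sketch.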

(4) runtime

Reconnaissance Agent:

  • current_topic: Topic identifier for current CVE task.
  • target_ip: IP address of the target host.

Planning Agent:

  • model: LLM model used for searching exploits and analyzing vulnerability data.
  • keyword, app, version: Target application details.
  • vuln_type: Type of vulnerability to focus on.
  • cvemap_fuzzy_search: Enable fuzzy search for CVE matching.
  • output_dir: Directory to save analysis results.

Execution Agent:

  • current_topic: Task/topic identifier.
  • doc_dir: Directory containing exploit scripts or documents.
  • target_ip, target_port: IP and port of target host.
  • attacker_ip: IP of attacker's machine.
  • command_to_execute: Payload to validate exploitation.
  • model: LLM model used for exploit execution guidance.
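Since each agent reads several runtime fields, a small pre-flight check can catch an incomplete configuration early. The sketch below works on a plain dictionary, such as one loaded from config.yaml. The field names come from the lists above, but the grouping keys (recon, planning, execution) and the helper check_runtime are assumptions; the real file layout may differ.

```python
import ipaddress

# Field names from the runtime section; the grouping keys are assumed.
RUNTIME_FIELDS = {
    "recon": ["current_topic", "target_ip"],
    "planning": [
        "model", "keyword", "app", "version",
        "vuln_type", "cvemap_fuzzy_search", "output_dir",
    ],
    "execution": [
        "current_topic", "doc_dir", "target_ip", "target_port",
        "attacker_ip", "command_to_execute", "model",
    ],
}
IP_FIELDS = {"target_ip", "attacker_ip"}


def check_runtime(runtime):
    """Return a list of problems; an empty list means nothing is missing."""
    problems = []
    for agent, fields in RUNTIME_FIELDS.items():
        section = runtime.get(agent, {})
        for field in fields:
            value = section.get(field)
            if value is None or value == "":
                problems.append(f"{agent}.{field} is not set")
            elif field in IP_FIELDS:
                try:
                    ipaddress.ip_address(str(value))
                except ValueError:
                    problems.append(f"{agent}.{field} is not a valid IP address")
    return problems


# 192.0.2.10 is from a range reserved for documentation.
example = {"recon": {"current_topic": "demo", "target_ip": "192.0.2.10"}}
for problem in check_runtime(example):
    print(problem)
```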

🚀 Running the Agents (Manual)

Reconnaissance Agent

  • File: pentest_agent/agents/recon_agent.py
  • Function: Given a target IP, gathers system and service info.
  • Usage: Set the topic, LLM model, and IP, then run the script.
python pentest_agent/agents/recon_agent.py

Planning Agent

  • File: pentest_agent/agents/planning_agent.py
  • Function: Identifies relevant CVEs and associated exploits from multiple sources.
  • Sources:
    • GitHub repositories and issues
    • ExploitDB entries
    • Google search results
  • Features: Multi-source intelligence aggregation with configurable LLM backends
  • Usage: Set the model and application information.
python pentest_agent/agents/planning_agent.py

Execution Agent

  • File: pentest_agent/agents/execution_agent.py
  • Function: Executes selected exploits based on previous analysis and collected context.
  • Usage: Set the topic, exploit document path, and target info.
python pentest_agent/agents/execution_agent.py
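The three manual steps above can also be chained from one script. This is a hedged sketch rather than a supported entry point: it assumes config.yaml already holds the values every agent needs (including the exploit document path for the Execution Agent) and that it is run from the repository root. The helpers build_commands and run_pipeline are hypothetical.

```python
import subprocess
import sys

# Script paths as listed in the sections above, in pipeline order.
AGENT_SCRIPTS = [
    "pentest_agent/agents/recon_agent.py",
    "pentest_agent/agents/planning_agent.py",
    "pentest_agent/agents/execution_agent.py",
]


def build_commands(python=sys.executable):
    """Return one argv list per agent, using the given interpreter."""
    return [[python, script] for script in AGENT_SCRIPTS]


def run_pipeline(commands, runner=subprocess.run):
    """Run commands in order and stop at the first non-zero exit code.

    Returns the failing command, or None if every command succeeded.
    """
    for command in commands:
        result = runner(command)
        if result.returncode != 0:
            return command
    return None


if __name__ == "__main__":
    failed = run_pipeline(build_commands())
    if failed is not None:
        print("Stopped after failure: " + " ".join(failed))
```

Stopping at the first failure avoids running the Execution Agent against stale or missing planning output.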

🐳 Docker Deployment

PentestAgent provides Docker support for isolated execution of each agent.

0. Pre-Configuration

Step 1: Edit pentest_agent/configs/config.yaml

Configure all agent parameters under the models, cve, cve_scoring, and runtime sections.

Step 2: Configure .env in pentest_agent/docker

Example .env content:

GITHUB_KEY=your_github_token
OPENAI_API_KEY=your_openai_key
HUGGING_FACE_TOKEN=your_hf_token
INDEX_STORAGE_DIR=/path/to/indexes
PLANNING_OUTPUT_DIR=/path/to/output
LOG_DIR=/path/to/logs
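It is easy to leave one of the example placeholders in place. The sketch below parses .env-style text and lists keys whose values still look like the placeholders shown above. The helper names and the placeholder prefixes are assumptions for illustration, and the parser handles only simple KEY=VALUE lines, not quoting or variable expansion.

```python
# Prefixes used by the example values above (assumed to mark placeholders).
PLACEHOLDER_PREFIXES = ("your_", "/path/to/")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unedited_keys(values):
    """Return keys whose values still start with a placeholder prefix."""
    return sorted(
        key for key, value in values.items()
        if value.startswith(PLACEHOLDER_PREFIXES)
    )


sample = """
GITHUB_KEY=your_github_token
LOG_DIR=/var/log/pentest
"""
print(unedited_keys(parse_env(sample)))
```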

1. Start and Run Reconnaissance Agent

cd pentest_agent/docker
docker-compose up --build -d recon

2. Start and Run Planning Agent

cd pentest_agent/docker
docker-compose up --build -d planning

3. Start and Run Execution Agent

cd pentest_agent/docker
docker-compose up --build -d execution

📊 Benchmark & Evaluation

Infrastructure

We adopt Vulhub for evaluating the system. Vulhub provides Docker-based vulnerable environments with real-world CVEs.

Target Selection

We select vulnerabilities based on the following criteria:

  • Must have a valid CVE ID
  • Must include a CVSS v3.x score
  • Additional labels include:
    • CWE ID
    • Exploitability sub-score
    • Difficulty levels derived from the CVSS vector
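The selection criteria above can be sketched as a small filter. The entry layout (cve_id, cvss_score, cvss_vector keys), the helper names, and especially the difficulty rule are assumptions for illustration; the source does not specify how difficulty is derived from the CVSS vector, so the rule here simply counts hurdles in the AC, PR, and UI metrics. The CVE ID used is a placeholder.

```python
import re

# CVE IDs have the form CVE-YYYY-NNNN, with four or more digits at the end.
CVE_ID_RE = re.compile(r"^CVE-\d{4}-\d{4,}$")


def parse_cvss_vector(vector):
    """Split 'CVSS:3.1/AV:N/AC:L/...' into (version, metrics dict)."""
    head, *rest = vector.split("/")
    version = head.partition(":")[2]
    metrics = dict(item.split(":", 1) for item in rest)
    return version, metrics


def is_eligible(entry):
    """Require a well-formed CVE ID plus a CVSS v3.x score and vector."""
    has_cve_id = bool(CVE_ID_RE.match(entry.get("cve_id", "")))
    vector = entry.get("cvss_vector", "")
    has_v3_score = vector.startswith("CVSS:3.") and entry.get("cvss_score") is not None
    return has_cve_id and has_v3_score


def difficulty(vector):
    """Hypothetical rule: more hurdles (AC, PR, UI) means harder."""
    _, metrics = parse_cvss_vector(vector)
    hurdles = sum([
        metrics.get("AC") == "H",         # high attack complexity
        metrics.get("PR") in ("L", "H"),  # some privileges required
        metrics.get("UI") == "R",         # user interaction required
    ])
    return ["easy", "medium", "hard", "hard"][hurdles]


entry = {
    "cve_id": "CVE-2099-0001",  # placeholder ID, not a real CVE
    "cvss_score": 9.8,
    "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
}
if is_eligible(entry):
    print(entry["cve_id"], difficulty(entry["cvss_vector"]))
```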

Our results

It has been a while since we performed our evaluation. We are working on adding new scenarios to the benchmark beyond Vulhub, as well as evaluating PentestAgent on a variety of advanced LLM backbones. We will publish our benchmark results once this work is finished.


🤝 Contribution

Feel free to open an issue if you:

  • Encounter any bugs
  • Have suggestions for improvement
  • Would like to contribute features or benchmarks

We welcome community contributions!
