ProphetFuzz

[CCS'24] An LLM-based, fully automated fuzzing tool for option combination testing.

ProphetFuzz is a fully automated fuzzing tool based on Large Language Models (LLM) for testing high-risk option combinations with only documentation. It can predict and conduct fuzzing on high-risk option combinations without manual intervention. The tool consists of components for parsing documentation, extracting constraints, predicting combinations, assembling commands, generating files, and conducting fuzzing. ProphetFuzz has been used to uncover numerous vulnerabilities in various programs, earning CVE numbers for confirmed vulnerabilities. The tool has been credited to Dawei Wang and Geng Zhou.

README:

ProphetFuzz

The implementation of the paper titled "ProphetFuzz: Fully Automated Prediction and Fuzzing of High-Risk Option Combinations with Only Documentation via Large Language Model"

ProphetFuzz is an LLM-based, fully automated fuzzing tool for option combination testing. ProphetFuzz can predict and conduct fuzzing on high-risk option combinations with only documentation, and the entire process operates without manual intervention.

For more details, please refer to our paper from ACM CCS'24.

Due to page limitations, the Appendix could not be included in the main text of the paper. Please refer to the separate Appendix.

Structure

.
├── Dockerfile
├── README.md
├── assets
│   ├── dataset
│   │   ├── groundtruth_for_20_programs.json
│   │   └── precision.json
│   └── images
├── fuzzing_handler
│   ├── cmd_fixer.py
│   ├── code_checker.py
│   ├── config.json
│   ├── run_cmin.py
│   ├── run_fuzzing.sh
│   └── utils
│       ├── analysis_util.py
│       ├── code_utils.py
│       └── execution_util.py
├── llm_interface
│   ├── assemble.py
│   ├── config
│   │   └── .env
│   ├── constraint.py
│   ├── few-shot
│   │   ├── manpage_htmldoc.json
│   │   ├── manpage_jbig2.json
│   │   ├── manpage_jhead.json
│   │   ├── manpage_makeswf.json
│   │   ├── manpage_mp4box.json
│   │   ├── manpage_opj_compress.json
│   │   ├── manpage_pdf2swf.json
│   │   └── manpage_yasm.json
│   ├── few-shot_generate.py
│   ├── input
│   ├── output
│   ├── predict.py
│   ├── restruct_manpage.py
│   └── utils
│       ├── gpt_utils.py
│       └── opt_utils.py
├── manpage_parser
│   ├── input
│   ├── output
│   ├── parser.py
│   └── utils
│       └── groff_utils.py
└── run_all_in_one.sh

- manpage_parser: Scripts for parsing documentation.
- llm_interface: Scripts for extracting constraints, predicting high-risk option combinations, and assembling commands.
- fuzzing_handler: Scripts for preparing and conducting fuzzing.
- assets/dataset: Dataset for evaluating the constraint extraction module.
- run_all_in_one.sh: Script for running the entire pipeline in one step.
- Dockerfile: Builds our experiment environment (tested on Ubuntu 20.04).

The implementations of the various ProphetFuzz components can be found in the following functions:

Section	Component	File	Function
3.2	Constraint Extraction	llm_interface/constraint.py	extractRelationships
3.2	Self Check	llm_interface/constraint.py	checkRelationships
3.3	AutoCoT	llm_interface/few-shot_generate.py	generatePrompt
3.3	High-Risk Combination Prediction	llm_interface/predict.py	predictCombinations
3.4	Command Assembly	llm_interface/assemble.py	generateCommands
3.5	File Generation	fuzzing_handler/generate_combination.py	main
3.5	Corpus Minimization	fuzzing_handler/run_cmin.py	runCMinCommands
3.5	Fuzzing	fuzzing_handler/run_fuzzing.sh	runFuzzing
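
To make the hand-off between constraint extraction and combination prediction more concrete, here is a minimal sketch. It is not the code in `llm_interface/constraint.py`. It assumes a hypothetical constraint format with `conflicts` (options that cannot be used together) and `dependencies` (an option that requires others), and shows how a candidate combination could be checked before a command is assembled. The function name `is_valid_combination` and the option names are illustrative only.

```python
from itertools import combinations

# Hypothetical constraint format; the real output of constraint.py may differ.
conflicts = {frozenset({"-a", "-b"})}   # options that cannot appear together
dependencies = {"-c": {"-d"}}           # "-c" requires "-d"


def is_valid_combination(options, conflicts, dependencies):
    """Return True if the option set violates no conflict or dependency."""
    chosen = set(options)
    # Reject any pair of chosen options that is listed as conflicting.
    for pair in combinations(sorted(chosen), 2):
        if frozenset(pair) in conflicts:
            return False
    # Every chosen option must have all of its required options present.
    for opt in chosen:
        if not dependencies.get(opt, set()) <= chosen:
            return False
    return True


print(is_valid_combination(["-a", "-c", "-d"], conflicts, dependencies))  # True
print(is_valid_combination(["-a", "-b"], conflicts, dependencies))        # False
print(is_valid_combination(["-c"], conflicts, dependencies))              # False
```

Filtering of this kind would keep fuzzing time away from commands that the program rejects during argument parsing.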

Usage Example

Using Docker to Configure the Running Environment
- If you only want to complete the part that interacts with the LLM, you can directly use our pre-installed image (4GB):
```
docker run -it 4ugustus/prophetfuzz_base bash
```
- If you want to complete the entire process, including seed generation, command repair, and fuzzing, please build the full image based on the pre-installed image:
```
docker build -t prophetfuzz:latest .
docker run -it --privileged=true prophetfuzz bash
# 'privileged' is used for setting up the fuzzing environment
```
Set Up Your API Key: Set your OpenAI API key in the llm_interface/config/.env file:
```
OPENAI_API_KEY="[Input Your API Key Here]"
```
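
Before starting a long run, it can help to confirm that the key was actually filled in. The sketch below uses only the Python standard library to read a `KEY="value"` file such as `llm_interface/config/.env` and to detect the unedited template placeholder. The helper names `read_env_value` and `api_key_is_set` are hypothetical, and the project itself may load the key differently (for example through a dotenv library).

```python
import os
import tempfile


def read_env_value(path, key):
    """Return the value assigned to `key` in a simple KEY="value" file, or None."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == key:
                return value.strip().strip('"').strip("'")
    return None


def api_key_is_set(path):
    """True if OPENAI_API_KEY is non-empty and not the bracketed placeholder."""
    value = read_env_value(path, "OPENAI_API_KEY")
    return bool(value) and not value.startswith("[")


# Demo against a temporary file holding the unedited template line.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write('OPENAI_API_KEY="[Input Your API Key Here]"\n')
print(api_key_is_set(tmp.name))  # False: the placeholder is still present
os.remove(tmp.name)
```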
Run the Script: Execute the script to start the automated fuzzing process:
```
bash run_all_in_one.sh bison
```
Note: If you are not within our Docker environment, you might need to manually install dependencies and adjust the fuzzing_handler/config.json file to specify the path to the program under test.

If you prefer to start fuzzing manually, use the following command:
```
fuzzer/afl-fuzz -i fuzzing_handler/input/bison -o fuzzing_handler/output/bison_prophet_1 -m none -K fuzzing_handler/argvs/argvs_bison.txt -- path/to/bison/bin/bison @@
```
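
When fuzzing several programs by hand, the command above can be generated rather than retyped. The following sketch rebuilds exactly that argument list in Python. The flags and the bison paths are copied from the command above; reusing the same directory naming pattern for other programs is an assumption, and `build_fuzz_command` and `run_id` are hypothetical names.

```python
import shlex


def build_fuzz_command(program, binary_path, run_id=1):
    """Assemble the argv list for the manual fuzzing command shown above."""
    return [
        "fuzzer/afl-fuzz",
        "-i", f"fuzzing_handler/input/{program}",
        "-o", f"fuzzing_handler/output/{program}_prophet_{run_id}",
        "-m", "none",
        "-K", f"fuzzing_handler/argvs/argvs_{program}.txt",
        "--", binary_path, "@@",
    ]


cmd = build_fuzz_command("bison", "path/to/bison/bin/bison")
# Print a shell-ready string; pass `cmd` to subprocess.run to launch it.
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems if a path contains spaces.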

CVEs Assigned

We employ ProphetFuzz to perform persistent fuzzing on the latest versions of the programs in our dataset. To date, ProphetFuzz has uncovered 140 zero-day or half-day vulnerabilities, 93 of which have been confirmed by the developers, earning 22 CVE numbers.

CVE	Program	Type
CVE-2024-3248	xpdf	stack-buffer-overflow
CVE-2024-4853	editcap	heap-buffer-overflow
CVE-2024-4855	editcap	bad free
CVE-2024-31743	ffmpeg	segmentation violation
CVE-2024-31744	jasper	assertion failure
CVE-2024-31745	dwarfdump	use-after-free
CVE-2024-31746	objdump	heap-buffer-overflow
CVE-2024-32154	ffmpeg	segmentation violation
CVE-2024-32157	mupdf	segmentation violation
CVE-2024-32158	mupdf	negative-size-param
CVE-2024-34960	ffmpeg	floating point exception
CVE-2024-34961	pspp	segmentation violation
CVE-2024-34962	pspp	segmentation violation
CVE-2024-34963	pspp	assertion failure
CVE-2024-34965	pspp	assertion failure
CVE-2024-34966	pspp	assertion failure
CVE-2024-34967	pspp	assertion failure
CVE-2024-34968	pspp	assertion failure
CVE-2024-34969	pspp	segmentation violation
CVE-2024-34971	pspp	segmentation violation
CVE-2024-34972	pspp	assertion failure
CVE-2024-35316	ffmpeg	segmentation violation

Credit

Thanks to Dawei Wang (@4ugustus) and Geng Zhou (@Arbusz) for their valuable contributions to this project.

Citing this paper

If you would like to cite ProphetFuzz, you may use the following BibTeX entry:

@inproceedings {wang2024prophet,
  title = {ProphetFuzz: Fully Automated Prediction and Fuzzing of High-Risk Option Combinations with Only Documentation via Large Language Model},
  author = {Wang, Dawei and Zhou, Geng and Chen, Li and Li, Dan and Miao, Yukai},
  booktitle = {Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security},
  publisher = {Association for Computing Machinery},
  address = {Salt Lake City, UT, USA},
  pages = {735–749},
  year = {2024}
}
