finite-monkey-engine

AI engine for smart contract audit

Stars: 193

FiniteMonkey is an advanced vulnerability-mining engine powered purely by large language models, requiring no prior knowledge base or fine-tuning, and its effectiveness significantly surpasses most current related research approaches. The tool is task-driven and prompt-driven, focusing on prompt design and leveraging "deception" and hallucination as key mechanisms. It has helped identify vulnerabilities worth over $60,000 in bug bounties. Setup requires a PostgreSQL database, OpenAI API access, and a Python environment. It can scan Solidity, Rust, Python, Move, Cairo, Tact, Func, Java, and pseudo-Solidity code. FiniteMonkey is best suited for logic-vulnerability mining in real projects and is not recommended for academic vulnerability testing. GPT-4-turbo is recommended for optimal results, with an average scan time of 2-3 hours for medium-sized projects. The README below covers scan-result interpretation and implementation tips.

README:

FiniteMonkey

FiniteMonkey is an intelligent vulnerability mining engine based on large language models, requiring no pre-trained knowledge base or fine-tuning. Its core feature is using task-driven and prompt engineering approaches to guide models in vulnerability analysis through carefully designed prompts.

🌟 Core Concepts

  • Task-driven rather than problem-driven
  • Prompt-driven rather than code-driven
  • Focus on prompt design rather than model design
  • Leverage "deception" and hallucination as key mechanisms

🏆 Achievements

As of May 2024, this tool has helped uncover vulnerabilities worth over $60,000 in bug bounties.

🚀 Latest Updates

2024.11.19: Released version 1.0, validating the feasibility of LLM-based auditing and its productization

Earlier Updates:

  • 2024.08.02: Project renamed to finite-monkey-engine
  • 2024.08.01: Added Func, Tact language support
  • 2024.07.23: Added Cairo, Move language support
  • 2024.07.01: Updated license
  • 2024.06.01: Added Python language support
  • 2024.05.18: Reduced false positive rate by ~20%
  • 2024.05.16: Added cross-contract vulnerability confirmation
  • 2024.04.29: Added basic Rust language support

📋 Requirements

  • PostgreSQL database
  • OpenAI API access
  • Python environment

🛠️ Installation & Configuration

  1. Place project in src/dataset/agent-v1-c4 directory

  2. Configure project in datasets.json:

{
    "StEverVault2": {
        "path": "StEverVault",
        "files": [],
        "functions": []
    }
}
  3. Create the database using src/db.sql

  4. Configure .env (a verification sketch follows the sample below):

# Database connection
DATABASE_URL=postgresql://user:password@localhost:5432/dbname

# API settings
OPENAI_API_BASE="api.example.com"
OPENAI_API_KEY=sk-your-api-key-here

# Model settings
VUL_MODEL_ID=gpt-4-turbo
CLAUDE_MODEL=claude-3-5-sonnet-20240620

# Azure configuration
AZURE_API_KEY="your-azure-api-key"
AZURE_API_BASE="https://your-resource.openai.azure.com/"
AZURE_API_VERSION="2024-02-15-preview"
AZURE_DEPLOYMENT_NAME="your-deployment"

# API selection
AZURE_OR_OPENAI="OPENAI"  # Options: OPENAI, AZURE, CLAUDE

# Scan parameters
BUSINESS_FLOW_COUNT=4
SWITCH_FUNCTION_CODE=False
SWITCH_BUSINESS_CODE=True
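
Before the first scan it can save time to verify that the values above are actually reachable. The snippet below is a minimal sketch, not part of the repository: it assumes python-dotenv and psycopg2 are installed and that the variable names match the sample .env above.

# check_env.py - minimal sketch for sanity-checking the sample .env above.
# Assumptions: python-dotenv and psycopg2-binary are installed; variable
# names mirror the sample .env and are not defined by this script itself.
import os

import psycopg2
from dotenv import load_dotenv

load_dotenv()  # read .env from the current working directory

# Fail fast if the PostgreSQL connection string is wrong.
conn = psycopg2.connect(os.environ["DATABASE_URL"])
conn.close()
print("PostgreSQL connection OK")

# Check that an API key is present for the selected backend.
backend = os.getenv("AZURE_OR_OPENAI", "OPENAI")
key_var = {"OPENAI": "OPENAI_API_KEY", "AZURE": "AZURE_API_KEY"}.get(backend)
if key_var and not os.getenv(key_var):
    raise SystemExit(f"{key_var} is not set for backend {backend}")
print(f"Backend {backend} looks configured")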

🌈 Supported Languages

  • Solidity (.sol)
  • Rust (.rs)
  • Python (.py)
  • Move (.move)
  • Cairo (.cairo)
  • Tact (.tact)
  • Func (.fc)
  • Java (.java)
  • Pseudo-Solidity (.fr) - For scanning Solidity pseudocode

📊 Scan Results Guide

  1. If a scan is interrupted by network or API issues, resume it by running main.py again with the same project_id
  2. Results include detailed annotations (a filtering sketch follows this list):
    • Focus on entries marked "yes" in the result column
    • Filter out entries marked "dont need In-project other contract" in the category column
    • Check the specific code in the business_flow_code column
    • Find the code location in the name column
🎯 Important Notes

  • Best suited for logic vulnerability mining in real projects
  • Not recommended for academic vulnerability testing
  • GPT-4-turbo recommended for best results
  • Average scan time for medium-sized projects: 2-3 hours
  • Estimated cost for 10 iterations on medium projects: $20-30
  • Current false positive rate: 30-65% (depends on project size)

🔍 Technical Notes

  1. Claude 3.5 Sonnet delivers better scanning results at an acceptable time cost; GPT-3 has not been fully tested
  2. The deceptive-prompt approach is adaptable to any language with minor modifications
  3. ANTLR AST parsing is recommended for better code-slicing results (see the sketch after this list)
  4. Solidity currently has the most complete support; broader language support is planned
  5. DeepSeek-R1 is recommended for better confirmation results
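
For note 3, the intended slicing can be sketched with the ANTLR Python runtime. This is not the engine's own code: it assumes antlr4-python3-runtime is installed and that SolidityLexer, SolidityParser, and SolidityListener have been generated from a Solidity .g4 grammar whose start rule is sourceUnit and which defines a functionDefinition rule; adjust the names to whatever grammar you actually generate.

# slice_functions.py - hedged sketch of ANTLR-based code slicing (note 3).
# Assumptions: antlr4-python3-runtime is installed; the SolidityLexer/
# SolidityParser/SolidityListener modules are ANTLR-generated from a Solidity
# grammar with a sourceUnit start rule and a functionDefinition rule.
from antlr4 import CommonTokenStream, FileStream, ParseTreeWalker

from SolidityLexer import SolidityLexer        # ANTLR-generated (assumption)
from SolidityParser import SolidityParser      # ANTLR-generated (assumption)
from SolidityListener import SolidityListener  # ANTLR-generated (assumption)


class FunctionSlicer(SolidityListener):
    """Collect the exact source text of every function definition."""

    def __init__(self, char_stream):
        self.char_stream = char_stream
        self.slices = []

    def enterFunctionDefinition(self, ctx):
        # Recover the function's source span from its character offsets.
        self.slices.append(
            self.char_stream.getText(ctx.start.start, ctx.stop.stop))


stream = FileStream("Example.sol", encoding="utf-8")
parser = SolidityParser(CommonTokenStream(SolidityLexer(stream)))
slicer = FunctionSlicer(stream)
ParseTreeWalker().walk(slicer, parser.sourceUnit())
print(f"{len(slicer.slices)} function slices extracted")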

🛡️ Scanning Features

  • Excels at code understanding and logic vulnerability detection
  • Weaker at control flow vulnerability detection
  • Designed for real projects, not academic test cases

💡 Implementation Tips

  • Progress is saved automatically for each scan
  • claude-3.5-sonnet offers the best scanning performance of the models tested
  • DeepSeek-R1 offers the best confirmation performance of the models tested
  • 10 iterations for medium projects take about 4 hours
  • Results include detailed categorization

📝 License

Apache License 2.0

🤝 Contributing

Pull Requests welcome!


Note: The project name is inspired by the Large Language Monkeys paper.
