Auditor

Antidote to VibeCoding

Stars: 199

TheAuditor is an offline-first, AI-centric SAST and code intelligence platform designed to find security vulnerabilities, track data flow, analyze architecture, detect refactoring issues, run industry-standard tools, and produce AI-ready reports. It is tailored to AI-assisted development workflows, providing verifiable ground truth for developers and AI assistants. The tool orchestrates verifiable data, is built for AI consumption, and initially focuses on the Python and Node.js ecosystems while remaining extensible to other languages. Its analysis pipeline runs in three stages (foundation, concurrent analysis, and final aggregation) and offers features such as refactoring detection, dependency graph visualization, and optional insights analysis. Because it reads and writes vulnerability patterns, it can trigger antivirus software and run slower while the antivirus scans its files; the README is transparent about this and covers common issues and troubleshooting. TheAuditor aims to address the lack of ground truth in AI development workflows and to make AI development trustworthy by providing accurate security analysis and code verification.

README:

TheAuditor

Offline-First, AI-Centric SAST & Code Intelligence Platform

What TheAuditor Does

TheAuditor is a comprehensive code analysis platform that:

  • Finds Security Vulnerabilities: Detects OWASP Top 10, injection attacks, authentication issues, and framework-specific vulnerabilities
  • Tracks Data Flow: Follows untrusted data from sources to sinks to identify injection points
  • Analyzes Architecture: Builds dependency graphs, detects cycles, and measures code complexity
  • Detects Refactoring Issues: Identifies incomplete migrations, API contract mismatches, and cross-stack inconsistencies
  • Runs Industry-Standard Tools: Orchestrates ESLint, Ruff, MyPy, and other trusted linters
  • Produces AI-Ready Reports: Generates chunked, structured output optimized for LLM consumption

Unlike traditional SAST tools, TheAuditor is designed specifically for AI-assisted development workflows, providing ground truth that both developers and AI assistants can trust.
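
To make "follows untrusted data from sources to sinks" concrete, here is a minimal sketch of taint propagation over a list of assignments. It is an illustration only, not TheAuditor's implementation. The statement format and the names `SOURCES`, `SINKS`, and `find_tainted_sinks` are assumptions made up for this example.

```python
# Hypothetical, simplified model: each statement is (target, function, args).
SOURCES = {"request.args.get"}   # functions that return untrusted data (assumed)
SINKS = {"cursor.execute"}       # functions that must not receive it (assumed)


def find_tainted_sinks(statements):
    """Return (statement_index, sink, variable) for tainted values reaching a sink."""
    tainted = set()
    findings = []
    for index, (target, func, args) in enumerate(statements):
        tainted_args = [arg for arg in args if arg in tainted]
        if func in SINKS:
            findings.extend((index, func, arg) for arg in tainted_args)
        # A value is tainted if it comes from a source or is derived from tainted input.
        if target is not None:
            if func in SOURCES or tainted_args:
                tainted.add(target)
            else:
                tainted.discard(target)
    return findings


program = [
    ("user_id", "request.args.get", ["'id'"]),
    ("query", "str.format", ["template", "user_id"]),
    (None, "cursor.execute", ["query"]),
]
print(find_tainted_sinks(program))
```

A real analyzer works on parsed code and has to handle branches, function calls, and sanitizers. The sketch only shows the core bookkeeping: mark values from sources, propagate through assignments, and flag sinks.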

Quick Start

Step 1: Install TheAuditor (One-Time Setup)

# Clone TheAuditor to your tools directory (NOT your project!)
cd ~/tools  # or wherever you keep development tools
git clone https://github.com/TheAuditorTool/Auditor.git TheAuditor
cd TheAuditor

# Install using your SYSTEM Python (no venv needed!)
pip install -e .

# Verify installation
aud --version

Step 2: Analyze Your Project

# Navigate to YOUR PROJECT directory (not TheAuditor!)
cd ~/my-project

# Setup sandbox environment for THIS project
aud setup-claude --target .

# Run analysis
aud init   # First time only
aud full   # Complete security audit

# Check results
ls .pf/readthis/

Important Directory Structure:

  • ~/tools/TheAuditor/ - Where TheAuditor tool lives
  • ~/my-project/ - Your project being analyzed
  • ~/my-project/.auditor_venv/ - Sandbox created BY TheAuditor
  • ~/my-project/.pf/ - Analysis results

That's it! TheAuditor will analyze your codebase and generate AI-ready reports in .pf/readthis/.
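
If you want to consume the reports from a script rather than through an AI assistant, a small loader is enough. This is a sketch under assumptions: the README does not specify the file names or formats inside .pf/readthis/, so the code simply treats every file there as text. The function name `load_report_chunks` is made up for this example.

```python
from pathlib import Path


def load_report_chunks(project_root="."):
    """Return {file name: text} for every file under .pf/readthis/, sorted by name."""
    report_dir = Path(project_root) / ".pf" / "readthis"
    if not report_dir.is_dir():
        raise FileNotFoundError(f"{report_dir} not found; run `aud full` first")
    chunks = {}
    for path in sorted(report_dir.iterdir()):
        if path.is_file():
            # errors="replace" keeps the loop going if a file is not valid UTF-8.
            chunks[path.name] = path.read_text(encoding="utf-8", errors="replace")
    return chunks
```

Calling `load_report_chunks("~/my-project")` would need the path expanded first (for example with `Path.expanduser()`); the default argument reads from the current directory.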

How It Works With ANY AI Assistant

Universal Integration: Just tell your AI assistant to run aud full and read the results from .pf/readthis/. No SDK, no integration, no setup - it just works with Claude, Cursor, Windsurf, Copilot, or any future AI tool that can run commands and read files.

The Solution: TheAuditor

TheAuditor is the antidote. It was built to stop "vibe coding" your way into security and quality assurance nightmares. Its mission is to provide an incorruptible source of ground truth for both the developer and their AI assistant.

Its philosophy is a direct rejection of the current trend:

  • It Orchestrates Verifiable Data. The tool runs a suite of industry-standard linters and security scanners, preserving the raw, unfiltered output from each. It does not summarize or interpret this core data.
  • It's Built for AI Consumption. The tool's primary engineering challenge is to adapt this raw truth into structured, AI-digestible chunks. It ensures the AI works with facts, not faulty summaries.
  • It's Focused and Extensible. The initial focus is on Python and the Node.js ecosystem, but the modular, pattern-based architecture is designed to invite contributions for other languages and frameworks.
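
The "structured, AI-digestible chunks" idea can be sketched in a few lines. This is not TheAuditor's chunker: the function name `chunk_lines` and the 4000-character default are assumptions for illustration. The point it demonstrates is that chunking can split raw tool output without rewriting or summarizing it.

```python
def chunk_lines(raw_output, max_chars=4000):
    """Split text into chunks of at most max_chars, breaking only at line ends.

    Content is never rewritten: joining the chunks reproduces the input exactly.
    A single line longer than max_chars is kept whole in its own chunk.
    """
    chunks, current, size = [], [], 0
    for line in raw_output.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the limit.
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Because the chunks concatenate back to the original text, a reader (human or LLM) can always verify that nothing was dropped or paraphrased along the way.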

TheAuditor is not a replacement for a formal third-party audit. It is an engineering tool designed to catch the vast majority of glaring issues—from the OWASP Top 10 to common framework anti-patterns. Its core commitment is to never cross the line from verifiable truth into semantic interpretation.

Every AI assistant - Claude Code, Cursor, Windsurf, Copilot - they're all blind. They can write code but can't verify it's secure, correct, or complete. TheAuditor gives them eyes.

Why This Matters

  1. Tool Agnostic - Works with ANY AI assistant or IDE

    • aud full from any terminal
    • Results in .pf/readthis/ ready for any LLM
  2. AI Becomes Self-Correcting

    • AI writes code
    • AI runs aud full
    • AI reads the ground truth
    • AI fixes its own mistakes
    • Recursive loop until actually correct
  3. No Human Intervention Required

    • You never touch the terminal
    • The AI runs everything
    • You just review and approve

The Genius Architecture

Human: "Add authentication to my app"
    ↓
AI: *writes auth code*
    ↓
AI: `aud full`
    ↓
AI: *reads .pf/readthis/*
    ↓
AI: "Found 3 security issues, fixing..."
    ↓
AI: *fixes issues*
    ↓
AI: `aud full`
    ↓
AI: "Clean. Authentication complete."
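
The loop in the diagram can be sketched as a small driver. This is a sketch under assumptions, not part of TheAuditor: `has_issues` and `apply_fixes` are hypothetical callbacks standing in for the AI assistant, and `max_rounds` is a guard added here so the loop always terminates. The only parts taken from the README are the `aud full` command and the .pf/readthis/ directory.

```python
import subprocess
from pathlib import Path


def run_aud_full(project_root):
    """Run `aud full` in the project directory and return its exit code."""
    return subprocess.run(["aud", "full"], cwd=project_root).returncode


def read_findings(project_root):
    """Collect the text of every report file under .pf/readthis/."""
    report_dir = Path(project_root) / ".pf" / "readthis"
    if not report_dir.is_dir():
        return []
    return [path.read_text(encoding="utf-8", errors="replace")
            for path in sorted(report_dir.iterdir()) if path.is_file()]


def audit_fix_loop(project_root, has_issues, apply_fixes,
                   run_audit=run_aud_full, max_rounds=5):
    """Audit, fix, and re-audit until the reports look clean or rounds run out.

    has_issues(findings) -> bool and apply_fixes(findings) are hypothetical
    callbacks; the README does not define how "clean" is signalled.
    Returns the number of fix rounds performed, or None if still not clean.
    """
    for round_number in range(max_rounds + 1):
        run_audit(project_root)
        findings = read_findings(project_root)
        if not has_issues(findings):
            return round_number
        if round_number < max_rounds:
            apply_fixes(findings)
    return None
```

Passing `run_audit` as a parameter keeps the driver testable without the `aud` binary installed, and returning `None` after `max_rounds` makes a non-converging loop visible to whoever reviews the result.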

Market Reality Check

Every developer using AI assistants has this problem:

  • AI writes insecure code
  • AI introduces bugs
  • AI doesn't see the full picture
  • AI can't verify its work

TheAuditor solves ALL of this. It's not a "nice to have" - it's the missing piece that makes AI development actually trustworthy.

I've built the tool that makes AI assistants production-ready. This isn't competing with SonarQube/Semgrep. This is creating an entirely new category: AI Development Verification Tools.

The Search for Ground Truth in an Age of AI

My background is in systems architecture/infrastructure, not professional software development. I have only been "coding/developing" for a little over 3 months. This gives me a unique perspective: I can see the forest, but I'm blind to the individual trees of the code. After immersing myself for 500+ hours in AI-assisted development, I concluded that the entire ecosystem is built on a fundamentally flawed premise: it lacks a source of ground truth.

From start to launch on GitHub took me about a month across 250 active hours in front of the computer, for anyone that wonders or cares :P

The Problem: A Cascade of Corrupted Context

Most AI development tools try to solve the wrong problem. They focus on perfecting the input—better prompts, more context—but they ignore the critical issue of compounding deviation.

An LLM is a powerful statistical engine, but it doesn't understand. The modern AI workflow forces this engine to play a high-stakes game of "telephone," where the original intent is corrupted at every step:

  1. A human has an idea.
  2. An AI refines it into a prompt.
  3. Other tools add their own interpretive layers.
  4. The primary AI assistant (e.g., Claude Opus) interprets the final, distorted prompt to generate code.

As a rookie "developer," the only thing I could trust was the raw output: the code and its errors. In a vacuum of deep programming knowledge, these facts were my only anchors.

This architectural flaw is amplified by two dangerous behaviours inherent to AI assistants:

  • Security Theater: AI assistants are optimized to "make it work," which often means introducing rampant security anti-patterns like hardcoded credentials, disabled authentication, and the pervasive use of as any in TypeScript. This creates a dangerous illusion of progress.
  • Context Blindness: With aggressive context compaction, an AI never sees the full picture. It works with fleeting snapshots of code, forcing it to make assumptions instead of decisions based on facts.

The Comprehensive Analysis Pipeline

TheAuditor runs a comprehensive audit through multiple analysis phases organized into three stages. The first and last stages run sequentially; the middle stage runs three tracks in parallel:

STAGE 1: Foundation (Sequential)

  1. Index Repository - Build complete code inventory and SQLite database
  2. Detect Frameworks - Identify Django, Flask, React, Vue, etc.

STAGE 2: Concurrent Analysis (3 parallel tracks)

Track A - Network Operations:

  3. Check Dependencies - Analyze package versions and known vulnerabilities
  4. Fetch Documentation - Extract docstrings and comments
  5. Summarize Documentation - Create AI-readable documentation chunks

Track B - Code Analysis:

  6. Create Workset - Identify all source files for analysis
  7. Run Linting - Execute Ruff, MyPy, ESLint as configured
  8. Detect Patterns - Apply 100+ security pattern rules

Track C - Graph & Flow:

  9. Build Graph - Create dependency graph structure
  10. Analyze Graph - Find cycles, measure complexity
  11. Visualize Graph - Generate multiple graph views
  12. Taint Analysis - Track data flow from sources to sinks

STAGE 3: Final Aggregation (Sequential)

  13. Factual Correlation Engine - Cross-reference findings across all tools
  14. Generate Report - Produce final AI-consumable chunks in .pf/readthis/
  15. Summary Generation - Create executive summary of findings
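
The stage layout above (sequential foundation, parallel tracks, sequential aggregation) can be sketched with the standard library. This is an illustration of the shape only. Whether TheAuditor uses threads, processes, or something else is not stated in the README, and `run_pipeline`, `run_track`, and the stand-in steps are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_track(steps, context):
    """Run one track's steps in order; each step reads the shared context."""
    return [step(context) for step in steps]


def run_pipeline(foundation, tracks, aggregation):
    """Sequential foundation, parallel tracks, then sequential aggregation."""
    # Stage 1: each foundation step runs in order and adds to the context.
    context = {}
    for step in foundation:
        context.update(step(context))

    # Stage 2: tracks run concurrently; steps inside a track stay ordered.
    with ThreadPoolExecutor(max_workers=max(1, len(tracks))) as pool:
        futures = {name: pool.submit(run_track, steps, context)
                   for name, steps in tracks.items()}
        results = {name: future.result() for name, future in futures.items()}

    # Stage 3: aggregation steps see every track's results.
    report = {"context": context, "tracks": results}
    for step in aggregation:
        report = step(report)
    return report


# Stand-in steps; the real phases are the fifteen listed above.
report = run_pipeline(
    foundation=[lambda ctx: {"files": ["app.py"]},
                lambda ctx: {"frameworks": ["flask"]}],
    tracks={
        "A": [lambda ctx: "dependencies checked"],
        "B": [lambda ctx: f"linted {len(ctx['files'])} file(s)"],
        "C": [lambda ctx: "graph built"],
    },
    aggregation=[lambda rep: {**rep, "summary": f"{len(rep['tracks'])} tracks finished"}],
)
print(report["summary"])
```

Stage 3 only starts after every `future.result()` call has returned, which is what lets the correlation step cross-reference findings from all three tracks.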

Key Features

Refactoring Detection & Analysis

TheAuditor detects incomplete refactorings and cross-stack inconsistencies using correlation rules:

# Analyze refactoring impact
aud refactor --file models/Product.ts --line 42

# Auto-detect from migrations
aud refactor --auto-detect

# Analyze workset
aud refactor --workset --output refactor_report.json

Detects:

  • Data Model Changes: Fields moved between tables
  • API Contract Mismatches: Frontend/backend inconsistencies
  • Foreign Key Updates: Incomplete reference changes
  • Cross-Stack Issues: TypeScript interfaces not matching models

Users can define custom rules in /correlations/rules/ to detect project-specific patterns; an example is provided in refactoring.yaml.
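
As a rough picture of what a cross-stack check compares, the sketch below diffs the field names of a backend model against a frontend interface. It is not the correlation rule format (the README does not show the contents of refactoring.yaml). The function `compare_fields` and the example field lists are hypothetical, and a real check would first have to extract those fields from the code.

```python
def compare_fields(model_fields, interface_fields):
    """Report fields present on only one side of a model/interface pair."""
    model, interface = set(model_fields), set(interface_fields)
    return {
        "missing_in_interface": sorted(model - interface),
        "missing_in_model": sorted(interface - model),
    }


# Example: `price` was moved off Product on the backend,
# but the frontend interface still expects it.
backend_product = ["id", "name", "variant_id"]
frontend_product = ["id", "name", "price"]
print(compare_fields(backend_product, frontend_product))
```

A non-empty `missing_in_model` list is the typical signature of an incomplete migration: the frontend still reads a field the backend no longer provides.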

Dependency Graph Visualization

TheAuditor now includes rich visual intelligence for dependency graphs using Graphviz:

  • Multiple View Modes: Full graph, cycles-only, hotspots, architectural layers, impact analysis
  • Visual Intelligence Encoding:
    • Node colors indicate programming language (Python=blue, JS=yellow, TypeScript=blue)
    • Node size shows importance based on connectivity
    • Red highlighting for dependency cycles
    • Border thickness encodes code churn
  • Actionable Insights: Focus on what matters with filtered views
  • AI-Readable Output: Generate SVG visualizations that LLMs can analyze

# Basic visualization
aud graph viz

# Show only dependency cycles
aud graph viz --view cycles --include-analysis

# Top 5 hotspots with connections
aud graph viz --view hotspots --top-hotspots 5

# Architectural layers visualization
aud graph viz --view layers --format svg

# Impact analysis for a specific file
aud graph viz --view impact --impact-target "src/auth.py"
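
To show how cycle highlighting and connectivity-based sizing can be encoded, here is a small sketch that finds nodes on dependency cycles and emits Graphviz DOT text. It is not the code behind `aud graph viz`. The function names and the sizing formula are assumptions, and language colors and churn-based borders are left out.

```python
def find_cycle_nodes(edges):
    """Return the set of nodes that lie on at least one dependency cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    def reachable(start):
        seen, stack = set(), list(graph[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        return seen

    # A node is on a cycle exactly when it can reach itself.
    return {node for node in graph if node in reachable(node)}


def to_dot(edges):
    """Render edges as DOT: cycle nodes in red, width scaled by connectivity."""
    on_cycle = find_cycle_nodes(edges)
    degree = {}
    for src, dst in edges:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    lines = ["digraph deps {"]
    for node in sorted(degree):
        color = "red" if node in on_cycle else "black"
        width = 0.5 + 0.25 * degree[node]  # assumed scaling, for illustration
        lines.append(f'  "{node}" [color={color}, width={width:.2f}];')
    for src, dst in sorted(edges):
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


edges = [("a.py", "b.py"), ("b.py", "a.py"), ("a.py", "c.py")]
print(to_dot(edges))
```

Saving the output to a file such as deps.dot and running `dot -Tsvg deps.dot -o deps.svg` renders it with Graphviz. The per-node reachability search is quadratic in the worst case, which is fine for a sketch but not for a large repository.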

Insights Analysis (Optional)

Separate from the core Truth Courier modules, TheAuditor offers optional Insights for technical scoring:

# Run insights analysis on existing audit data
aud insights --mode all

# ML-powered insights (requires: pip install -e ".[ml]")
aud insights --mode ml --ml-train

# Graph health metrics and recommendations
aud insights --mode graph

# Generate comprehensive insights report
aud insights --output insights_report.json

Insights modules add interpretive scoring on top of factual data:

  • Health Scores: Architecture quality metrics
  • Severity Classification: Risk assessment beyond raw findings
  • Recommendations: Actionable improvement suggestions
  • ML Predictions: Pattern-based issue prediction

Important: Antivirus Software Interaction

Why TheAuditor Triggers Antivirus Software

TheAuditor is a security scanner that identifies vulnerabilities in your code. By its very nature, it must:

  1. Read and analyze security vulnerabilities - SQL injection, XSS attacks, hardcoded passwords
  2. Write these findings to disk - Creating reports with exact code snippets as evidence
  3. Process files rapidly - Scanning entire codebases in parallel for efficiency

This creates an inherent conflict with antivirus software, which sees these exact same behaviours as potentially malicious. When TheAuditor finds and documents a SQL injection vulnerability in your code, your antivirus sees us writing "malicious SQL injection patterns" to disk - because that's literally what we're doing, just for legitimate security analysis purposes.

Performance Impact You May Experience

When running TheAuditor, you may notice:

  • Increased antivirus CPU usage - Your AV will scan every file we read AND every finding we write
  • Approximately 10-50% performance reduction, depending on your antivirus software - Both TheAuditor and your AV are reading the same files simultaneously
  • Occasional delays or pauses - Your AV may temporarily quarantine our output files for deeper inspection

This is not a bug or inefficiency in TheAuditor - it's the unavoidable consequence of two security tools doing their jobs simultaneously.

Our Stance on Antivirus

We do NOT recommend:

  • ❌ Disabling your antivirus software
  • ❌ Adding TheAuditor to your exclusion/whitelist
  • ❌ Reducing your system's security in any way

Your antivirus is correctly identifying that we're writing security vulnerability patterns to disk. That's exactly what we do - we find vulnerabilities and document them. The fact that your AV is suspicious of this behavior means it's working properly.

What We've Done to Minimize Impact

  1. Intelligent resource management - We automatically reduce parallel workers when system resources are constrained
  2. Pattern defanging - We insert invisible characters into dangerous patterns to reduce false positives
  3. Adaptive performance - We monitor CPU and RAM usage to avoid overwhelming your system
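
Two of the measures above can be sketched briefly. This is a guess at the general technique, not TheAuditor's code: the README says "invisible characters" without naming them, so the zero-width space used here is an assumption, as are the function names and the load threshold.

```python
import os

ZERO_WIDTH_SPACE = "\u200b"  # assumed choice of invisible character


def defang(snippet, patterns):
    """Insert a zero-width space after the first character of each listed pattern.

    The text looks unchanged to a reader but no longer matches byte-for-byte.
    """
    for pattern in patterns:
        if len(pattern) > 1:
            snippet = snippet.replace(
                pattern, pattern[0] + ZERO_WIDTH_SPACE + pattern[1:])
    return snippet


def refang(snippet):
    """Undo defang() by removing the zero-width spaces."""
    return snippet.replace(ZERO_WIDTH_SPACE, "")


def pick_worker_count(max_workers=8, load_per_cpu_limit=0.75):
    """Halve the worker count when the 1-minute load per CPU is above the limit."""
    cpus = os.cpu_count() or 1
    workers = max(1, min(max_workers, cpus))
    try:
        load = os.getloadavg()[0] / cpus
    except (AttributeError, OSError):
        return workers  # os.getloadavg is not available on Windows
    return max(1, workers // 2) if load > load_per_cpu_limit else workers
```

Note that `refang` would also strip zero-width spaces that were already present in the original code, so a production version would need a more careful encoding.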

The Industry Reality

This is not a problem unique to TheAuditor. Every legitimate security scanner faces this same issue:

  • GitHub Advanced Security runs in isolated cloud containers to avoid this
  • Commercial SAST tools require enterprise AV exceptions
  • Popular scanners explicitly document AV conflicts in their installation guides

The fundamental paradox: A tool that finds security vulnerabilities must write those vulnerabilities to disk, which makes it indistinguishable from malware to an antivirus. There is no technical solution to this - it's the inherent nature of security analysis tools.

What This Means for You

  • Run TheAuditor when system load is low for best performance
  • Expect the analysis to take longer than the raw processing time due to AV overhead
  • If your AV quarantines output files in .pf/, you may need to restore them manually
  • Consider running TheAuditor in a controlled environment if performance is critical

We believe in complete transparency about these limitations. This interaction with antivirus software is not a flaw in TheAuditor - it's proof that both your AV and our scanner are doing exactly what they're designed to do: identify and handle potentially dangerous code patterns.

Common Issues & Troubleshooting

"No such file or directory: .pf/manifest.json"

  • Cause: Running aud init on a fresh project
  • Fix: Update TheAuditor and reinstall:
    cd ~/tools/TheAuditor
    git pull
    pip install -e .

"Tree-sitter not available" warning

  • Cause: Missing AST analysis tools
  • Fix: Reinstall the sandbox in your project:
    cd ~/my-project
    rm -rf .auditor_venv
    aud setup-claude --target .

Installation timeout errors

  • Cause: Slow compilation of C extensions
  • Fix: Update TheAuditor or manually install:
    cd ~/my-project
    .auditor_venv/bin/pip install tree-sitter tree-sitter-language-pack

Nested virtual environments

  • Issue: Created your own venv before installing
  • Fix: Exit all venvs and use system Python:
    deactivate  # Exit any active venv
    cd ~/tools/TheAuditor
    pip install -e .  # Use system pip
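
If you are not sure whether a virtual environment is still active, the interpreter can tell you. The check below uses the standard `sys.prefix` / `sys.base_prefix` comparison; the function name `in_virtualenv` is just a label for this example.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv or virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if in_virtualenv():
    print(f"Active virtual environment: {sys.prefix}. Run `deactivate` before installing.")
else:
    print(f"Using system Python at {sys.prefix}.")
```

Run it with the same `python` that your `pip` belongs to, before the `pip install -e .` step.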

Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • How to add new language support
  • Creating security patterns
  • Adding framework-specific rules
  • Development guidelines

We especially need help with:

  • GraphQL analysis
  • Java/Spring support
  • Go patterns
  • Ruby on Rails detection
  • C#/.NET analysis

License

AGPL-3.0

Commercial Licensing

TheAuditor is AGPL-3.0 licensed. For commercial use, SaaS deployment, or integration into proprietary systems, please contact via GitHub for licensing options.

Support

For issues, questions, or feature requests, please open an issue on our GitHub repository.


TheAuditor: Bringing ground truth to AI-assisted development
