chunkhound

Modern RAG for your codebase - semantic and regex search via MCP

Stars: 88

ChunkHound turns your codebase into a searchable knowledge base for AI assistants. It combines semantic search, built on the cAST chunking algorithm, with regex search, and exposes both to AI assistants through the Model Context Protocol (MCP). It is local-first, provides structured parsing for 22 languages, supports multi-hop semantic search across related code, and keeps its index up to date in real time as files change.

README:

ChunkHound

Tests License: MIT 100% AI Generated

Transform your codebase into a searchable knowledge base for AI assistants, using semantic search via the cAST algorithm together with regex search. ChunkHound integrates with AI assistants via the Model Context Protocol (MCP).

Features

  • cAST Algorithm - Research-backed semantic code chunking
  • Multi-Hop Semantic Search - Discovers interconnected code relationships beyond direct matches
  • Semantic search - Natural language queries like "find authentication code"
  • Regex search - Pattern matching without API keys
  • Local-first - Your code stays on your machine
  • 22 languages with structured parsing
    • Programming (via Tree-sitter): Python, JavaScript, TypeScript, JSX, TSX, Java, Kotlin, Groovy, C, C++, C#, Go, Rust, Bash, MATLAB, Makefile
    • Configuration (via Tree-sitter): JSON, YAML, TOML, Markdown
    • Text-based (custom parsers): Text files, PDF
  • MCP integration - Works with Claude, VS Code, Cursor, Windsurf, Zed, etc.

Documentation

Visit ofriw.github.io/chunkhound for complete guides.

Requirements

Installation

# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install ChunkHound
uv tool install chunkhound

Quick Start

Option 1: With Embeddings (Recommended)

  1. Create a .chunkhound.json file in the project root
{
  "embedding": {
    "provider": "openai",
    "api_key": "your-api-key-here"
  }
}
  2. Index your codebase
chunkhound index
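
If you would rather not paste the key into the file by hand, the config from step 1 can be generated by a small script. This is only a sketch: the `write_config` helper and the `OPENAI_API_KEY` variable name in the usage comment are assumptions for illustration, not part of ChunkHound. It writes only the two `embedding` fields shown above.

```python
import json
from pathlib import Path


def write_config(project_root: str, api_key: str, provider: str = "openai") -> Path:
    """Write a .chunkhound.json containing the embedding fields from the Quick Start.

    Hypothetical helper: ChunkHound only needs the resulting file to exist.
    """
    config = {"embedding": {"provider": provider, "api_key": api_key}}
    path = Path(project_root) / ".chunkhound.json"
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


# Example usage (assumes the key is exported as OPENAI_API_KEY):
#   import os
#   write_config(".", os.environ["OPENAI_API_KEY"])
```

Because the generated file contains a secret, you will probably want to keep it out of version control.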

Option 2: Without embeddings (regex search only)

chunkhound index --no-embeddings

For configuration, IDE setup, and advanced usage, see the documentation.

Real-Time Indexing

Automatic File Watching: MCP servers monitor your codebase and update the index automatically as you edit files. No manual re-indexing required.

Smart Content Diffs: Only changed code chunks get re-processed. Unchanged chunks keep their existing embeddings, making updates efficient even for large codebases.
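
One common way to get this kind of behavior is to identify each chunk by a hash of its content and embed only the chunks whose hash is new. The sketch below illustrates that idea under stated assumptions; it is not ChunkHound's actual implementation, and `chunk_key`, `plan_update`, the SHA-256 keying, and the dict-based index are all hypothetical.

```python
import hashlib


def chunk_key(text: str) -> str:
    """Content hash used as the chunk's identity (assumed scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_update(indexed: dict, new_chunks: list):
    """Split a file's freshly parsed chunks into reuse / embed / delete sets.

    indexed maps content hash -> stored embedding for the file's old chunks.
    """
    new_by_key = {chunk_key(c): c for c in new_chunks}
    # Unchanged chunks keep their existing embeddings.
    reuse = {k: indexed[k] for k in new_by_key if k in indexed}
    # Changed or newly added chunks are the only ones that need embedding.
    to_embed = [c for k, c in new_by_key.items() if k not in indexed]
    # Hashes that no longer appear in the file can be dropped from the index.
    to_delete = [k for k in indexed if k not in new_by_key]
    return reuse, to_embed, to_delete
```

With this scheme, editing one function in a large file produces a single entry in `to_embed`, while every other chunk lands in `reuse`.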

Seamless Branch Switching: When you switch git branches, ChunkHound automatically detects and re-indexes only the files that actually changed between branches.

Live Memory Systems: Index markdown notes or documentation that update in real time while you work, creating a dynamic knowledge base.

Why ChunkHound?

Research Foundation: Built on the cAST (Chunking via Abstract Syntax Trees) algorithm from Carnegie Mellon University, providing:

  • 4.3 point gain in Recall@5 on RepoEval retrieval
  • 2.67 point gain in Pass@1 on SWE-bench generation
  • Structure-aware chunking that preserves code meaning

Local-First Architecture:

  • Your code never leaves your machine
  • Works offline with Ollama local models
  • No per-token charges for large codebases

Universal Language Support:

  • Structured parsing for 22 languages (Tree-sitter + custom parsers)
  • Same semantic concepts across all programming languages

Intelligent Code Discovery:

  • Multi-hop search follows semantic relationships to find related implementations
  • Automatically discovers complete feature patterns: find "authentication" to get password hashing, token validation, session management
  • Convergence detection prevents semantic drift while maximizing discovery
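
The multi-hop idea can be sketched as a bounded expansion over embedding neighbors. The code below is an illustration under assumptions, not ChunkHound's algorithm: `multi_hop_search`, the `min_query_sim` threshold used as a drift guard, and the "stop when a hop adds nothing new" convergence rule are all hypothetical choices, and the index is a plain dict of chunk id to vector.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(vec, index: dict, k: int, exclude=()) -> list:
    """Ids of the k chunks most similar to vec, best first."""
    scored = [(cosine(vec, v), cid) for cid, v in index.items() if cid not in exclude]
    return [cid for _, cid in sorted(scored, reverse=True)[:k]]


def multi_hop_search(query_vec, index: dict, k: int = 2,
                     min_query_sim: float = 0.3, max_hops: int = 3) -> list:
    found = top_k(query_vec, index, k)  # hop 0: direct matches
    frontier = list(found)
    for _ in range(max_hops):
        new = []
        for cid in frontier:
            # Follow each result to its own nearest neighbors.
            for nid in top_k(index[cid], index, k, exclude={cid}):
                if nid in found or nid in new:
                    continue
                # Drift guard: a neighbor must still relate to the original query.
                if cosine(query_vec, index[nid]) >= min_query_sim:
                    new.append(nid)
        if not new:  # converged: this hop discovered nothing new
            break
        found.extend(new)
        frontier = new
    return found
```

In a toy index where a `login` chunk sits near `hash_password`, which in turn sits near `validate_token`, a query that directly matches only the first two would still reach the third on the next hop. Chunks unrelated to the query are filtered out by the threshold.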

License

MIT
