
mcp-documentation-server
MCP Documentation Server - Bridge the AI Knowledge Gap. ✨ Features: Document management • Gemini integration • AI-powered semantic search • File uploads • Smart chunking • Multilingual support • Zero-setup 🎯 Perfect for: New frameworks • API docs • Internal guides
Stars: 205

The mcp-documentation-server is a local-first Model Context Protocol (MCP) server for document management and semantic search. It lets teams add documents (including .txt, .md, and .pdf files), search them with embedding-based semantic queries, and optionally analyze them with Google Gemini AI, all stored on disk with no external database. With mcp-documentation-server, teams can make project documentation directly searchable by MCP clients such as Claude Desktop.
README:
A TypeScript-based Model Context Protocol (MCP) server that provides local-first document management and semantic search using embeddings. The server exposes a collection of MCP tools and is optimized for performance with on-disk persistence, an in-memory index, and caching.
NEW! Enhanced with Google Gemini AI for advanced document analysis and contextual understanding. Ask complex questions and get intelligent summaries, explanations, and insights from your documents. To get an API key, go to Google AI Studio.
- Intelligent Document Analysis: Gemini AI understands context, relationships, and concepts
- Natural Language Queries: Ask a question, not just keywords
- Smart Summarization: Get comprehensive overviews and explanations
- Contextual Insights: Understand how different parts of your documents relate
- File Mapping Cache: Avoid re-uploading the same files to Gemini for efficiency
- AI-Powered Search 🤖: Advanced document analysis with Gemini AI for contextual understanding and intelligent insights
- Traditional Semantic Search: Chunk-based search using embeddings plus in-memory keyword index
- Context Window Retrieval: Gather surrounding chunks for richer LLM answers
- O(1) document lookup and keyword index through `DocumentIndex` for instant retrieval
- LRU `EmbeddingCache` to avoid recomputing embeddings and speed up repeated queries (a minimal sketch follows this list)
- Parallel chunking and batch processing to accelerate ingestion of large documents
- Streaming file reader to process large files without high memory usage
- Intelligent file handling: copy-based storage with automatic backup preservation
- Complete deletion: removes both JSON files and associated original files
- Local-only storage: no external database required. All data resides in `~/.mcp-documentation-server/`
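To make the caching idea concrete, here is a minimal TypeScript sketch of an LRU embedding cache. The class name and shape are assumptions for illustration, not the server's actual `EmbeddingCache` implementation:

```typescript
// Hypothetical LRU cache keyed by input text; illustrates the eviction
// policy only, not the project's real EmbeddingCache.
class LruEmbeddingCache {
  private cache = new Map<string, number[]>();

  constructor(private maxSize = 1000) {} // mirrors the MCP_CACHE_SIZE default

  get(text: string): number[] | undefined {
    const hit = this.cache.get(text);
    if (hit !== undefined) {
      // Re-insert so this entry becomes the most recently used.
      this.cache.delete(text);
      this.cache.set(text, hit);
    }
    return hit;
  }

  set(text: string, embedding: number[]): void {
    if (this.cache.has(text)) this.cache.delete(text);
    this.cache.set(text, embedding);
    if (this.cache.size > this.maxSize) {
      // Maps iterate in insertion order, so the first key is the oldest.
      const oldest = this.cache.keys().next().value as string;
      this.cache.delete(oldest);
    }
  }
}
```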
Example configuration for an MCP client (e.g., Claude Desktop):
```json
{
  "mcpServers": {
    "documentation": {
      "command": "npx",
      "args": ["-y", "@andrea9293/mcp-documentation-server"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here",
        "MCP_EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2"
      }
    }
  }
}
```
`GEMINI_API_KEY` is optional; it enables AI-powered search.
- Add documents using the `add_document` tool, or by placing `.txt`, `.md`, or `.pdf` files into the uploads folder and calling `process_uploads`.
- Search documents with `search_documents` to get ranked chunk hits (a ranking sketch follows this list).
- Use `get_context_window` to fetch neighboring chunks and provide LLMs with richer context.
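As a rough picture of how ranked chunk hits can be produced, the sketch below scores chunks by cosine similarity against the query embedding. The `Chunk` shape and the ranking function are assumptions for illustration, not the project's actual code:

```typescript
interface Chunk {
  index: number;
  text: string;
  embedding: number[];
}

// Standard cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk once, then return the top `limit` hits.
function rankChunks(queryEmbedding: number[], chunks: Chunk[], limit = 5): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.chunk);
}
```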
The server exposes several tools (validated with Zod schemas) for document lifecycle and search:
- `add_document` — Add a document (title, content, metadata)
- `list_documents` — List stored documents and metadata
- `get_document` — Retrieve a full document by id
- `delete_document` — Remove a document, its chunks, and associated original files
- `process_uploads` — Convert files in the uploads folder into documents (chunking + embeddings + backup preservation)
- `get_uploads_path` — Return the absolute uploads folder path
- `list_uploads_files` — List files in the uploads folder
- `search_documents_with_ai` — 🤖 AI-powered search using Gemini for advanced document analysis (requires `GEMINI_API_KEY`)
- `search_documents` — Semantic search within a document (returns chunk hits and an LLM hint)
- `get_context_window` — Return a window of chunks around a target chunk index
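Since tool inputs are validated with Zod schemas, an illustrative schema for `search_documents` might look like the following. The field names are taken from the usage examples below; the project's actual schema may differ:

```typescript
import { z } from "zod";

// Assumed schema for search_documents arguments (illustrative only).
const searchDocumentsSchema = z.object({
  document_id: z.string(),
  query: z.string().min(1),
  limit: z.number().int().positive().default(5),
});

type SearchDocumentsArgs = z.infer<typeof searchDocumentsSchema>;
```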
Configure behavior via environment variables. Important options:
- `MCP_EMBEDDING_MODEL` — embedding model name (default: `Xenova/all-MiniLM-L6-v2`). Changing the model requires re-adding documents.
- `GEMINI_API_KEY` — Google Gemini API key for AI-powered search features (optional; enables `search_documents_with_ai`).
- `MCP_INDEXING_ENABLED` — enable/disable the `DocumentIndex` (true/false). Default: `true`.
- `MCP_CACHE_SIZE` — LRU embedding cache size (integer). Default: `1000`.
- `MCP_PARALLEL_ENABLED` — enable parallel chunking (true/false). Default: `true`.
- `MCP_MAX_WORKERS` — number of parallel workers for chunking/indexing. Default: `4`.
- `MCP_STREAMING_ENABLED` — enable streaming reads for large files. Default: `true`.
- `MCP_STREAM_CHUNK_SIZE` — streaming buffer size in bytes. Default: `65536` (64 KB).
- `MCP_STREAM_FILE_SIZE_LIMIT` — file size threshold in bytes above which the streaming path is used. Default: `10485760` (10 MB).
Example `.env` (defaults applied when variables are not set):
```bash
MCP_INDEXING_ENABLED=true            # Enable O(1) indexing (default: true)
GEMINI_API_KEY=your-api-key-here     # Google Gemini API key (optional)
MCP_CACHE_SIZE=1000                  # LRU cache size (default: 1000)
MCP_PARALLEL_ENABLED=true            # Enable parallel processing (default: true)
MCP_MAX_WORKERS=4                    # Parallel worker count (default: 4)
MCP_STREAMING_ENABLED=true           # Enable streaming (default: true)
MCP_STREAM_CHUNK_SIZE=65536          # Stream chunk size (default: 64 KB)
MCP_STREAM_FILE_SIZE_LIMIT=10485760  # Streaming threshold (default: 10 MB)
```
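A minimal sketch of how such defaults could be resolved at startup, assuming Node's `process.env`; the helper names are hypothetical:

```typescript
// Parse a boolean env var, falling back to a default when unset.
function envBool(name: string, fallback: boolean): boolean {
  const raw = process.env[name];
  return raw === undefined ? fallback : raw.toLowerCase() === "true";
}

// Parse an integer env var, falling back to a default when unset or invalid.
function envInt(name: string, fallback: number): number {
  const parsed = Number.parseInt(process.env[name] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

const config = {
  embeddingModel: process.env.MCP_EMBEDDING_MODEL ?? "Xenova/all-MiniLM-L6-v2",
  indexingEnabled: envBool("MCP_INDEXING_ENABLED", true),
  cacheSize: envInt("MCP_CACHE_SIZE", 1000),
  parallelEnabled: envBool("MCP_PARALLEL_ENABLED", true),
  maxWorkers: envInt("MCP_MAX_WORKERS", 4),
  streamingEnabled: envBool("MCP_STREAMING_ENABLED", true),
  streamChunkSize: envInt("MCP_STREAM_CHUNK_SIZE", 65536),
  streamFileSizeLimit: envInt("MCP_STREAM_FILE_SIZE_LIMIT", 10485760),
};
```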
Default storage layout (data directory):
```
~/.mcp-documentation-server/
├── data/      # Document JSON files
└── uploads/   # Drop files (.txt, .md, .pdf) to import
```
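In code, that layout corresponds to paths like these (an assumed resolution, shown for orientation):

```typescript
import * as os from "node:os";
import * as path from "node:path";

const baseDir = path.join(os.homedir(), ".mcp-documentation-server");
const dataDir = path.join(baseDir, "data");       // document JSON files
const uploadsDir = path.join(baseDir, "uploads"); // files dropped for import
```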
Add a document via MCP tool:
```json
{
  "tool": "add_document",
  "arguments": {
    "title": "Python Basics",
    "content": "Python is a high-level programming language...",
    "metadata": {
      "category": "programming",
      "tags": ["python", "tutorial"]
    }
  }
}
```
Search a document:
```json
{
  "tool": "search_documents",
  "arguments": {
    "document_id": "doc-123",
    "query": "variable assignment",
    "limit": 5
  }
}
```
Advanced Analysis (requires `GEMINI_API_KEY`):
```json
{
  "tool": "search_documents_with_ai",
  "arguments": {
    "document_id": "doc-123",
    "query": "explain the main concepts and their relationships"
  }
}
```
Complex Questions:
```json
{
  "tool": "search_documents_with_ai",
  "arguments": {
    "document_id": "doc-123",
    "query": "what are the key architectural patterns and how do they work together?"
  }
}
```
Summarization Requests:
```json
{
  "tool": "search_documents_with_ai",
  "arguments": {
    "document_id": "doc-123",
    "query": "summarize the core principles and provide examples"
  }
}
```
Fetch context window:
```json
{
  "tool": "get_context_window",
  "arguments": {
    "document_id": "doc-123",
    "chunk_index": 5,
    "before": 2,
    "after": 2
  }
}
```
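The windowing logic itself is simple; a hypothetical sketch (not the server's actual implementation) clamps the requested range to the document bounds:

```typescript
// Return the chunks from (chunkIndex - before) to (chunkIndex + after),
// clamped to the valid range of the document's chunk array.
function getContextWindow<T>(chunks: T[], chunkIndex: number, before: number, after: number): T[] {
  const start = Math.max(0, chunkIndex - before);
  const end = Math.min(chunks.length, chunkIndex + after + 1);
  return chunks.slice(start, end);
}
```

With `chunk_index: 5, before: 2, after: 2`, this would return chunks 3 through 7.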
Example query styles for AI-powered search:
- Complex Questions: "How do these concepts relate to each other?"
- Summarization: "Give me an overview of the main principles"
- Analysis: "What are the key patterns and their trade-offs?"
- Explanation: "Explain this topic as if I were new to it"
- Comparison: "Compare these different approaches"
Benefits of AI-powered search:
- Smart Caching: File mapping prevents re-uploading the same content
- Efficient Processing: Only relevant sections are analyzed by Gemini
- Contextual Results: More accurate and comprehensive answers
- Natural Interaction: Ask questions in plain English
Embedding models are downloaded on first use; some models require several hundred MB of downloads.
-
The
DocumentIndex
persists an index file and can be rebuilt if necessary. -
The
EmbeddingCache
can be warmed by callingprocess_uploads
, issuing curated queries, or using a preload API when available.
Available embedding models, set via the `MCP_EMBEDDING_MODEL` environment variable:
- `Xenova/all-MiniLM-L6-v2` (default) - Fast, good quality (384 dimensions)
- `Xenova/paraphrase-multilingual-mpnet-base-v2` (recommended) - Best quality, multilingual (768 dimensions)

The system automatically manages the correct embedding dimension for each model. Embedding providers expose their dimension via `getDimensions()`.
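A sketch of what such a provider contract might look like; only `getDimensions()` is named in the README, so `embed()` and the validation helper are assumptions:

```typescript
interface EmbeddingProvider {
  getDimensions(): number; // 384 for all-MiniLM-L6-v2, 768 for the mpnet model
  embed(text: string): Promise<number[]>;
}

// Guard against mixing embeddings produced by different models.
function assertDimension(provider: EmbeddingProvider, embedding: number[]): void {
  const expected = provider.getDimensions();
  if (embedding.length !== expected) {
    throw new Error(
      `Got ${embedding.length} dimensions, expected ${expected}; ` +
        "re-add documents after changing MCP_EMBEDDING_MODEL.",
    );
  }
}
```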
```bash
git clone https://github.com/andrea9293/mcp-documentation-server.git
cd mcp-documentation-server
npm run dev
npm run build
npm run inspect
```
- Fork the repository
- Create a feature branch: `git checkout -b feature/name`
- Follow Conventional Commits for messages
- Open a pull request
MIT - see LICENSE file
Built with FastMCP and TypeScript 🚀
Alternative AI tools for mcp-documentation-server
Similar Open Source Tools


mcp-omnisearch
mcp-omnisearch is a Model Context Protocol (MCP) server that acts as a unified gateway to multiple search providers and AI tools. It integrates Tavily, Perplexity, Kagi, Jina AI, Brave, Exa AI, and Firecrawl to offer a wide range of search, AI response, content processing, and enhancement features through a single interface. The server provides powerful search capabilities, AI response generation, content extraction, summarization, web scraping, structured data extraction, and more. It is designed to work flexibly with the API keys available, enabling users to activate only the providers they have keys for and easily add more as needed.

memento-mcp
Memento MCP is a scalable, high-performance knowledge graph memory system designed for LLMs. It offers semantic retrieval, contextual recall, and temporal awareness to any LLM client supporting the model context protocol. The system is built on core concepts like entities and relations, utilizing Neo4j as its storage backend for unified graph and vector search capabilities. With advanced features such as semantic search, temporal awareness, confidence decay, and rich metadata support, Memento MCP provides a robust solution for managing knowledge graphs efficiently and effectively.

supergateway
Supergateway is a tool that allows running MCP stdio-based servers over SSE (Server-Sent Events) with one command. It is useful for remote access, debugging, or connecting to SSE-based clients when your MCP server only speaks stdio. The tool supports running in SSE to Stdio mode as well, where it connects to a remote SSE server and exposes a local stdio interface for downstream clients. Supergateway can be used with ngrok to share local MCP servers with remote clients and can also be run in a Docker containerized deployment. It is designed with modularity in mind, ensuring compatibility and ease of use for AI tools exchanging data.

code_puppy
Code Puppy is an AI-powered code generation agent designed to understand programming tasks, generate high-quality code, and explain its reasoning. It supports multi-language code generation, interactive CLI, and detailed code explanations. The tool requires Python 3.9+ and API keys for various models like GPT, Google's Gemini, Cerebras, and Claude. It also integrates with MCP servers for advanced features like code search and documentation lookups. Users can create custom JSON agents for specialized tasks and access a variety of tools for file management, code execution, and reasoning sharing.

open-edison
OpenEdison is a secure MCP control panel that connects AI to data/software with additional security controls to reduce data exfiltration risks. It helps address the lethal trifecta problem by providing visibility, monitoring potential threats, and alerting on data interactions. The tool offers features like data leak monitoring, controlled execution, easy configuration, visibility into agent interactions, a simple API, and Docker support. It integrates with LangGraph, LangChain, and plain Python agents for observability and policy enforcement. OpenEdison helps gain observability, control, and policy enforcement for AI interactions with systems of records, existing company software, and data to reduce risks of AI-caused data leakage.

ck
ck (seek) is a semantic grep tool that finds code by meaning, not just keywords. It replaces traditional grep by understanding the user's search intent. It allows users to search for code based on concepts like 'error handling' and retrieves relevant code even if the exact keywords are not present. ck offers semantic search, drop-in grep compatibility, hybrid search combining keyword precision with semantic understanding, agent-friendly output in JSONL format, smart file filtering, and various advanced features. It supports multiple search modes, relevance scoring, top-K results, and smart exclusions. Users can index projects for semantic search, choose embedding models, and search specific files or directories. The tool is designed to improve code search efficiency and accuracy for developers and AI agents.

LightRAG
LightRAG is a repository hosting the code for LightRAG, a system that supports seamless integration of custom knowledge graphs, Oracle Database 23ai, Neo4J for storage, and multiple file types. It includes features like entity deletion, batch insert, incremental insert, and graph visualization. LightRAG provides an API server implementation for RESTful API access to RAG operations, allowing users to interact with it through HTTP requests. The repository also includes evaluation scripts, code for reproducing results, and a comprehensive code structure.

quantalogic
QuantaLogic is a ReAct framework for building advanced AI agents that seamlessly integrates large language models with a robust tool system. It aims to bridge the gap between advanced AI models and practical implementation in business processes by enabling agents to understand, reason about, and execute complex tasks through natural language interaction. The framework includes features such as ReAct Framework, Universal LLM Support, Secure Tool System, Real-time Monitoring, Memory Management, and Enterprise Ready components.

aider-desk
AiderDesk is a desktop application that enhances coding workflow by leveraging AI capabilities. It offers an intuitive GUI, project management, IDE integration, MCP support, settings management, cost tracking, structured messages, visual file management, model switching, code diff viewer, one-click reverts, and easy sharing. Users can install it by downloading the latest release and running the executable. AiderDesk also supports Python version detection and auto update disabling. It includes features like multiple project management, context file management, model switching, chat mode selection, question answering, cost tracking, MCP server integration, and MCP support for external tools and context. Development setup involves cloning the repository, installing dependencies, running in development mode, and building executables for different platforms. Contributions from the community are welcome following specific guidelines.

wikipedia-mcp
The Wikipedia MCP Server is a Model Context Protocol (MCP) server that provides real-time access to Wikipedia information for Large Language Models (LLMs). It allows AI assistants to retrieve accurate and up-to-date information from Wikipedia to enhance their responses. The server offers features such as searching Wikipedia, retrieving article content, getting article summaries, extracting specific sections, discovering links within articles, finding related topics, supporting multiple languages and country codes, optional caching for improved performance, and compatibility with Google ADK agents and other AI frameworks. Users can install the server using pipx, Smithery, PyPI, virtual environment, or from source. The server can be run with various options for transport protocol, language, country/locale, caching, access token, and more. It also supports Docker and Kubernetes deployment. The server provides MCP tools for interacting with Wikipedia, such as searching articles, getting article content, summaries, sections, links, coordinates, related topics, and extracting key facts. It also supports country/locale codes and language variants for languages like Chinese, Serbian, Kurdish, and Norwegian. The server includes example prompts for querying Wikipedia and provides MCP resources for interacting with Wikipedia through MCP endpoints. The project structure includes main packages, API implementation, core functionality, utility functions, and a comprehensive test suite for reliability and functionality testing.

superagent
Superagent is an open-source AI assistant framework and API that allows developers to add powerful AI assistants to their applications. These assistants use large language models (LLMs), retrieval augmented generation (RAG), and generative AI to help users with a variety of tasks, including question answering, chatbot development, content generation, data aggregation, and workflow automation. Superagent is backed by Y Combinator and is part of YC W24.

nexus
Nexus is a tool that acts as a unified gateway for multiple LLM providers and MCP servers. It allows users to aggregate, govern, and control their AI stack by connecting multiple servers and providers through a single endpoint. Nexus provides features like MCP Server Aggregation, LLM Provider Routing, Context-Aware Tool Search, Protocol Support, Flexible Configuration, Security features, Rate Limiting, and Docker readiness. It supports tool calling, tool discovery, and error handling for STDIO servers. Nexus also integrates with AI assistants, Cursor, Claude Code, and LangChain for seamless usage.

zotero-mcp
Zotero MCP seamlessly connects your Zotero research library with AI assistants like ChatGPT and Claude via the Model Context Protocol. It offers AI-powered semantic search, access to library content, PDF annotation extraction, and easy updates. Users can search their library, analyze citations, and get summaries, making it ideal for research tasks. The tool supports multiple embedding models, intelligent search results, and flexible access methods for both local and remote collaboration. With advanced features like semantic search and PDF annotation extraction, Zotero MCP enhances research efficiency and organization.

LLMVoX
LLMVoX is a lightweight 30M-parameter, LLM-agnostic, autoregressive streaming Text-to-Speech (TTS) system designed to convert text outputs from Large Language Models into high-fidelity streaming speech with low latency. It achieves significantly lower Word Error Rate compared to speech-enabled LLMs while operating at comparable latency and speech quality. Key features include being lightweight & fast with only 30M parameters, LLM-agnostic for easy integration with existing models, multi-queue streaming for continuous speech generation, and multilingual support for easy adaptation to new languages.

oxylabs-mcp
The Oxylabs MCP Server acts as a bridge between AI models and the web, providing clean, structured data from any site. It enables scraping of URLs, rendering JavaScript-heavy pages, content extraction for AI use, bypassing anti-scraping measures, and accessing geo-restricted web data from 195+ countries. The implementation utilizes the Model Context Protocol (MCP) to facilitate secure interactions between AI assistants and web content. Key features include scraping content from any site, automatic data cleaning and conversion, bypassing blocks and geo-restrictions, flexible setup with cross-platform support, and built-in error handling and request management.
For similar tasks


Backlog.md
Backlog.md is a Markdown-native Task Manager & Kanban visualizer for any Git repository. It turns any folder with a Git repo into a self-contained project board powered by plain Markdown files and a zero-config CLI. Features include managing tasks as plain .md files, private & offline usage, instant terminal Kanban visualization, board export, modern web interface, AI-ready CLI, rich query commands, cross-platform support, and MIT-licensed open-source. Users can create tasks, view board, assign tasks to AI, manage documentation, make decisions, and configure settings easily.

coco-app
Coco AI is a unified search platform that connects enterprise applications and data into a single, powerful search interface. The COCO App allows users to search and interact with their enterprise data across platforms. It also offers a Gen-AI Chat for Teams tailored to team's unique knowledge and internal resources, enhancing collaboration by making information instantly accessible and providing AI-driven insights based on enterprise's specific data.
For similar jobs

Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.

quivr
Quivr is a personal assistant powered by Generative AI, designed to be a second brain for users. It offers fast and efficient access to data, ensuring security and compatibility with various file formats. Quivr is open source and free to use, allowing users to share their brains publicly or keep them private. The marketplace feature enables users to share and utilize brains created by others, boosting productivity. Quivr's offline mode provides anytime, anywhere access to data. Key features include speed, security, OS compatibility, file compatibility, open source nature, public/private sharing options, a marketplace, and offline mode.

Avalonia-Assistant
Avalonia-Assistant is an open-source desktop intelligent assistant that aims to provide a user-friendly interactive experience based on the Avalonia UI framework and the integration of Semantic Kernel with OpenAI or other large LLM models. By utilizing Avalonia-Assistant, you can perform various desktop operations through text or voice commands, enhancing your productivity and daily office experience.

MetaGPT
MetaGPT is a multi-agent framework that enables GPT to work in a software company, collaborating to tackle more complex tasks. It assigns different roles to GPTs to form a collaborative entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories, competitive analysis, requirements, data structures, APIs, documents, etc. Internally, MetaGPT includes product managers, architects, project managers, and engineers. It provides the entire process of a software company along with carefully orchestrated SOPs. MetaGPT's core philosophy is "Code = SOP(Team)", materializing SOP and applying it to teams composed of LLMs.

UFO
UFO is a UI-focused dual-agent framework to fulfill user requests on Windows OS by seamlessly navigating and operating within individual or spanning multiple applications.

timefold-solver
Timefold Solver is an optimization engine evolved from OptaPlanner. Developed by the original OptaPlanner team, our aim is to free the world of wasteful planning.

MateCat
Matecat is an enterprise-level, web-based CAT tool designed to make post-editing and outsourcing easy and to provide a complete set of features to manage and monitor translation projects.

crewAI
crewAI is a cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks. It provides a flexible and structured approach to AI collaboration, enabling users to define agents with specific roles, goals, and tools, and assign them tasks within a customizable process. crewAI supports integration with various LLMs, including OpenAI, and offers features such as autonomous task delegation, flexible task management, and output parsing. It is open-source and welcomes contributions, with a focus on improving the library based on usage data collected through anonymous telemetry.