
mcp-documentation-server
MCP Documentation Server - Bridge the AI Knowledge Gap. ✨ Features: Document management • Gemini integration • AI-powered semantic search • File uploads • Smart chunking • Multilingual support • Zero-setup 🎯 Perfect for: New frameworks • API docs • Internal guides
Stars: 205

The mcp-documentation-server is a local-first Model Context Protocol (MCP) server for document management and semantic search. It lets teams add documents (including .txt, .md, and .pdf files), search them with embedding-based semantic queries, and optionally analyze them with Google Gemini AI, all stored on disk with no external database. With mcp-documentation-server, teams can make project documentation directly searchable by MCP clients such as Claude Desktop.
README:
A TypeScript-based Model Context Protocol (MCP) server that provides local-first document management and semantic search using embeddings. The server exposes a collection of MCP tools and is optimized for performance with on-disk persistence, an in-memory index, and caching.
NEW! Enhanced with Google Gemini AI for advanced document analysis and contextual understanding. Ask complex questions and get intelligent summaries, explanations, and insights from your documents. To get an API key, go to Google AI Studio.
- Intelligent Document Analysis: Gemini AI understands context, relationships, and concepts
- Natural Language Queries: Ask a question, not just keywords
- Smart Summarization: Get comprehensive overviews and explanations
- Contextual Insights: Understand how different parts of your documents relate
- File Mapping Cache: Avoid re-uploading the same files to Gemini for efficiency
- AI-Powered Search 🤖: Advanced document analysis with Gemini AI for contextual understanding and intelligent insights
- Traditional Semantic Search: Chunk-based search using embeddings plus in-memory keyword index
- Context Window Retrieval: Gather surrounding chunks for richer LLM answers
- O(1) document lookup and keyword index through `DocumentIndex` for instant retrieval
- LRU `EmbeddingCache` to avoid recomputing embeddings and speed up repeated queries (a minimal sketch follows this list)
- Parallel chunking and batch processing to accelerate ingestion of large documents
- Streaming file reader to process large files without high memory usage
- Intelligent file handling: copy-based storage with automatic backup preservation
- Complete deletion: removes both JSON files and associated original files
- Local-only storage: no external database required. All data resides in `~/.mcp-documentation-server/`
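To make the caching idea concrete, here is a minimal TypeScript sketch of an LRU embedding cache. The class name and shape are assumptions for illustration, not the server's actual `EmbeddingCache` implementation:

```typescript
// Hypothetical LRU cache keyed by input text; illustrates the eviction
// policy only, not the project's real EmbeddingCache.
class LruEmbeddingCache {
  private cache = new Map<string, number[]>();

  constructor(private maxSize = 1000) {} // mirrors the MCP_CACHE_SIZE default

  get(text: string): number[] | undefined {
    const hit = this.cache.get(text);
    if (hit !== undefined) {
      // Re-insert so this entry becomes the most recently used.
      this.cache.delete(text);
      this.cache.set(text, hit);
    }
    return hit;
  }

  set(text: string, embedding: number[]): void {
    if (this.cache.has(text)) this.cache.delete(text);
    this.cache.set(text, embedding);
    if (this.cache.size > this.maxSize) {
      // Maps iterate in insertion order, so the first key is the oldest.
      const oldest = this.cache.keys().next().value as string;
      this.cache.delete(oldest);
    }
  }
}
```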
Example configuration for an MCP client (e.g., Claude Desktop):
```json
{
  "mcpServers": {
    "documentation": {
      "command": "npx",
      "args": ["-y", "@andrea9293/mcp-documentation-server"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here",
        "MCP_EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2"
      }
    }
  }
}
```
`GEMINI_API_KEY` is optional; it enables AI-powered search.
- Add documents using the `add_document` tool, or by placing `.txt`, `.md`, or `.pdf` files into the uploads folder and calling `process_uploads`.
- Search documents with `search_documents` to get ranked chunk hits (a ranking sketch follows this list).
- Use `get_context_window` to fetch neighboring chunks and provide LLMs with richer context.
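As a rough picture of how ranked chunk hits can be produced, the sketch below scores chunks by cosine similarity against the query embedding. The `Chunk` shape and the ranking function are assumptions for illustration, not the project's actual code:

```typescript
interface Chunk {
  index: number;
  text: string;
  embedding: number[];
}

// Standard cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk once, then return the top `limit` hits.
function rankChunks(queryEmbedding: number[], chunks: Chunk[], limit = 5): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.chunk);
}
```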
The server exposes several tools (validated with Zod schemas) for document lifecycle and search:
- `add_document` — Add a document (title, content, metadata)
- `list_documents` — List stored documents and metadata
- `get_document` — Retrieve a full document by id
- `delete_document` — Remove a document, its chunks, and associated original files
- `process_uploads` — Convert files in the uploads folder into documents (chunking + embeddings + backup preservation)
- `get_uploads_path` — Return the absolute uploads folder path
- `list_uploads_files` — List files in the uploads folder
- `search_documents_with_ai` — 🤖 AI-powered search using Gemini for advanced document analysis (requires `GEMINI_API_KEY`)
- `search_documents` — Semantic search within a document (returns chunk hits and an LLM hint)
- `get_context_window` — Return a window of chunks around a target chunk index
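Since tool inputs are validated with Zod schemas, an illustrative schema for `search_documents` might look like the following. The field names are taken from the usage examples below; the project's actual schema may differ:

```typescript
import { z } from "zod";

// Assumed schema for search_documents arguments (illustrative only).
const searchDocumentsSchema = z.object({
  document_id: z.string(),
  query: z.string().min(1),
  limit: z.number().int().positive().default(5),
});

type SearchDocumentsArgs = z.infer<typeof searchDocumentsSchema>;
```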
Configure behavior via environment variables. Important options:
- `MCP_EMBEDDING_MODEL` — embedding model name (default: `Xenova/all-MiniLM-L6-v2`). Changing the model requires re-adding documents.
- `GEMINI_API_KEY` — Google Gemini API key for AI-powered search features (optional; enables `search_documents_with_ai`).
- `MCP_INDEXING_ENABLED` — enable/disable the `DocumentIndex` (true/false). Default: `true`.
- `MCP_CACHE_SIZE` — LRU embedding cache size (integer). Default: `1000`.
- `MCP_PARALLEL_ENABLED` — enable parallel chunking (true/false). Default: `true`.
- `MCP_MAX_WORKERS` — number of parallel workers for chunking/indexing. Default: `4`.
- `MCP_STREAMING_ENABLED` — enable streaming reads for large files. Default: `true`.
- `MCP_STREAM_CHUNK_SIZE` — streaming buffer size in bytes. Default: `65536` (64 KB).
- `MCP_STREAM_FILE_SIZE_LIMIT` — file size threshold in bytes above which the streaming path is used. Default: `10485760` (10 MB).
Example `.env` (defaults applied when variables are not set):
```bash
MCP_INDEXING_ENABLED=true            # Enable O(1) indexing (default: true)
GEMINI_API_KEY=your-api-key-here     # Google Gemini API key (optional)
MCP_CACHE_SIZE=1000                  # LRU cache size (default: 1000)
MCP_PARALLEL_ENABLED=true            # Enable parallel processing (default: true)
MCP_MAX_WORKERS=4                    # Parallel worker count (default: 4)
MCP_STREAMING_ENABLED=true           # Enable streaming (default: true)
MCP_STREAM_CHUNK_SIZE=65536          # Stream chunk size (default: 64 KB)
MCP_STREAM_FILE_SIZE_LIMIT=10485760  # Streaming threshold (default: 10 MB)
```
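A minimal sketch of how such defaults could be resolved at startup, assuming Node's `process.env`; the helper names are hypothetical:

```typescript
// Parse a boolean env var, falling back to a default when unset.
function envBool(name: string, fallback: boolean): boolean {
  const raw = process.env[name];
  return raw === undefined ? fallback : raw.toLowerCase() === "true";
}

// Parse an integer env var, falling back to a default when unset or invalid.
function envInt(name: string, fallback: number): number {
  const parsed = Number.parseInt(process.env[name] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

const config = {
  embeddingModel: process.env.MCP_EMBEDDING_MODEL ?? "Xenova/all-MiniLM-L6-v2",
  indexingEnabled: envBool("MCP_INDEXING_ENABLED", true),
  cacheSize: envInt("MCP_CACHE_SIZE", 1000),
  parallelEnabled: envBool("MCP_PARALLEL_ENABLED", true),
  maxWorkers: envInt("MCP_MAX_WORKERS", 4),
  streamingEnabled: envBool("MCP_STREAMING_ENABLED", true),
  streamChunkSize: envInt("MCP_STREAM_CHUNK_SIZE", 65536),
  streamFileSizeLimit: envInt("MCP_STREAM_FILE_SIZE_LIMIT", 10485760),
};
```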
Default storage layout (data directory):
```
~/.mcp-documentation-server/
├── data/      # Document JSON files
└── uploads/   # Drop files (.txt, .md, .pdf) to import
```
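In code, that layout corresponds to paths like these (an assumed resolution, shown for orientation):

```typescript
import * as os from "node:os";
import * as path from "node:path";

const baseDir = path.join(os.homedir(), ".mcp-documentation-server");
const dataDir = path.join(baseDir, "data");       // document JSON files
const uploadsDir = path.join(baseDir, "uploads"); // files dropped for import
```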
Add a document via MCP tool:
```json
{
  "tool": "add_document",
  "arguments": {
    "title": "Python Basics",
    "content": "Python is a high-level programming language...",
    "metadata": {
      "category": "programming",
      "tags": ["python", "tutorial"]
    }
  }
}
```
Search a document:
```json
{
  "tool": "search_documents",
  "arguments": {
    "document_id": "doc-123",
    "query": "variable assignment",
    "limit": 5
  }
}
```
Advanced Analysis (requires `GEMINI_API_KEY`):
```json
{
  "tool": "search_documents_with_ai",
  "arguments": {
    "document_id": "doc-123",
    "query": "explain the main concepts and their relationships"
  }
}
```
Complex Questions:
```json
{
  "tool": "search_documents_with_ai",
  "arguments": {
    "document_id": "doc-123",
    "query": "what are the key architectural patterns and how do they work together?"
  }
}
```
Summarization Requests:
```json
{
  "tool": "search_documents_with_ai",
  "arguments": {
    "document_id": "doc-123",
    "query": "summarize the core principles and provide examples"
  }
}
```
Fetch context window:
```json
{
  "tool": "get_context_window",
  "arguments": {
    "document_id": "doc-123",
    "chunk_index": 5,
    "before": 2,
    "after": 2
  }
}
```
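The windowing logic itself is simple; a hypothetical sketch (not the server's actual implementation) clamps the requested range to the document bounds:

```typescript
// Return the chunks from (chunkIndex - before) to (chunkIndex + after),
// clamped to the valid range of the document's chunk array.
function getContextWindow<T>(chunks: T[], chunkIndex: number, before: number, after: number): T[] {
  const start = Math.max(0, chunkIndex - before);
  const end = Math.min(chunks.length, chunkIndex + after + 1);
  return chunks.slice(start, end);
}
```

With `chunk_index: 5, before: 2, after: 2`, this would return chunks 3 through 7.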
Example query styles for AI-powered search:
- Complex Questions: "How do these concepts relate to each other?"
- Summarization: "Give me an overview of the main principles"
- Analysis: "What are the key patterns and their trade-offs?"
- Explanation: "Explain this topic as if I were new to it"
- Comparison: "Compare these different approaches"
Benefits of AI-powered search:
- Smart Caching: File mapping prevents re-uploading the same content
- Efficient Processing: Only relevant sections are analyzed by Gemini
- Contextual Results: More accurate and comprehensive answers
- Natural Interaction: Ask questions in plain English
Embedding models are downloaded on first use; some models require several hundred MB of downloads.
-
The
DocumentIndex
persists an index file and can be rebuilt if necessary. -
The
EmbeddingCache
can be warmed by callingprocess_uploads
, issuing curated queries, or using a preload API when available.
Available embedding models, set via the `MCP_EMBEDDING_MODEL` environment variable:
- `Xenova/all-MiniLM-L6-v2` (default) - Fast, good quality (384 dimensions)
- `Xenova/paraphrase-multilingual-mpnet-base-v2` (recommended) - Best quality, multilingual (768 dimensions)

The system automatically manages the correct embedding dimension for each model. Embedding providers expose their dimension via `getDimensions()`.
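A sketch of what such a provider contract might look like; only `getDimensions()` is named in the README, so `embed()` and the validation helper are assumptions:

```typescript
interface EmbeddingProvider {
  getDimensions(): number; // 384 for all-MiniLM-L6-v2, 768 for the mpnet model
  embed(text: string): Promise<number[]>;
}

// Guard against mixing embeddings produced by different models.
function assertDimension(provider: EmbeddingProvider, embedding: number[]): void {
  const expected = provider.getDimensions();
  if (embedding.length !== expected) {
    throw new Error(
      `Got ${embedding.length} dimensions, expected ${expected}; ` +
        "re-add documents after changing MCP_EMBEDDING_MODEL.",
    );
  }
}
```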
```bash
git clone https://github.com/andrea9293/mcp-documentation-server.git
cd mcp-documentation-server
npm run dev
npm run build
npm run inspect
```
- Fork the repository
- Create a feature branch: `git checkout -b feature/name`
- Follow Conventional Commits for messages
- Open a pull request
MIT - see LICENSE file
Built with FastMCP and TypeScript 🚀
Alternative AI tools for mcp-documentation-server
Similar Open Source Tools


mcp-omnisearch
mcp-omnisearch is a Model Context Protocol (MCP) server that acts as a unified gateway to multiple search providers and AI tools. It integrates Tavily, Perplexity, Kagi, Jina AI, Brave, Exa AI, and Firecrawl to offer a wide range of search, AI response, content processing, and enhancement features through a single interface. The server provides powerful search capabilities, AI response generation, content extraction, summarization, web scraping, structured data extraction, and more. It is designed to work flexibly with the API keys available, enabling users to activate only the providers they have keys for and easily add more as needed.

memento-mcp
Memento MCP is a scalable, high-performance knowledge graph memory system designed for LLMs. It offers semantic retrieval, contextual recall, and temporal awareness to any LLM client supporting the model context protocol. The system is built on core concepts like entities and relations, utilizing Neo4j as its storage backend for unified graph and vector search capabilities. With advanced features such as semantic search, temporal awareness, confidence decay, and rich metadata support, Memento MCP provides a robust solution for managing knowledge graphs efficiently and effectively.

supergateway
Supergateway is a tool that allows running MCP stdio-based servers over SSE (Server-Sent Events) with one command. It is useful for remote access, debugging, or connecting to SSE-based clients when your MCP server only speaks stdio. The tool supports running in SSE to Stdio mode as well, where it connects to a remote SSE server and exposes a local stdio interface for downstream clients. Supergateway can be used with ngrok to share local MCP servers with remote clients and can also be run in a Docker containerized deployment. It is designed with modularity in mind, ensuring compatibility and ease of use for AI tools exchanging data.

code_puppy
Code Puppy is an AI-powered code generation agent designed to understand programming tasks, generate high-quality code, and explain its reasoning. It supports multi-language code generation, interactive CLI, and detailed code explanations. The tool requires Python 3.9+ and API keys for various models like GPT, Google's Gemini, Cerebras, and Claude. It also integrates with MCP servers for advanced features like code search and documentation lookups. Users can create custom JSON agents for specialized tasks and access a variety of tools for file management, code execution, and reasoning sharing.

open-edison
OpenEdison is a secure MCP control panel that connects AI to data/software with additional security controls to reduce data exfiltration risks. It helps address the lethal trifecta problem by providing visibility, monitoring potential threats, and alerting on data interactions. The tool offers features like data leak monitoring, controlled execution, easy configuration, visibility into agent interactions, a simple API, and Docker support. It integrates with LangGraph, LangChain, and plain Python agents for observability and policy enforcement. OpenEdison helps gain observability, control, and policy enforcement for AI interactions with systems of records, existing company software, and data to reduce risks of AI-caused data leakage.

ck
ck (seek) is a semantic grep tool that finds code by meaning, not just keywords. It replaces traditional grep by understanding the user's search intent. It allows users to search for code based on concepts like 'error handling' and retrieves relevant code even if the exact keywords are not present. ck offers semantic search, drop-in grep compatibility, hybrid search combining keyword precision with semantic understanding, agent-friendly output in JSONL format, smart file filtering, and various advanced features. It supports multiple search modes, relevance scoring, top-K results, and smart exclusions. Users can index projects for semantic search, choose embedding models, and search specific files or directories. The tool is designed to improve code search efficiency and accuracy for developers and AI agents.

LightRAG
LightRAG is a repository hosting the code for LightRAG, a system that supports seamless integration of custom knowledge graphs, Oracle Database 23ai, Neo4J for storage, and multiple file types. It includes features like entity deletion, batch insert, incremental insert, and graph visualization. LightRAG provides an API server implementation for RESTful API access to RAG operations, allowing users to interact with it through HTTP requests. The repository also includes evaluation scripts, code for reproducing results, and a comprehensive code structure.

quantalogic
QuantaLogic is a ReAct framework for building advanced AI agents that seamlessly integrates large language models with a robust tool system. It aims to bridge the gap between advanced AI models and practical implementation in business processes by enabling agents to understand, reason about, and execute complex tasks through natural language interaction. The framework includes features such as ReAct Framework, Universal LLM Support, Secure Tool System, Real-time Monitoring, Memory Management, and Enterprise Ready components.

aider-desk
AiderDesk is a desktop application that enhances coding workflow by leveraging AI capabilities. It offers an intuitive GUI, project management, IDE integration, MCP support, settings management, cost tracking, structured messages, visual file management, model switching, code diff viewer, one-click reverts, and easy sharing. Users can install it by downloading the latest release and running the executable. AiderDesk also supports Python version detection and auto update disabling. It includes features like multiple project management, context file management, model switching, chat mode selection, question answering, cost tracking, MCP server integration, and MCP support for external tools and context. Development setup involves cloning the repository, installing dependencies, running in development mode, and building executables for different platforms. Contributions from the community are welcome following specific guidelines.

wikipedia-mcp
The Wikipedia MCP Server is a Model Context Protocol (MCP) server that provides real-time access to Wikipedia information for Large Language Models (LLMs). It allows AI assistants to retrieve accurate and up-to-date information from Wikipedia to enhance their responses. The server offers features such as searching Wikipedia, retrieving article content, getting article summaries, extracting specific sections, discovering links within articles, finding related topics, supporting multiple languages and country codes, optional caching for improved performance, and compatibility with Google ADK agents and other AI frameworks. Users can install the server using pipx, Smithery, PyPI, virtual environment, or from source. The server can be run with various options for transport protocol, language, country/locale, caching, access token, and more. It also supports Docker and Kubernetes deployment. The server provides MCP tools for interacting with Wikipedia, such as searching articles, getting article content, summaries, sections, links, coordinates, related topics, and extracting key facts. It also supports country/locale codes and language variants for languages like Chinese, Serbian, Kurdish, and Norwegian. The server includes example prompts for querying Wikipedia and provides MCP resources for interacting with Wikipedia through MCP endpoints. The project structure includes main packages, API implementation, core functionality, utility functions, and a comprehensive test suite for reliability and functionality testing.

superagent
Superagent is an open-source AI assistant framework and API that allows developers to add powerful AI assistants to their applications. These assistants use large language models (LLMs), retrieval augmented generation (RAG), and generative AI to help users with a variety of tasks, including question answering, chatbot development, content generation, data aggregation, and workflow automation. Superagent is backed by Y Combinator and is part of YC W24.

nexus
Nexus is a tool that acts as a unified gateway for multiple LLM providers and MCP servers. It allows users to aggregate, govern, and control their AI stack by connecting multiple servers and providers through a single endpoint. Nexus provides features like MCP Server Aggregation, LLM Provider Routing, Context-Aware Tool Search, Protocol Support, Flexible Configuration, Security features, Rate Limiting, and Docker readiness. It supports tool calling, tool discovery, and error handling for STDIO servers. Nexus also integrates with AI assistants, Cursor, Claude Code, and LangChain for seamless usage.

zotero-mcp
Zotero MCP seamlessly connects your Zotero research library with AI assistants like ChatGPT and Claude via the Model Context Protocol. It offers AI-powered semantic search, access to library content, PDF annotation extraction, and easy updates. Users can search their library, analyze citations, and get summaries, making it ideal for research tasks. The tool supports multiple embedding models, intelligent search results, and flexible access methods for both local and remote collaboration. With advanced features like semantic search and PDF annotation extraction, Zotero MCP enhances research efficiency and organization.

LLMVoX
LLMVoX is a lightweight 30M-parameter, LLM-agnostic, autoregressive streaming Text-to-Speech (TTS) system designed to convert text outputs from Large Language Models into high-fidelity streaming speech with low latency. It achieves significantly lower Word Error Rate compared to speech-enabled LLMs while operating at comparable latency and speech quality. Key features include being lightweight & fast with only 30M parameters, LLM-agnostic for easy integration with existing models, multi-queue streaming for continuous speech generation, and multilingual support for easy adaptation to new languages.

oxylabs-mcp
The Oxylabs MCP Server acts as a bridge between AI models and the web, providing clean, structured data from any site. It enables scraping of URLs, rendering JavaScript-heavy pages, content extraction for AI use, bypassing anti-scraping measures, and accessing geo-restricted web data from 195+ countries. The implementation utilizes the Model Context Protocol (MCP) to facilitate secure interactions between AI assistants and web content. Key features include scraping content from any site, automatic data cleaning and conversion, bypassing blocks and geo-restrictions, flexible setup with cross-platform support, and built-in error handling and request management.
For similar tasks


Backlog.md
Backlog.md is a Markdown-native Task Manager & Kanban visualizer for any Git repository. It turns any folder with a Git repo into a self-contained project board powered by plain Markdown files and a zero-config CLI. Features include managing tasks as plain .md files, private & offline usage, instant terminal Kanban visualization, board export, modern web interface, AI-ready CLI, rich query commands, cross-platform support, and MIT-licensed open-source. Users can create tasks, view board, assign tasks to AI, manage documentation, make decisions, and configure settings easily.

coco-app
Coco AI is a unified search platform that connects enterprise applications and data into a single, powerful search interface. The COCO App allows users to search and interact with their enterprise data across platforms. It also offers a Gen-AI Chat for Teams tailored to team's unique knowledge and internal resources, enhancing collaboration by making information instantly accessible and providing AI-driven insights based on enterprise's specific data.
For similar jobs

Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.

quivr
Quivr is a personal assistant powered by Generative AI, designed to be a second brain for users. It offers fast and efficient access to data, ensuring security and compatibility with various file formats. Quivr is open source and free to use, allowing users to share their brains publicly or keep them private. The marketplace feature enables users to share and utilize brains created by others, boosting productivity. Quivr's offline mode provides anytime, anywhere access to data. Key features include speed, security, OS compatibility, file compatibility, open source nature, public/private sharing options, a marketplace, and offline mode.

Avalonia-Assistant
Avalonia-Assistant is an open-source desktop intelligent assistant that aims to provide a user-friendly interactive experience based on the Avalonia UI framework and the integration of Semantic Kernel with OpenAI or other large LLM models. By utilizing Avalonia-Assistant, you can perform various desktop operations through text or voice commands, enhancing your productivity and daily office experience.

MetaGPT
MetaGPT is a multi-agent framework that enables GPT to work in a software company, collaborating to tackle more complex tasks. It assigns different roles to GPTs to form a collaborative entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories, competitive analysis, requirements, data structures, APIs, documents, etc. Internally, MetaGPT includes product managers, architects, project managers, and engineers. It provides the entire process of a software company along with carefully orchestrated SOPs. MetaGPT's core philosophy is "Code = SOP(Team)", materializing SOP and applying it to teams composed of LLMs.

UFO
UFO is a UI-focused dual-agent framework to fulfill user requests on Windows OS by seamlessly navigating and operating within individual or spanning multiple applications.

timefold-solver
Timefold Solver is an optimization engine evolved from OptaPlanner. Developed by the original OptaPlanner team, our aim is to free the world of wasteful planning.

MateCat
Matecat is an enterprise-level, web-based CAT tool designed to make post-editing and outsourcing easy and to provide a complete set of features to manage and monitor translation projects.

crewAI
crewAI is a cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks. It provides a flexible and structured approach to AI collaboration, enabling users to define agents with specific roles, goals, and tools, and assign them tasks within a customizable process. crewAI supports integration with various LLMs, including OpenAI, and offers features such as autonomous task delegation, flexible task management, and output parsing. It is open-source and welcomes contributions, with a focus on improving the library based on usage data collected through anonymous telemetry.