mcpdoc

Expose llms-txt to IDEs for development

Stars: 148

Visit

The MCP LLMS-TXT Documentation Server is an open-source server that provides developers full control over tools used by applications like Cursor, Windsurf, and Claude Code/Desktop. It allows users to create a user-defined list of `llms.txt` files and use a `fetch_docs` tool to read URLs within these files, enabling auditing of tool calls and context returned. The server supports various applications and provides a way to connect to them, configure rules, and test tool calls for tasks related to documentation retrieval and processing.

README:

MCP LLMS-TXT Documentation Server

Overview

llms.txt is a website index for LLMs, providing background information, guidance, and links to detailed markdown files. IDEs like Cursor and Windsurf or apps like Claude Code/Desktop can use llms.txt to retrieve context for tasks. However, these apps use different built-in tools to read and process files like llms.txt. The retrieval process can be opaque, and there is not always a way to audit the tool calls or the context returned.

MCP offers a way for developers to have full control over tools used by these applications. Here, we create an open source MCP server to provide MCP host applications (e.g., Cursor, Windsurf, Claude Code/Desktop) with (1) a user-defined list of llms.txt files and (2) a simple fetch_docs tool read URLs within any of the provided llms.txt files. This allows the user to audit each tool call as well as the context returned.

llms-txt

You can find llms.txt files for langgraph and langchain here:

Library	llms.txt
LangGraph Python	https://langchain-ai.github.io/langgraph/llms.txt
LangGraph JS	https://langchain-ai.github.io/langgraphjs/llms.txt
LangChain Python	https://python.langchain.com/llms.txt
LangChain JS	https://js.langchain.com/llms.txt

Quickstart

Install uv

Please see official uv docs for other ways to install uv.

curl -LsSf https://astral.sh/uv/install.sh | sh

Choose an `llms.txt` file to use.

For example, here's the LangGraph llms.txt file.

Note: Security and Domain Access Control

For security reasons, mcpdoc implements strict domain access controls:

Remote llms.txt files: When you specify a remote llms.txt URL (e.g., https://langchain-ai.github.io/langgraph/llms.txt), mcpdoc automatically adds only that specific domain (langchain-ai.github.io) to the allowed domains list. This means the tool can only fetch documentation from URLs on that domain.

Local llms.txt files: When using a local file, NO domains are automatically added to the allowed list. You MUST explicitly specify which domains to allow using the --allowed-domains parameter.

Adding additional domains: To allow fetching from domains beyond those automatically included:

Use --allowed-domains domain1.com domain2.com to add specific domains

Use --allowed-domains '*' to allow all domains (use with caution)

This security measure prevents unauthorized access to domains not explicitly approved by the user, ensuring that documentation can only be retrieved from trusted sources.

(Optional) Test the MCP server locally with your `llms.txt` file(s) of choice:

uvx --from mcpdoc mcpdoc \
    --urls LangGraph:https://langchain-ai.github.io/langgraph/llms.txt \
    --urls LangChain:https://python.langchain.com/llms.txt \
    --transport sse \
    --port 8082 \
    --host localhost

This should run at: http://localhost:8082

Run MCP inspector and connect to the running server:

npx @modelcontextprotocol/inspector

Here, you can test the tool calls.

Connect to Cursor

Open Cursor Settings and MCP tab.
This will open the ~/.cursor/mcp.json file.

Paste the following into the file (we use the langgraph-docs-mcp name and link to the LangGraph llms.txt).

{
  "mcpServers": {
    "langgraph-docs-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "mcpdoc",
        "mcpdoc",
        "--urls",
        "LangGraph:https://langchain-ai.github.io/langgraph/llms.txt",
        "--urls",
        "LangChain:https://python.langchain.com/llms.txt",
        "--transport",
        "stdio"
      ]
    }
  }
}

Confirm that the server is running in your Cursor Settings/MCP tab.
Best practice is to then update Cursor Global (User) rules.
Open Cursor Settings/Rules and update User Rules with the following (or similar):

for ANY question about LangGraph, use the langgraph-docs-mcp server to help answer -- 
+ call list_doc_sources tool to get the available llms.txt file
+ call fetch_docs tool to read it
+ reflect on the urls in llms.txt 
+ reflect on the input question 
+ call fetch_docs on any urls relevant to the question
+ use this to answer the question

CMD+L (on Mac) to open chat.
Ensure agent is selected.

Then, try an example prompt, such as:

what are types of memory in LangGraph?

Connect to Windsurf

Open Cascade with CMD+L (on Mac).
Click Configure MCP to open the config file, ~/.codeium/windsurf/mcp_config.json.
Update with langgraph-docs-mcp as noted above.

Update Windsurf Rules/Global rules with the following (or similar):

for ANY question about LangGraph, use the langgraph-docs-mcp server to help answer -- 
+ call list_doc_sources tool to get the available llms.txt file
+ call fetch_docs tool to read it
+ reflect on the urls in llms.txt 
+ reflect on the input question 
+ call fetch_docs on any urls relevant to the question

Then, try the example prompt:

It will perform your tool calls.

Connect to Claude Desktop

Open Settings/Developer to update ~/Library/Application\ Support/Claude/claude_desktop_config.json.
Update with langgraph-docs-mcp as noted above.
Restart Claude Desktop app.

[!Note] If you run into issues with Python version incompatibility when trying to add MCPDoc tools to Claude Desktop, you can explicitly specify the filepath to python executable in the uvx command.
Example configuration
{
  "mcpServers": {
    "langgraph-docs-mcp": {
      "command": "uvx",
      "args": [
        "--python",
        "/path/to/python",
        "--from",
        "mcpdoc",
        "mcpdoc",
        "--urls",
        "LangGraph:https://langchain-ai.github.io/langgraph/llms.txt",
        "--transport",
        "stdio"
      ]
    }
  }
}

[!Note] Currently (3/21/25) it appears that Claude Desktop does not support rules for global rules, so appending the following to your prompt.

<rules>
for ANY question about LangGraph, use the langgraph-docs-mcp server to help answer -- 
+ call list_doc_sources tool to get the available llms.txt file
+ call fetch_docs tool to read it
+ reflect on the urls in llms.txt 
+ reflect on the input question 
+ call fetch_docs on any urls relevant to the question
</rules>

You will see your tools visible in the bottom right of your chat input.

Then, try the example prompt:

It will ask to approve tool calls as it processes your request.

Connect to Claude Code

In a terminal after installing Claude Code, run this command to add the MCP server to your project:

claude mcp add-json langgraph-docs '{"type":"stdio","command":"uvx" ,"args":["--from", "mcpdoc", "mcpdoc", "--urls", "langgraph:https://langchain-ai.github.io/langgraph/llms.txt", "--urls", "LangChain:https://python.langchain.com/llms.txt"]}' -s local

You will see ~/.claude.json updated.
Test by launching Claude Code and running to view your tools:

$ Claude
$ /mcp

[!Note] Currently (3/21/25) it appears that Claude Code does not support rules for global rules, so appending the following to your prompt.

<rules>
for ANY question about LangGraph, use the langgraph-docs-mcp server to help answer -- 
+ call list_doc_sources tool to get the available llms.txt file
+ call fetch_docs tool to read it
+ reflect on the urls in llms.txt 
+ reflect on the input question 
+ call fetch_docs on any urls relevant to the question
</rules>

Then, try the example prompt:

It will ask to approve tool calls.

Command-line Interface

The mcpdoc command provides a simple CLI for launching the documentation server.

You can specify documentation sources in three ways, and these can be combined:

Using a YAML config file:

This will load the LangGraph Python documentation from the sample_config.yaml file in this repo.

mcpdoc --yaml sample_config.yaml

Using a JSON config file:

This will load the LangGraph Python documentation from the sample_config.json file in this repo.

mcpdoc --json sample_config.json

Directly specifying llms.txt URLs with optional names:

URLs can be specified either as plain URLs or with optional names using the format name:url.
You can specify multiple URLs by using the --urls parameter multiple times.
This is how we loaded llms.txt for the MCP server above.

mcpdoc --urls LangGraph:https://langchain-ai.github.io/langgraph/llms.txt --urls LangChain:https://python.langchain.com/llms.txt

You can also combine these methods to merge documentation sources:

mcpdoc --yaml sample_config.yaml --json sample_config.json --urls LangGraph:https://langchain-ai.github.io/langgraph/llms.txt --urls LangChain:https://python.langchain.com/llms.txt

Additional Options

--follow-redirects: Follow HTTP redirects (defaults to False)
--timeout SECONDS: HTTP request timeout in seconds (defaults to 10.0)

Example with additional options:

mcpdoc --yaml sample_config.yaml --follow-redirects --timeout 15

This will load the LangGraph Python documentation with a 15-second timeout and follow any HTTP redirects if necessary.

Configuration Format

Both YAML and JSON configuration files should contain a list of documentation sources.

Each source must include an llms_txt URL and can optionally include a name:

YAML Configuration Example (sample_config.yaml)

# Sample configuration for mcp-mcpdoc server
# Each entry must have a llms_txt URL and optionally a name
- name: LangGraph Python
  llms_txt: https://langchain-ai.github.io/langgraph/llms.txt

JSON Configuration Example (sample_config.json)

[
  {
    "name": "LangGraph Python",
    "llms_txt": "https://langchain-ai.github.io/langgraph/llms.txt"
  }
]

Programmatic Usage

from mcpdoc.main import create_server

# Create a server with documentation sources
server = create_server(
    [
        {
            "name": "LangGraph Python",
            "llms_txt": "https://langchain-ai.github.io/langgraph/llms.txt",
        },
        # You can add multiple documentation sources
        # {
        #     "name": "Another Documentation",
        #     "llms_txt": "https://example.com/llms.txt",
        # },
    ],
    follow_redirects=True,
    timeout=15.0,
)

# Run the server
server.run(transport="stdio")

For Tasks:

Click tags to check more tools for each tasks

fetch documentation audit tool calls configure rules test tool calls retrieve context

For Jobs:

software developer technical writer devops engineer data scientist ai engineer

Alternative AI tools for mcpdoc

Similar Open Source Tools

mcpdoc

github

: 148

cursor-tools

cursor-tools is a CLI tool designed to enhance AI agents with advanced skills, such as web search, repository context, documentation generation, GitHub integration, Xcode tools, and browser automation. It provides features like Perplexity for web search, Gemini 2.0 for codebase context, and Stagehand for browser operations. The tool requires API keys for Perplexity AI and Google Gemini, and supports global installation for system-wide access. It offers various commands for different tasks and integrates with Cursor Composer for AI agent usage.

github

: 3.5k

shortest

Shortest is an AI-powered natural language end-to-end testing framework built on Playwright. It provides a seamless testing experience by allowing users to write tests in natural language and execute them using Anthropic Claude API. The framework also offers GitHub integration with 2FA support, making it suitable for testing web applications with complex authentication flows. Shortest simplifies the testing process by enabling users to run tests locally or in CI/CD pipelines, ensuring the reliability and efficiency of web applications.

github

: 4.4k

langserve

LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

github

: 1.9k

exif-photo-blog

EXIF Photo Blog is a full-stack photo blog application built with Next.js, Vercel, and Postgres. It features built-in authentication, photo upload with EXIF extraction, photo organization by tag, infinite scroll, light/dark mode, automatic OG image generation, a CMD-K menu with photo search, experimental support for AI-generated descriptions, and support for Fujifilm simulations. The application is easy to deploy to Vercel with just a few clicks and can be customized with a variety of environment variables.

github

: 1.4k

code2prompt

Code2Prompt is a powerful command-line tool that generates comprehensive prompts from codebases, designed to streamline interactions between developers and Large Language Models (LLMs) for code analysis, documentation, and improvement tasks. It bridges the gap between codebases and LLMs by converting projects into AI-friendly prompts, enabling users to leverage AI for various software development tasks. The tool offers features like holistic codebase representation, intelligent source tree generation, customizable prompt templates, smart token management, Gitignore integration, flexible file handling, clipboard-ready output, multiple output options, and enhanced code readability.

github

: 734

bedrock-claude-chat

This repository is a sample chatbot using the Anthropic company's LLM Claude, one of the foundational models provided by Amazon Bedrock for generative AI. It allows users to have basic conversations with the chatbot, personalize it with their own instructions and external knowledge, and analyze usage for each user/bot on the administrator dashboard. The chatbot supports various languages, including English, Japanese, Korean, Chinese, French, German, and Spanish. Deployment is straightforward and can be done via the command line or by using AWS CDK. The architecture is built on AWS managed services, eliminating the need for infrastructure management and ensuring scalability, reliability, and security.

github

: 1.1k

python-tgpt

Python-tgpt is a Python package that enables seamless interaction with over 45 free LLM providers without requiring an API key. It also provides image generation capabilities. The name _python-tgpt_ draws inspiration from its parent project tgpt, which operates on Golang. Through this Python adaptation, users can effortlessly engage with a number of free LLMs available, fostering a smoother AI interaction experience.

github

: 95

aidermacs

Aidermacs is an AI pair programming tool for Emacs that integrates Aider, a powerful open-source AI pair programming tool. It provides top performance on the SWE Bench, support for multi-file edits, real-time file synchronization, and broad language support. Aidermacs delivers an Emacs-centric experience with features like intelligent model selection, flexible terminal backend support, smarter syntax highlighting, enhanced file management, and streamlined transient menus. It thrives on community involvement, encouraging contributions, issue reporting, idea sharing, and documentation improvement.

github

: 376

openedai-speech

OpenedAI Speech is a free, private text-to-speech server compatible with the OpenAI audio/speech API. It offers custom voice cloning and supports various models like tts-1 and tts-1-hd. Users can map their own piper voices and create custom cloned voices. The server provides multilingual support with XTTS voices and allows fixing incorrect sounds with regex. Recent changes include bug fixes, improved error handling, and updates for multilingual support. Installation can be done via Docker or manual setup, with usage instructions provided. Custom voices can be created using Piper or Coqui XTTS v2, with guidelines for preparing audio files. The tool is suitable for tasks like generating speech from text, creating custom voices, and multilingual text-to-speech applications.

github

: 243

HippoRAG

HippoRAG is a novel retrieval augmented generation (RAG) framework inspired by the neurobiology of human long-term memory that enables Large Language Models (LLMs) to continuously integrate knowledge across external documents. It provides RAG systems with capabilities that usually require a costly and high-latency iterative LLM pipeline for only a fraction of the computational cost. The tool facilitates setting up retrieval corpus, indexing, and retrieval processes for LLMs, offering flexibility in choosing different online LLM APIs or offline LLM deployments through LangChain integration. Users can run retrieval on pre-defined queries or integrate directly with the HippoRAG API. The tool also supports reproducibility of experiments and provides data, baselines, and hyperparameter tuning scripts for research purposes.

github

: 2.8k

well-architected-iac-analyzer

Well-Architected Infrastructure as Code (IaC) Analyzer is a project demonstrating how generative AI can evaluate infrastructure code for alignment with best practices. It features a modern web application allowing users to upload IaC documents, complete IaC projects, or architecture diagrams for assessment. The tool provides insights into infrastructure code alignment with AWS best practices, offers suggestions for improving cloud architecture designs, and can generate IaC templates from architecture diagrams. Users can analyze CloudFormation, Terraform, or AWS CDK templates, architecture diagrams in PNG or JPEG format, and complete IaC projects with supporting documents. Real-time analysis against Well-Architected best practices, integration with AWS Well-Architected Tool, and export of analysis results and recommendations are included.

github

: 196

agenticSeek

AgenticSeek is a voice-enabled AI assistant powered by DeepSeek R1 agents, offering a fully local alternative to cloud-based AI services. It allows users to interact with their filesystem, code in multiple languages, and perform various tasks autonomously. The tool is equipped with memory to remember user preferences and past conversations, and it can divide tasks among multiple agents for efficient execution. AgenticSeek prioritizes privacy by running entirely on the user's hardware without sending data to the cloud.

github

: 21.8k

gemini-srt-translator

Gemini SRT Translator is a tool that utilizes Google Generative AI to provide accurate and efficient translations for subtitle files. Users can customize translation settings, such as model name and batch size, and list available models from the Gemini API. The tool requires a free API key from Google AI Studio for setup and offers features like translating subtitles to a specified target language and resuming partial translations. Users can further customize translation settings with optional parameters like gemini_api_key2, output_file, start_line, model_name, batch_size, and more.

github

: 67

Construction-Hazard-Detection

Construction-Hazard-Detection is an AI-driven tool focused on improving safety at construction sites by utilizing the YOLOv8 model for object detection. The system identifies potential hazards like overhead heavy loads and steel pipes, providing real-time analysis and warnings. Users can configure the system via a YAML file and run it using Docker. The primary dataset used for training is the Construction Site Safety Image Dataset enriched with additional annotations. The system logs are accessible within the Docker container for debugging, and notifications are sent through the LINE messaging API when hazards are detected.

github

: 153

ChatOpsLLM

ChatOpsLLM is a project designed to empower chatbots with effortless DevOps capabilities. It provides an intuitive interface and streamlined workflows for managing and scaling language models. The project incorporates robust MLOps practices, including CI/CD pipelines with Jenkins and Ansible, monitoring with Prometheus and Grafana, and centralized logging with the ELK stack. Developers can find detailed documentation and instructions on the project's website.

github

: 87

For similar tasks

mcpdoc

github

: 148

pacha

Pacha is an AI tool designed for retrieving context for natural language queries using a SQL interface and Python programming environment. It is optimized for working with Hasura DDN for multi-source querying. Pacha is used in conjunction with language models to produce informed responses in AI applications, agents, and chatbots.

github

: 75

superagentx

SuperAgentX is a lightweight open-source AI framework designed for multi-agent applications with Artificial General Intelligence (AGI) capabilities. It offers goal-oriented multi-agents with retry mechanisms, easy deployment through WebSocket, RESTful API, and IO console interfaces, streamlined architecture with no major dependencies, contextual memory using SQL + Vector databases, flexible LLM configuration supporting various Gen AI models, and extendable handlers for integration with diverse APIs and data sources. It aims to accelerate the development of AGI by providing a powerful platform for building autonomous AI agents capable of executing complex tasks with minimal human intervention.

github

: 57

chatmemory

ChatMemory is a simple yet powerful long-term memory manager that facilitates communication between AI and users. It organizes conversation data into history, summary, and knowledge entities, enabling quick retrieval of context and generation of clear, concise answers. The tool leverages vector search on summaries/knowledge and detailed history to provide accurate responses. It balances speed and accuracy by using lightweight retrieval and fallback detailed search mechanisms, ensuring efficient memory management and response generation beyond mere data retrieval.

github

: 73

mem0-chrome-extension

Mem0 Chrome Extension is a tool that enhances AI interactions by providing a universal memory layer across various AI assistants. It allows users to seamlessly share context, automatically capture relevant information, and retrieve memories intelligently. The extension offers features like one-click sync with existing ChatGPT memories and a memory dashboard for easy management. Users can install the extension in Google Chrome, sign in with Google, and start using it with supported AI assistants. Mem0 is free to use with no usage limits or ads, and it prioritizes privacy and data security by sending messages to the Mem0 API for memory extraction and retrieval.

github

: 385

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

github

: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

github

: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

github

: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

github

: 668

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

github

: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

github

: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

github

: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

github

: 2.2k

mcpdoc

README:

MCP LLMS-TXT Documentation Server

Overview

llms-txt

Quickstart

Install uv

Choose an llms.txt file to use.

(Optional) Test the MCP server locally with your llms.txt file(s) of choice:

Connect to Cursor

Connect to Windsurf

Connect to Claude Desktop

Connect to Claude Code

Command-line Interface

Additional Options

Configuration Format

YAML Configuration Example (sample_config.yaml)

JSON Configuration Example (sample_config.json)

Programmatic Usage

For Tasks:

For Jobs:

Alternative AI tools for mcpdoc

Similar Open Source Tools

mcpdoc

cursor-tools

shortest

langserve

exif-photo-blog

code2prompt

bedrock-claude-chat

python-tgpt

aidermacs

openedai-speech

HippoRAG

well-architected-iac-analyzer

agenticSeek

gemini-srt-translator

Construction-Hazard-Detection

ChatOpsLLM

For similar tasks

mcpdoc

pacha

superagentx

chatmemory

mem0-chrome-extension

For similar jobs

sweep

teams-ai

ai-guide

classifai

chatbot-ui

BricksLLM

uAgents

griptape

Choose an `llms.txt` file to use.

(Optional) Test the MCP server locally with your `llms.txt` file(s) of choice: