mcp-memory-service
Universal MCP memory service with semantic search, multi-client support, and autonomous consolidation for Claude Desktop, VS Code, and 13+ AI applications
Stars: 724
The MCP Memory Service is a universal memory service designed for AI assistants, providing semantic memory search and persistent storage. It works with various AI applications and offers fast local search using SQLite-vec and global distribution through Cloudflare. The service supports intelligent memory management, universal compatibility with AI tools, flexible storage options, and is production-ready with cross-platform support and secure connections. Users can store and recall memories, search by tags, check system health, and configure the service for Claude Desktop integration and environment variables.
README:
Universal MCP memory service with OAuth 2.1 team collaboration and semantic memory search for AI assistants. Features Claude Code HTTP transport, zero-configuration authentication, and enterprise security. Works with Claude Desktop, VS Code, Cursor, Continue, and 13+ AI applications with SQLite-vec for fast local search and Cloudflare for global distribution.
Claude Code Team Collaboration (Zero Configuration):
# 1. Start OAuth-enabled server
export MCP_OAUTH_ENABLED=true
uv run memory server --http
# 2. Add HTTP transport to Claude Code
claude mcp add --transport http memory-service http://localhost:8000/mcp
# Done! Claude Code automatically handles OAuth registration and team collaboration
Complete Setup Guide: OAuth 2.1 Setup Guide
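For readers curious about what the automatic discovery and registration steps involve, here is a minimal Python sketch based on the two standards this project cites. RFC 8414 derives a well-known metadata URL from the issuer, and RFC 7591 defines the JSON body a client posts to register itself. The sketch assumes the issuer is the server's base URL (http://localhost:8000), which this README does not state; the client name and redirect URI are hypothetical. Claude Code performs these steps for you, so this is illustrative only.

```python
import json
from urllib.parse import urlsplit, urlunsplit

# Well-known path suffix defined by RFC 8414 for authorization server metadata.
WELL_KNOWN = "/.well-known/oauth-authorization-server"


def metadata_url(issuer: str) -> str:
    """Build the RFC 8414 metadata URL for an issuer.

    The well-known segment is inserted between the host and any issuer path.
    """
    parts = urlsplit(issuer)
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN + path, "", ""))


def registration_request(client_name: str, redirect_uris: list) -> dict:
    """Build an RFC 7591 client registration body for a public client."""
    return {
        "client_name": client_name,
        "redirect_uris": list(redirect_uris),
        "grant_types": ["authorization_code"],
        "response_types": ["code"],
        "token_endpoint_auth_method": "none",
    }


if __name__ == "__main__":
    # Assumption: the issuer is the base URL of the locally started server.
    print(metadata_url("http://localhost:8000"))
    # Hypothetical client name and redirect URI, for illustration only.
    body = registration_request("example-client", ["http://localhost:3000/callback"])
    print(json.dumps(body, indent=2))
```

A real client would fetch the metadata document first and read the registration endpoint from it rather than hardcoding any path.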
Universal Installer (Most Compatible):
# Clone and install with automatic platform detection
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
python install.py
Docker (Fastest):
# For MCP protocol (Claude Desktop)
docker-compose up -d
# For HTTP API + OAuth (Team Collaboration)
docker-compose -f docker-compose.http.yml up -d
Smithery (Claude Desktop):
# Auto-install for Claude Desktop
npx -y @smithery/cli install @doobidoo/mcp-memory-service --client claude
Updating from an older version? Scripts have been reorganized for better maintainability:
- Recommended: Use python -m mcp_memory_service.server in your Claude Desktop config (no path dependencies!)
- Alternative 1: Use uv run memory server with UV tooling
- Alternative 2: Update path from scripts/run_memory_server.py to scripts/server/run_memory_server.py
- Backward compatible: Old path still works with a migration notice
On your first run, you'll see some warnings that are completely normal:
- "WARNING: Failed to load from cache: No snapshots directory" - The service is checking for cached models (first-time setup)
- "WARNING: Using TRANSFORMERS_CACHE is deprecated" - Informational warning, doesn't affect functionality
- Model download in progress - The service automatically downloads a ~25MB embedding model (takes 1-2 minutes)
These warnings disappear after the first successful run. The service is working correctly! For details, see our First-Time Setup Guide.
sqlite-vec may not have pre-built wheels for Python 3.13 yet. If installation fails:
- The installer will automatically try multiple installation methods
- Consider using Python 3.12 for the smoothest experience: brew install python@3.12
- Alternative: Use ChromaDB backend with --storage-backend chromadb
- See Troubleshooting Guide for details
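The fallback logic above can be sketched as a small preflight check in Python. This is an illustration, not the installer's actual behaviour: it assumes the sqlite-vec package is importable under the module name sqlite_vec, and the helper names are hypothetical. The backend values are the ones used elsewhere in this README.

```python
import importlib.util
import sys


def sqlite_vec_available() -> bool:
    """Return True if a module named sqlite_vec can be found in this environment."""
    return importlib.util.find_spec("sqlite_vec") is not None


def choose_backend(vec_available: bool) -> str:
    """Prefer sqlite_vec when it is installed, otherwise fall back to chromadb."""
    return "sqlite_vec" if vec_available else "chromadb"


if __name__ == "__main__":
    available = sqlite_vec_available()
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version}, sqlite_vec importable: {available}")
    print(f"export MCP_MEMORY_STORAGE_BACKEND={choose_backend(available)}")
```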
macOS users may encounter enable_load_extension errors with sqlite-vec:
- System Python on macOS lacks SQLite extension support by default
- Solution: Use Homebrew Python: brew install python && rehash
- Alternative: Use pyenv: PYTHON_CONFIGURE_OPTS='--enable-loadable-sqlite-extensions' pyenv install 3.12.0
- Fallback: Use ChromaDB backend: export MCP_MEMORY_STORAGE_BACKEND=chromadb
- See Troubleshooting Guide for details
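To find out whether a given Python interpreter is affected before installing, you can probe the standard sqlite3 module directly. The sketch below is a minimal check and not part of this project. It relies on the fact that Python builds without loadable-extension support omit the enable_load_extension method from connections; the function name is hypothetical.

```python
import sqlite3


def supports_sqlite_extensions() -> bool:
    """Return True if this interpreter's sqlite3 module can load extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # Builds without extension support do not expose this method at all.
        if not hasattr(conn, "enable_load_extension"):
            return False
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.Error):
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    if supports_sqlite_extensions():
        print("SQLite extension loading is available.")
    else:
        print("No extension support: use Homebrew/pyenv Python or the ChromaDB backend.")
```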
Visit our comprehensive Wiki for detailed guides:
- OAuth 2.1 Setup Guide - NEW! Complete OAuth 2.1 Dynamic Client Registration guide
- Integration Guide - Claude Desktop, Claude Code HTTP transport, VS Code, and more
- Advanced Configuration - Updated! OAuth security, enterprise features
- Installation Guide - Complete installation for all platforms and use cases
- Platform Setup Guide - Windows, macOS, and Linux optimizations
- Performance Optimization - Speed up queries, optimize resources, scaling
- Development Reference - Claude Code hooks, API reference, debugging
- Troubleshooting Guide - Updated! OAuth troubleshooting + common issues
- FAQ - Frequently asked questions
- Examples - Practical code examples and workflows
- OAuth 2.1 Dynamic Client Registration - RFC 7591 & RFC 8414 compliant
- Claude Code HTTP Transport - Zero-configuration team collaboration
- JWT Authentication - Enterprise-grade security with scope validation
- Auto-Discovery Endpoints - Seamless client registration and authorization
- Multi-Auth Support - OAuth + API keys + optional anonymous access
- Semantic search with vector embeddings
- Natural language time queries ("yesterday", "last week")
- Tag-based organization with smart categorization
- Memory consolidation with dream-inspired algorithms
- Claude Desktop - Native MCP integration
- Claude Code - HTTP transport + Memory-aware development with hooks
- VS Code, Cursor, Continue - IDE extensions
- 13+ AI applications - REST API compatibility
- SQLite-vec - Fast local storage (recommended)
- ChromaDB - Multi-client collaboration
- Cloudflare - Global edge distribution
- Automatic backups and synchronization
- Cross-platform - Windows, macOS, Linux
- Service installation - Auto-start background operation
- HTTPS/SSL - Secure connections with OAuth 2.1
- Docker support - Easy deployment with team collaboration
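As an illustration of the natural language time queries listed above, the following Python sketch resolves a few phrases into half-open datetime ranges and filters records by timestamp. It is not the service's parser. The phrase set, the definition of "last week" as the seven days before today's midnight, and all function names are assumptions made for the example.

```python
from datetime import datetime, timedelta


def resolve_time_phrase(phrase: str, now: datetime) -> tuple:
    """Map a simple phrase to a (start, end) range, with end exclusive."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    key = phrase.strip().lower()
    if key == "today":
        return midnight, now
    if key == "yesterday":
        return midnight - timedelta(days=1), midnight
    if key == "last week":
        # Assumption: the seven full days before today, not a calendar week.
        return midnight - timedelta(days=7), midnight
    raise ValueError(f"unsupported time phrase: {phrase!r}")


def filter_by_phrase(memories: list, phrase: str, now: datetime) -> list:
    """Keep (timestamp, text) pairs whose timestamp falls inside the range."""
    start, end = resolve_time_phrase(phrase, now)
    return [m for m in memories if start <= m[0] < end]


if __name__ == "__main__":
    now = datetime(2024, 5, 15, 10, 30)
    memories = [
        (datetime(2024, 5, 14, 9, 0), "Fixed race condition in authentication"),
        (datetime(2024, 5, 1, 12, 0), "Older note"),
    ]
    for _, text in filter_by_phrase(memories, "yesterday", now):
        print(text)
```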
# Start OAuth-enabled server for team collaboration
export MCP_OAUTH_ENABLED=true
uv run memory server --http
# Claude Code team members connect via HTTP transport
claude mcp add --transport http memory-service http://your-server:8000/mcp
# Automatic OAuth discovery, registration, and authentication
# Store a memory
uv run memory store "Fixed race condition in authentication by adding mutex locks"
# Search for relevant memories
uv run memory recall "authentication race condition"
# Search by tags
uv run memory search --tags python debugging
# Check system health (shows OAuth status)
uv run memory health
Recommended approach - Add to your Claude Desktop config (~/.claude/config.json):
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["-m", "mcp_memory_service.server"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
      }
    }
  }
}
Alternative approaches:
// Option 1: UV tooling (if using UV)
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp-memory-service", "run", "memory", "server"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
      }
    }
  }
}
// Option 2: Direct script path (v6.17.0+)
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["/path/to/mcp-memory-service/scripts/server/run_memory_server.py"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
      }
    }
  }
}
# Storage backend (sqlite_vec recommended)
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
# Enable HTTP API
export MCP_HTTP_ENABLED=true
export MCP_HTTP_PORT=8000
# Security
export MCP_API_KEY="your-secure-key"
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   AI Clients    │    │   MCP Memory    │    │ Storage Backend │
│                 │    │  Service v7.0   │    │                 │
│ • Claude Desktop│◄──►│ • MCP Protocol  │◄──►│ • SQLite-vec    │
│ • Claude Code   │    │ • HTTP Transport│    │ • ChromaDB      │
│   (HTTP/OAuth)  │    │ • OAuth 2.1 Auth│    │ • Cloudflare    │
│ • VS Code       │    │ • Memory Store  │    │ • Hybrid        │
│ • Cursor        │    │ • Semantic      │    │                 │
│ • 13+ AI Apps   │    │   Search        │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
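The environment variables shown earlier (MCP_MEMORY_STORAGE_BACKEND, MCP_HTTP_ENABLED, MCP_HTTP_PORT, MCP_API_KEY) can be read into a small typed settings object when scripting against the service. The Python sketch below is a hedged illustration, not the service's own configuration code. The Settings class, the load_settings helper, the default values, and the accepted truthy strings are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    storage_backend: str
    http_enabled: bool
    http_port: int
    api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables from a mapping (defaults are assumptions)."""
    enabled = env.get("MCP_HTTP_ENABLED", "false").strip().lower()
    return Settings(
        storage_backend=env.get("MCP_MEMORY_STORAGE_BACKEND", "sqlite_vec"),
        http_enabled=enabled in {"1", "true", "yes"},
        http_port=int(env.get("MCP_HTTP_PORT", "8000")),
        api_key=env.get("MCP_API_KEY") or None,
    )


if __name__ == "__main__":
    settings = load_settings()
    # Avoid printing the key itself; only report whether one is set.
    print(settings.storage_backend, settings.http_enabled, settings.http_port,
          "api key set" if settings.api_key else "no api key")
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests without touching the real environment.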
mcp-memory-service/
├── src/mcp_memory_service/    # Core application
│   ├── models/                # Data models
│   ├── storage/               # Storage backends
│   ├── web/                   # HTTP API & dashboard
│   └── server.py              # MCP server
├── scripts/                   # Utilities & installation
├── tests/                     # Test suite
└── tools/docker/              # Docker configuration
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request
See CONTRIBUTING.md for detailed guidelines.
- Documentation: Wiki - Comprehensive guides
- Bug Reports: GitHub Issues
- Discussions: GitHub Discussions
- Troubleshooting: Troubleshooting Guide
- Configuration Validator: Run python scripts/validation/validate_configuration_complete.py to check your setup
- Backend Sync Tools: See scripts/README.md for Cloudflare ↔ SQLite sync
Real-world metrics from active deployments:
- 750+ memories stored and actively used across teams
- <500ms response time for semantic search (local & HTTP transport)
- 65% token reduction in Claude Code sessions with OAuth collaboration
- 96.7% faster context setup (15min → 30sec)
- 100% knowledge retention across sessions and team members
- Zero-configuration OAuth setup success rate: 98.5%
- Verified MCP Server
- Featured AI Tool
- Production-tested across 13+ AI applications
- Community-driven with real-world feedback and improvements
Apache License 2.0 - see LICENSE for details.
Ready to supercharge your AI workflow?
Start with our Installation Guide or explore the Wiki for comprehensive documentation.
Transform your AI conversations into persistent, searchable knowledge that grows with you.
Alternative AI tools for mcp-memory-service
Similar Open Source Tools
GitVizz
GitVizz is an AI-powered repository analysis tool that helps developers understand and navigate codebases quickly. It transforms complex code structures into interactive documentation, dependency graphs, and intelligent conversations. With features like interactive dependency graphs, AI-powered code conversations, advanced code visualization, and automatic documentation generation, GitVizz offers instant understanding and insights for any repository. The tool is built with modern technologies like Next.js, FastAPI, and OpenAI, making it scalable and efficient for analyzing large codebases. GitVizz also provides a standalone Python library for core code analysis and dependency graph generation, offering multi-language parsing, AST analysis, dependency graphs, visualizations, and extensibility for custom applications.
Lynkr
Lynkr is a self-hosted proxy server that unlocks various AI coding tools like Claude Code CLI, Cursor IDE, and Codex Cli. It supports multiple LLM providers such as Databricks, AWS Bedrock, OpenRouter, Ollama, llama.cpp, Azure OpenAI, Azure Anthropic, OpenAI, and LM Studio. Lynkr offers cost reduction, local/private execution, remote or local connectivity, zero code changes, and enterprise-ready features. It is perfect for developers needing provider flexibility, cost control, self-hosted AI with observability, local model execution, and cost reduction strategies.
pluely
Pluely is a versatile and user-friendly tool for managing tasks and projects. It provides a simple interface for creating, organizing, and tracking tasks, making it easy to stay on top of your work. With features like task prioritization, due date reminders, and collaboration options, Pluely helps individuals and teams streamline their workflow and boost productivity. Whether you're a student juggling assignments, a professional managing multiple projects, or a team coordinating tasks, Pluely is the perfect solution to keep you organized and efficient.
aegra
Aegra is a self-hosted AI agent backend platform that provides LangGraph power without vendor lock-in. Built with FastAPI + PostgreSQL, it offers complete control over agent orchestration for teams looking to escape vendor lock-in, meet data sovereignty requirements, enable custom deployments, and optimize costs. Aegra is Agent Protocol compliant and perfect for teams seeking a free, self-hosted alternative to LangGraph Platform with zero lock-in, full control, and compatibility with existing LangGraph Client SDK.
Call
Call is an open-source AI-native alternative to Google Meet and Zoom, offering video calling, team collaboration, contact management, meeting scheduling, AI-powered features, security, and privacy. It is cross-platform, web-based, mobile responsive, and supports offline capabilities. The tech stack includes Next.js, TypeScript, Tailwind CSS, Mediasoup-SFU, React Query, Zustand, Hono, PostgreSQL, Drizzle ORM, Better Auth, Turborepo, Docker, Vercel, and Rate Limiting.
xln
XLN (Cross-Local Network) is a platform that enables instant off-chain settlement with on-chain finality. It combines Byzantine consensus, Bloomberg Terminal functionalities, and VR capabilities to run economic simulations in the browser without the need for a backend. The architecture includes layers for jurisdictions, entities, and accounts, with features like Solidity contracts, BFT consensus, and bilateral channels. The tool offers a panel system similar to Bloomberg Terminal for workspace organization and visualization, along with support for offline blockchain simulations in the browser and VR/Quest compatibility.
botserver
General Bots is a self-hosted AI automation platform and LLM conversational platform focused on convention over configuration and code-less approaches. It serves as the core API server handling LLM orchestration, business logic, database operations, and multi-channel communication. The platform offers features like multi-vendor LLM API, MCP + LLM Tools Generation, Semantic Caching, Web Automation Engine, Enterprise Data Connectors, and Git-like Version Control. It enforces a ZERO TOLERANCE POLICY for code quality and security, with strict guidelines for error handling, performance optimization, and code patterns. The project structure includes modules for core functionalities like Rhai BASIC interpreter, security, shared types, tasks, auto task system, file operations, learning system, and LLM assistance.
Callytics
Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers. By processing both the audio and text of each call, it provides insights such as sentiment analysis, topic detection, conflict detection, profanity word detection, and summary. These cutting-edge techniques help businesses optimize customer interactions, identify areas for improvement, and enhance overall service quality. When an audio file is placed in the .data/input directory, the entire pipeline automatically starts running, and the resulting data is inserted into the database. This is only a v1.1.0 version; many new features will be added, models will be fine-tuned or trained from scratch, and various optimization efforts will be applied.
DeepMCPAgent
DeepMCPAgent is a model-agnostic tool that enables the creation of LangChain/LangGraph agents powered by MCP tools over HTTP/SSE. It allows for dynamic discovery of tools, connection to remote MCP servers, and integration with any LangChain chat model instance. The tool provides a deep agent loop for enhanced functionality and supports typed tool arguments for validated calls. DeepMCPAgent emphasizes the importance of MCP-first approach, where agents dynamically discover and call tools rather than hardcoding them.
DeepTutor
DeepTutor is an AI-powered personalized learning assistant that offers a suite of modules for massive document knowledge Q&A, interactive learning visualization, knowledge reinforcement with practice exercise generation, deep research, and idea generation. The tool supports multi-agent collaboration, dynamic topic queues, and structured outputs for various tasks. It provides a unified system entry for activity tracking, knowledge base management, and system status monitoring. DeepTutor is designed to streamline learning and research processes by leveraging AI technologies and interactive features.
AutoAgents
AutoAgents is a cutting-edge multi-agent framework built in Rust that enables the creation of intelligent, autonomous agents powered by Large Language Models (LLMs) and Ractor. Designed for performance, safety, and scalability, AutoAgents provides a robust foundation for building complex AI systems that can reason, act, and collaborate. With AutoAgents you can create Cloud Native Agents, Edge Native Agents, and Hybrid Models. It is extensible enough that other ML models can be used to create complex pipelines using the Actor Framework.
timeline-studio
Timeline Studio is a next-generation professional video editor with AI integration that automates content creation for social media. It combines the power of desktop applications with the convenience of web interfaces. With 257 AI tools, GPU acceleration, plugin system, multi-language interface, and local processing, Timeline Studio offers complete video production automation. Users can create videos for various social media platforms like TikTok, YouTube, Vimeo, Telegram, and Instagram with optimized versions. The tool saves time, understands trends, provides professional quality, and allows for easy feature extension through plugins. Timeline Studio is open source, transparent, and offers significant time savings and quality improvements for video editing tasks.
ChordMiniApp
ChordMini is an advanced music analysis platform with AI-powered chord recognition, beat detection, and synchronized lyrics. It features a clean and intuitive interface for YouTube search, chord progression visualization, interactive guitar diagrams with accurate fingering patterns, lead sheet with AI assistant for synchronized lyrics transcription, and various add-on features like Roman Numeral Analysis, Key Modulation Signals, Simplified Chord Notation, and Enhanced Chord Correction. The tool requires Node.js, Python 3.9+, and a Firebase account for setup. It offers a hybrid backend architecture for local development and production deployments, with features like beat detection, chord recognition, lyrics processing, rate limiting, and audio processing supporting MP3, WAV, and FLAC formats. ChordMini provides a comprehensive music analysis workflow from user input to visualization, including dual input support, environment-aware processing, intelligent caching, advanced ML pipeline, and rich visualization options.
chatgpt-webui
ChatGPT WebUI is a user-friendly web graphical interface for various LLMs like ChatGPT, providing simplified features such as core ChatGPT conversation and document retrieval dialogues. It has been optimized for better RAG retrieval accuracy and supports various search engines. Users can deploy local language models easily and interact with different LLMs like GPT-4, Azure OpenAI, and more. The tool offers powerful functionalities like GPT4 API configuration, system prompt setup for role-playing, and basic conversation features. It also provides a history of conversations, customization options, and a seamless user experience with themes, dark mode, and PWA installation support.
For similar tasks
sonarqube-mcp-server
The SonarQube MCP Server is a Model Context Protocol (MCP) server that enables seamless integration with SonarQube Server or Cloud for code quality and security. It supports the analysis of code snippets directly within the agent context. The server provides various tools for analyzing code, managing issues, accessing metrics, and interacting with SonarQube projects. It also supports advanced features like dependency risk analysis, enterprise portfolio management, and system health checks. The server can be configured for different transport modes, proxy settings, and custom certificates. Telemetry data collection can be disabled if needed.
mem0
Mem0 is a tool that provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI experiences across applications. It offers persistent memory for users, sessions, and agents, self-improving personalization, a simple API for easy integration, and cross-platform consistency. Users can store memories, retrieve memories, search for related memories, update memories, get the history of a memory, and delete memories using Mem0. It is designed to enhance AI experiences by enabling long-term memory storage and retrieval.
redcache-ai
RedCache-ai is a memory framework designed for Large Language Models and Agents. It provides a dynamic memory framework for developers to build various applications, from AI-powered dating apps to healthcare diagnostics platforms. Users can store, retrieve, search, update, and delete memories using RedCache-ai. The tool also supports integration with OpenAI for enhancing memories. RedCache-ai aims to expand its functionality by integrating with more LLM providers, adding support for AI Agents, and providing a hosted version.
For similar jobs
promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.
deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.
MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aim to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our overview of the field, affectionately titled "Everything I know about machine learning and camera traps".
leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.
llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.
carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.
AI-YinMei
AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.