
persistent-ai-memory
A persistent local memory for AI, LLMs, or Copilot in VS Code.
Stars: 138

Persistent AI Memory System is a comprehensive tool that offers persistent, searchable storage for AI assistants. It includes features like conversation tracking, MCP tool call logging, and intelligent scheduling. The system supports multiple databases, provides enhanced memory management, and offers various tools for memory operations, schedule management, and system health checks. It also integrates with various platforms like LM Studio, VS Code, Koboldcpp, Ollama, and more. The system is designed to be modular, platform-agnostic, and scalable, allowing users to handle large conversation histories efficiently.
README:
Community Call to Action: Have you made improvements or additions to this system? We want to include your work! Every contributor will be properly credited in the final product. Whether it's bug fixes, new features, or documentation improvements - your contributions matter and will help shape the future of AI memory systems. Submit a pull request today!
GITHUB LINK - https://github.com/savantskie/persistent-ai-memory.git
Recent Changes (2025-09-04)
- Enhanced Embedding System: Implemented an intelligent embedding service with primary/fallback providers
  - Preservation Strategy: All existing embeddings (15,000+) are automatically preserved
  - LM Studio Primary: High-quality embeddings for new content via LM Studio
  - Ollama Fallback: Fast local embeddings when LM Studio is unavailable
  - Real-time Provider Monitoring: Automatic availability detection and graceful fallback
- Standardized Configuration: Added `embedding_config.json` for easy provider management
- Improved Organization: Moved all test files into the proper `tests/` folder structure
- Enhanced Core Systems: Updated `ai_memory_core.py` with intelligent provider selection
- Backward Compatibility: All existing functionality preserved while adding new capabilities
- Performance Optimization: Better semantic search quality with preserved data integrity
Previous Changes (2025-09-01):
- Updated `ai-memory-mcp_server.py` to include enhanced tool registration logic for `update_memory` and other tools.
- Improved MCP server functionality to dynamically detect and register tools based on client context.
- Added robust error handling and logging for tool execution.
- Enhanced automatic maintenance tasks with centralized scheduling and improved database optimization routines.
A comprehensive AI memory system that provides persistent, searchable storage for AI assistants with conversation tracking, MCP tool call logging, and intelligent scheduling.
We're thrilled to announce the development of a new desktop application that will make the Persistent AI Memory System even more powerful and user-friendly!
- Universal LLM Integration:
  - LM Studio - Direct API integration and conversation tracking
  - Ollama - Real-time chat capture and model switching
  - llama.cpp - Native support for local models
  - Text Generation WebUI - Full conversation history
  - KoboldCpp - Seamless integration
  - More platforms coming soon!
- Enhanced GUI Features:
  - Real-time conversation visualization
  - Advanced memory search interface
  - Interactive context management
  - Visual relationship mapping
  - Customizable dashboard
  - Dark/Light theme support
- Extended Capabilities:
  - Multiple MCP protocol support
  - Cross-platform conversation sync
  - Enhanced embedding options
  - Visual memory navigation
  - Bulk import/export tools
  - Custom plugin support
Stay tuned for the beta release! Follow this repository for updates.
Multiple Installation Options Available: We've created 4 different ways to install this system - from one-command installation to manual setup - so you can get started immediately regardless of your platform or preference!
New to this from Reddit? Check out the Reddit Quick Start Guide for a super simple setup!
```bash
# Option 1: One-command install (Linux/macOS)
curl -sSL https://raw.githubusercontent.com/savantskie/persistent-ai-memory/main/install.sh | bash

# Option 2: One-command install (Windows)
curl -sSL https://raw.githubusercontent.com/savantskie/persistent-ai-memory/main/install.bat -o install.bat && install.bat

# Option 3: Manual setup
git clone https://github.com/savantskie/persistent-ai-memory.git
cd persistent-ai-memory
pip install -r requirements.txt
pip install -e .

# Option 4: Install directly with pip
pip install git+https://github.com/savantskie/persistent-ai-memory.git
```
After installation, verify everything is working:
```bash
python tests/test_health_check.py
```
These tools are available in all environments (LM Studio, VS Code, etc.):
- Memory Management:
  - `search_memories` - Search through stored memories using semantic similarity
  - `store_conversation` - Store conversation messages
  - `create_memory` - Create a new curated memory entry
  - `update_memory` - Update an existing memory entry
  - `get_recent_context` - Get recent conversation context
- Schedule Management:
  - `create_appointment` - Create calendar appointments
  - `create_reminder` - Set reminders with priorities
- System Tools:
  - `get_system_health` - Check system status and database health
  - `get_tool_usage_summary` - Get AI tool usage statistics
  - `reflect_on_tool_usage` - AI self-reflection on tool patterns
  - `get_ai_insights` - Get AI's insights and patterns
These tools are only available in specific development environments:
- `save_development_session` - Save VS Code development context
- `store_project_insight` - Store development insights
- `search_project_history` - Search project development history
- `link_code_context` - Link conversations to specific code
- `get_project_continuity` - Get context for continuing development work
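For reference, invoking any of these tools over MCP follows the protocol's standard `tools/call` request shape (the query and limit shown here are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_memories",
    "arguments": { "query": "project deadlines", "limit": 5 }
  }
}
```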
- Enhanced Memory System:
  - SQLite-based persistent storage across all databases
  - Registry-based extensible import system
  - Advanced duplicate detection and migration logic in all major database classes
  - Centralized, generic maintenance and deduplication routines
  - Robust, explicit startup (no auto background tasks)
  - Database-backed deduplication across all sources
  - Incremental imports (only new messages)
  - Enhanced error handling with detailed logging
  - Automatic system maintenance and optimization
  - AI-driven self-reflection and pattern analysis
  - Cross-database relationship tracking
  - Smart memory pruning and archival
- Dedicated Chat Format Support:
  - Independent parsers for each chat GUI
  - No merged/refactored import logic
  - Easy addition of new chat formats (see the sketch after this list)
  - Format-specific metadata preservation
  - Source-aware deduplication
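Because the import system is registry-based with independent parsers, adding a new format should amount to registering a parser. A hypothetical sketch - `register_chat_format` and the message shape are assumptions for illustration, not the actual interface:

```python
import json
from ai_memory_core import PersistentAIMemorySystem

def parse_mychat_export(path: str) -> list[dict]:
    """Hypothetical parser: normalize one MyChat log file into message dicts."""
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    return [
        {
            "role": entry["speaker"],
            "content": entry["text"],
            "timestamp": entry.get("time"),
            "source": "mychat",  # preserved for source-aware deduplication
        }
        for entry in entries
    ]

memory = PersistentAIMemorySystem()
# Hypothetical registration call; the real registry API may differ
memory.register_chat_format("mychat", parse_mychat_export)
```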
- Core Features:
  - Vector Search using LM Studio embeddings
  - Real-time conversation monitoring
  - MCP server with tool call logging
  - Advanced AI self-reflection system:
    - Usage pattern detection and analysis
    - Automated performance optimization
    - Tool effectiveness tracking
    - Learning from past interactions
    - Continuous system improvement
  - Multi-platform compatibility
  - Zero configuration needed
  - Full feature parity with Friday's main repo
- Platform Support:
  - LM Studio integration
  - VS Code & GitHub Copilot
  - Koboldcpp compatibility
  - Ollama chat tracking
  - Cross-platform (Windows/Linux/macOS)
The system supports multiple embedding providers with automatic fallback for optimal performance:
- Ollama (Default): Uses `qwen2.5:1.5b` for fast, lightweight embeddings
- LM Studio: Uses `text-embedding-nomic-embed-text-v1.5` for quality embeddings
- OpenAI: Uses `text-embedding-3-small` for high-quality cloud embeddings
- Custom: Support for custom embedding servers
Edit `embedding_config.json` to customize your embedding setup:
```json
{
  "embedding_configuration": {
    "primary": {
      "provider": "ollama",
      "model": "qwen2.5:1.5b",
      "base_url": "http://localhost:11434"
    },
    "fallback": {
      "provider": "lm_studio",
      "model": "text-embedding-nomic-embed-text-v1.5",
      "base_url": "http://localhost:1234"
    }
  }
}
```
- For Ollama (Recommended): `ollama pull qwen2.5:1.5b`
- For LM Studio: Load an embedding model in LM Studio
- For OpenAI: Add your API key to the config file
- For Custom: Configure your server URL and model name
The system will automatically try the primary provider first, then fallback to the secondary if needed.
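A minimal sketch of how this primary-then-fallback behavior can work, assuming both providers expose OpenAI-compatible `/v1/embeddings` endpoints (the helper below is illustrative, not the system's actual internals):

```python
import asyncio
import aiohttp

# Providers in priority order, mirroring embedding_config.json
PROVIDERS = [
    {"model": "qwen2.5:1.5b", "base_url": "http://localhost:11434"},  # primary: Ollama
    {"model": "text-embedding-nomic-embed-text-v1.5",
     "base_url": "http://localhost:1234"},                            # fallback: LM Studio
]

async def get_embedding(text: str) -> list[float]:
    """Try each provider in order; return the first successful embedding."""
    async with aiohttp.ClientSession() as session:
        for provider in PROVIDERS:
            try:
                async with session.post(
                    f"{provider['base_url']}/v1/embeddings",
                    json={"model": provider["model"], "input": text},
                    timeout=aiohttp.ClientTimeout(total=10),
                ) as resp:
                    resp.raise_for_status()
                    data = await resp.json()
                    return data["data"][0]["embedding"]
            except Exception:
                continue  # provider down or erroring; fall through to the next one
    raise RuntimeError("No embedding provider available")

if __name__ == "__main__":
    print(len(asyncio.run(get_embedding("hello world"))))
```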
```python
import asyncio
from ai_memory_core import PersistentAIMemorySystem

async def main():
    # Initialize the memory system
    memory = PersistentAIMemorySystem()

    # Store a memory
    await memory.store_memory("I learned about Python async programming today")

    # Search memories
    results = await memory.search_memories("Python programming")
    print(f"Found {len(results)} related memories")

    # Store conversation
    await memory.store_conversation("user", "What is async programming?")
    await memory.store_conversation("assistant", "Async programming allows...")

if __name__ == "__main__":
    asyncio.run(main())
```
```bash
# Run as MCP server
python ai_memory_core.py
```
```python
# Monitor conversation files (like ChatGPT exports)
from ai_memory_core import PersistentAIMemorySystem

memory = PersistentAIMemorySystem()
memory.start_conversation_monitoring("/path/to/conversation/files")
```
The system includes 5 specialized databases with enhanced cross-source integration:
- Conversations:
  - Multi-source chat history with embeddings
  - Registry-based extensible import system
  - Independent parsers per chat format
  - Database-backed deduplication
  - Source tracking and sync status
  - Cross-conversation relationships
  - Incremental import tracking
  - Comprehensive metadata per source
- AI Memories:
  - Long-term persistent AI memories
  - Cross-source knowledge synthesis
  - Relationship tracking between memories
- Schedule:
  - Time-based events and reminders
  - Cross-platform calendar integration
  - Smart scheduling with context
- VS Code Projects:
  - Project context and file tracking
  - Development conversation tracking
  - Code change history integration
  - Context-aware project insights
- MCP Tool Calls:
  - Model Context Protocol interaction logging
  - Tool usage analytics
  - Self-reflection capabilities
  - Performance monitoring
The system works with zero configuration but can be customized:
```python
memory = PersistentAIMemorySystem(
    db_path="custom_memory.db",
    embedding_service_url="http://localhost:1234/v1/embeddings"
)
```
The system now includes automatic and centralized maintenance features:
- Centralized Maintenance & Deduplication:
  - Generic, registry-based maintenance routines
  - Advanced duplicate detection and migration logic for all database classes
  - No format-specific or auto-startup tasks; all maintenance is explicit and robust (see the sketch after this list)
- Database Optimization:
  - Automatic vacuum and reindex
  - Smart memory pruning
  - Performance monitoring
  - Index optimization
- Error Management:
  - Comprehensive error logging
  - Automatic recovery procedures
  - Failed operation retry
  - Data consistency checks
- AI Self-Reflection:
  - Tool usage pattern analysis
  - Performance optimization suggestions
  - Automated system improvements
  - Usage statistics and insights
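Since maintenance is explicit rather than automatic, it is invoked on demand. A hypothetical sketch - the `run_maintenance` entry point is an assumed name, not a documented method:

```python
import asyncio
from ai_memory_core import PersistentAIMemorySystem

async def nightly_maintenance():
    memory = PersistentAIMemorySystem()
    # Hypothetical entry point: trigger the centralized, registry-based
    # maintenance routines (vacuum/reindex, deduplication, pruning) explicitly
    report = await memory.run_maintenance()
    print(report)

if __name__ == "__main__":
    asyncio.run(nightly_maintenance())
```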
Check the `examples/` directory for:
- Basic memory operations
- Conversation tracking
- MCP server setup
- Vector search demonstrations
- Custom chat format integration
- Deduplication system usage
- Registry-based importing
- Source tracking setup
- Koboldcpp Integration - Complete setup guide for Koboldcpp compatibility
- LM Studio - Built-in support for embeddings and conversation capture
- VS Code - MCP server integration for development workflows
- SillyTavern - MCP server support with character-specific memory tools
- Ollama - Compatible through file monitoring and HTTP API approaches
- File Monitoring - Automatic conversation capture from chat logs
- HTTP API - Real-time memory access via REST endpoints (see the sketch after this list)
- MCP Protocol - Standardized tool interface for compatible platforms
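For the HTTP API route above, access would look roughly like this - the endpoint path, port, and payload are illustrative assumptions; consult the repository for the actual REST interface:

```bash
# Hypothetical REST call to search stored memories
curl -X POST http://localhost:8000/search_memories \
  -H "Content-Type: application/json" \
  -d '{"query": "Python programming", "limit": 5}'
```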
The system now provides comprehensive cross-source memory management:
- Source Tracking:
  - Automatic source detection and monitoring
  - Per-source metadata and sync status (illustrated after this list)
  - Error tracking and recovery
  - Active source health monitoring
- Relationship Management:
  - Cross-conversation linking
  - Context preservation across platforms
  - Conversation continuation tracking
  - Reference and fork management
- Supported Sources:
  - VS Code/GitHub Copilot
  - ChatGPT desktop app
  - Claude/Anthropic
  - Character.ai
  - SillyTavern (file monitoring + MCP server)
  - text-generation-webui
  - Ollama
  - Generic text/markdown formats
  - Custom source support via plugins
- Sync Features:
  - Real-time sync status tracking
  - Source-specific metadata preservation
  - Robust deduplication across sources
  - Failure recovery and retry logic
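As an illustration, a per-source tracking record implied by these features might look like this (field names are assumptions for illustration, not the actual schema):

```json
{
  "source": "lm_studio",
  "path": "~/.lmstudio/conversations",
  "sync_status": "active",
  "last_synced": "2025-09-04T12:00:00Z",
  "messages_imported": 15243,
  "last_error": null
}
```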
Run the complete test suite:
```bash
python tests/test_health_check.py
python tests/test_memory_operations.py
python tests/test_conversation_tracking.py
python tests/test_mcp_integration.py
```
- `store_memory(content, metadata=None)` - Store a persistent memory
- `search_memories(query, limit=10)` - Semantic search of memories
- `list_recent_memories(limit=10)` - Get recent memories

- `store_conversation(role, content, metadata=None)` - Store conversation turn
- `search_conversations(query, limit=10)` - Search conversation history
- `get_conversation_history(limit=100)` - Get recent conversations

- `log_tool_call(tool_name, arguments, result, metadata=None)` - Log MCP tool usage
- `get_tool_call_history(tool_name=None, limit=100)` - Get tool usage history
- `reflect_on_tool_usage()` - AI self-reflection on tool patterns

- `get_system_health()` - Check system status and database health
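A short usage sketch combining a few of these calls, assuming the signatures above and the async style shown in the quick start (the argument and result values are illustrative):

```python
import asyncio
from ai_memory_core import PersistentAIMemorySystem

async def demo():
    memory = PersistentAIMemorySystem()

    # Record a tool invocation, then check overall system health
    await memory.log_tool_call(
        "search_memories",
        arguments={"query": "async programming", "limit": 5},
        result={"matches": 3},
    )
    health = await memory.get_system_health()
    print(health)

if __name__ == "__main__":
    asyncio.run(demo())
```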
```bash
# Development setup
git clone https://github.com/savantskie/persistent-ai-memory.git
cd persistent-ai-memory
pip install -e ".[dev]"
pytest tests/
```
We welcome contributions! This system is designed to be:
- Modular: Easy to extend with new memory types
- Platform-agnostic: Works with any AI assistant that supports MCP
- Scalable: Handles large conversation histories efficiently
- [ ] Semantic Tagging Assistant - AI-powered memory categorization
- [ ] Memory Summarization - Automatic TL;DR for long conversations
- [ ] Deferred Retry Queue - Resilient file import with retry logic
- [ ] Memory Reflection Engine - Meta-insights from memory patterns
- [ ] Export/Import Tools - Backup and migration utilities
Recently Added Platforms (Based on Reddit Community Feedback & Local Storage Verification):
- ✅ SillyTavern - AI character chat interface with conversation logging
- ✅ Gemini CLI - Google's Gemini command line interface support
- ✅ Open WebUI - Local web-based LLM interface (multiple install locations)
- ❌ ChatGPT & Claude Desktop - Removed after verification (cloud-only, no local storage)
Total Platform Support: 9 Chat Platforms (Based on Verified Local Storage)
- ✅ LM Studio - Local conversations in JSON format
- ✅ Ollama - SQLite database with chat history
- ✅ VS Code Copilot - Development conversation tracking
- ✅ Open WebUI - SQLite database with conversation storage
- ✅ Text Generation WebUI - Local chat logs and history
- ✅ SillyTavern - AI character chat interface with conversation logging
- ✅ Gemini CLI - Google's Gemini command line interface support
- ✅ Jan AI - Local AI assistant with conversation storage
- ✅ Perplexity - Local conversation tracking (where applicable)
❌ Cloud-Only Applications (No Local Storage - Removed After Verification):
- ❌ ChatGPT Desktop - Cloud-only, no local conversation storage
- ❌ Claude Desktop - Cloud-only, no local conversation files
- ❌ Perplexity Web - Cloud-based, no local storage
Upcoming Community Requests:
- [ ] GraphDB Integration - Graph database support for relationship mapping (community requested)
- [ ] Discord Bot Integration - Chat logging for Discord AI bots
- [ ] Telegram Bot Support - Conversation tracking for Telegram bots
- [ ] API Standardization - Universal chat format for easier platform integration
Have a platform request? Open an issue or submit a PR - all contributors get credited!
MIT License - feel free to use this in your own AI projects!
This project is the result of a collaborative effort between humans and AI assistants:
- @savantskie - Project vision, architecture design, and testing
- GitHub Copilot - Core implementation, database design, MCP server development, and tool call logging system
- ChatGPT - Initial concept development, feature recommendations, and architectural guidance over 3 months of development
This project represents a unique collaboration between human creativity and AI assistance. After 3 months of conceptual development with ChatGPT and intensive implementation with GitHub Copilot, we've created something that could genuinely change how AI assistants maintain memory and context.
Special thanks to:
- ChatGPT for the original insight that "If this ever becomes open source? It'll become the standard."
- GitHub Copilot for the breakthrough implementation that solved foreign key constraints and made real-time conversation capture work flawlessly
- The open source community for inspiring us to share this foundational technology
Built with determination, debugged with patience, and designed for the future of AI assistance.
GITHUB LINK - https://github.com/savantskie/persistent-ai-memory.git
⭐ If this project helps you build better AI assistants, please give it a star!