
memori
Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems
Stars: 1099

Memori is a lightweight and user-friendly memory management tool for developers. It helps in tracking memory usage, detecting memory leaks, and optimizing memory allocation in software projects. With Memori, developers can easily monitor and analyze memory consumption to improve the performance and stability of their applications. The tool provides detailed insights into memory usage patterns and helps in identifying areas for optimization. Memori is designed to be easy to integrate into existing projects and offers a simple yet powerful interface for managing memory resources effectively.
README:
Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems
From Postgres to MySQL, Memori plugs into the SQL databases you already use. Simple setup, infinite scale without new infrastructure.
- Second-memory for all your LLM work - Never repeat context again
- Dual-mode memory injection - Conscious short-term memory + Auto intelligent search
- Flexible database connections - SQLite, PostgreSQL, MySQL support
- Pydantic-based intelligence - Structured memory processing with validation
- Simple, reliable architecture - Just works out of the box
Install Memori:
pip install memorisdk
- Install OpenAI:
pip install openai
- Set OpenAI API Key:
export OPENAI_API_KEY="sk-your-openai-key-here"
- Run this Python script:
from memori import Memori
from openai import OpenAI
# Initialize OpenAI client
openai_client = OpenAI()
# Initialize memory
memori = Memori(conscious_ingest=True)
memori.enable()
print("=== First Conversation - Establishing Context ===")
response1 = openai_client.chat.completions.create(
model="gpt-4o-mini",
messages=[{
"role": "user",
"content": "I'm working on a Python FastAPI project"
}]
)
print("Assistant:", response1.choices[0].message.content)
print("\n" + "="*50)
print("=== Second Conversation - Memory Provides Context ===")
response2 = openai_client.chat.completions.create(
model="gpt-4o-mini",
messages=[{
"role": "user",
"content": "Help me add user authentication"
}]
)
print("Assistant:", response2.choices[0].message.content)
print("\n๐ก Notice: Memori automatically knows about your FastAPI Python project!")
By default, Memori uses in-memory SQLite database. Get FREE serverless database instance in GibsonAI platform.
๐ Ready to explore more?
- ๐ Examples - Basic usage patterns and code samples
- ๐ Framework Integrations - LangChain, Agno & CrewAI examples
- ๐ฎ Interactive Demos - Live applications & tutorials
office_work.enable() # Records ALL LLM conversations automatically
- Entity Extraction: Extracts people, technologies, projects
- Smart Categorization: Facts, preferences, skills, rules
- Pydantic Validation: Structured, type-safe memory storage
conscious_ingest=True # One-shot short-term memory injection
- At Startup: Conscious agent analyzes long-term memory patterns
- Memory Promotion: Moves essential conversations to short-term storage
- One-Shot Injection: Injects working memory once at conversation start
- Like Human Short-Term Memory: Names, current projects, preferences readily available
auto_ingest=True # Continuous intelligent memory retrieval
- Every LLM Call: Retrieval agent analyzes user query intelligently
- Full Database Search: Searches through entire memory database
- Context-Aware: Injects relevant memories based on current conversation
- Performance Optimized: Caching, async processing, background threads
# Mimics human conscious memory - essential info readily available
memori = Memori(
database_connect="sqlite:///my_memory.db",
conscious_ingest=True, # ๐ง Short-term working memory
openai_api_key="sk-..."
)
How Conscious Mode Works:
- At Startup: Conscious agent analyzes long-term memory patterns
- Essential Selection: Promotes 5-10 most important conversations to short-term
- One-Shot Injection: Injects this working memory once at conversation start
- No Repeats: Won't inject again during the same session
# Searches entire database dynamically based on user queries
memori = Memori(
database_connect="sqlite:///my_memory.db",
auto_ingest=True, # ๐ Smart database search
openai_api_key="sk-..."
)
How Auto Mode Works:
- Every LLM Call: Retrieval agent analyzes user input
- Query Planning: Uses AI to understand what memories are needed
- Smart Search: Searches through entire database (short-term + long-term)
- Context Injection: Injects 3-5 most relevant memories per call
# Get both working memory AND dynamic search
memori = Memori(
conscious_ingest=True, # Working memory once
auto_ingest=True, # Dynamic search every call
openai_api_key="sk-..."
)
- Memory Agent - Processes every conversation with Pydantic structured outputs
- Conscious Agent - Analyzes patterns, promotes long-term โ short-term memories
- Retrieval Agent - Intelligently searches and selects relevant context
- ๐ค Personal Identity: Your name, role, location, basic info
- โค๏ธ Preferences & Habits: What you like, work patterns, routines
- ๐ ๏ธ Skills & Tools: Technologies you use, expertise areas
- ๐ Current Projects: Ongoing work, learning goals
- ๐ค Relationships: Important people, colleagues, connections
- ๐ Repeated References: Information you mention frequently
Type | Purpose | Example | Auto-Promoted |
---|---|---|---|
Facts | Objective information | "I use PostgreSQL for databases" | โ High frequency |
Preferences | User choices | "I prefer clean, readable code" | โ Personal identity |
Skills | Abilities & knowledge | "Experienced with FastAPI" | โ Expertise areas |
Rules | Constraints & guidelines | "Always write tests first" | โ Work patterns |
Context | Session information | "Working on e-commerce project" | โ Current projects |
from memori import Memori
# Conscious mode - Short-term working memory
memori = Memori(
database_connect="sqlite:///my_memory.db",
template="basic",
conscious_ingest=True, # One-shot context injection
openai_api_key="sk-..."
)
# Auto mode - Dynamic database search
memori = Memori(
database_connect="sqlite:///my_memory.db",
auto_ingest=True, # Continuous memory retrieval
openai_api_key="sk-..."
)
# Combined mode - Best of both worlds
memori = Memori(
conscious_ingest=True, # Working memory +
auto_ingest=True, # Dynamic search
openai_api_key="sk-..."
)
from memori import Memori, ConfigManager
# Load from memori.json or environment
config = ConfigManager()
config.auto_load()
memori = Memori()
memori.enable()
Create memori.json
:
{
"database": {
"connection_string": "postgresql://user:pass@localhost/memori"
},
"agents": {
"openai_api_key": "sk-...",
"conscious_ingest": true,
"auto_ingest": false
},
"memory": {
"namespace": "my_project",
"retention_policy": "30_days"
}
}
Works with ANY LLM library:
memori.enable() # Enable universal recording
# OpenAI
from openai import OpenAI
client = OpenAI()
client.chat.completions.create(...)
# LiteLLM
from litellm import completion
completion(model="gpt-4", messages=[...])
# Anthropic
import anthropic
client = anthropic.Anthropic()
client.messages.create(...)
# All automatically recorded and contextualized!
# Automatic analysis every 6 hours (when conscious_ingest=True)
memori.enable() # Starts background conscious agent
# Manual analysis trigger
memori.trigger_conscious_analysis()
# Get essential conversations
essential = memori.get_essential_conversations(limit=5)
from memori.tools import create_memory_tool
# Create memory search tool for your LLM
memory_tool = create_memory_tool(memori)
# Use in function calling
tools = [memory_tool]
completion(model="gpt-4", messages=[...], tools=tools)
# Get relevant context for a query
context = memori.retrieve_context("Python testing", limit=5)
# Returns: 3 essential + 2 specific memories
# Search by category
skills = memori.search_memories_by_category("skill", limit=10)
# Get memory statistics
stats = memori.get_memory_stats()
-- Core tables created automatically
chat_history # All conversations
short_term_memory # Recent context (expires)
long_term_memory # Permanent insights
rules_memory # User preferences
memory_entities # Extracted entities
memory_relationships # Entity connections
memori/
โโโ core/ # Main Memori class, database manager
โโโ agents/ # Memory processing with Pydantic
โโโ database/ # SQLite/PostgreSQL/MySQL support
โโโ integrations/ # LiteLLM, OpenAI, Anthropic
โโโ config/ # Configuration management
โโโ utils/ # Helpers, validation, logging
โโโ tools/ # Memory search tools
- Basic Usage - Simple memory setup with conscious ingestion
- Personal Assistant - AI assistant with intelligent memory
- Memory Retrieval - Function calling with memory tools
- Advanced Config - Production configuration
- Interactive Demo - Live conscious ingestion showcase
- Simple Multi-User - Basic demonstration of user memory isolation with namespaces
- FastAPI Multi-User App - Full-featured REST API with Swagger UI for testing multi-user functionality
Memori works seamlessly with popular AI frameworks:
Framework | Description | Example |
---|---|---|
AgentOps | Track and monitor Memori memory operations with comprehensive observability | Memory operation tracking with AgentOps analytics |
Agno | Memory-enhanced agent framework integration with persistent conversations | Simple chat agent with memory search |
AWS Strands | Professional development coach with Strands SDK and persistent memory | Career coaching agent with goal tracking |
Azure AI Foundry | Azure AI Foundry agents with persistent memory across conversations | Enterprise AI agents with Azure integration |
CamelAI | Multi-agent communication framework with automatic memory recording and retrieval | Memory-enhanced chat agents with conversation continuity |
CrewAI | Multi-agent system with shared memory across agent interactions | Collaborative agents with memory |
Digital Ocean AI | Memory-enhanced customer support using Digital Ocean's AI platform | Customer support assistant with conversation history |
LangChain | Enterprise-grade agent framework with advanced memory integration | AI assistant with LangChain tools and memory |
OpenAI Agent | Memory-enhanced OpenAI Agent with function calling and user preference tracking | Interactive assistant with memory search and user info storage |
Swarms | Multi-agent system framework with persistent memory capabilities | Memory-enhanced Swarms agents with auto/conscious ingestion |
Explore Memori's capabilities through these interactive demonstrations:
Title | Description | Tools Used | Live Demo |
---|---|---|---|
๐ Personal Diary Assistant | A comprehensive diary assistant with mood tracking, pattern analysis, and personalized recommendations. | Streamlit, LiteLLM, OpenAI, SQLite | Run Demo |
๐ Travel Planner Agent | Intelligent travel planning with CrewAI agents, real-time web search, and memory-based personalization. Plans complete itineraries with budget analysis. | CrewAI, Streamlit, OpenAI, SQLite | |
๐งโ๐ฌ Researcher Agent | Advanced AI research assistant with persistent memory, real-time web search, and comprehensive report generation. Builds upon previous research sessions. | Agno, Streamlit, OpenAI, ExaAI, SQLite | Run Demo |
- See CONTRIBUTING.md for development setup and guidelines.
- Community: Discord
MIT License - see LICENSE for details.
Made for developers who want their AI agents to remember and learn
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for memori
Similar Open Source Tools

memori
Memori is a lightweight and user-friendly memory management tool for developers. It helps in tracking memory usage, detecting memory leaks, and optimizing memory allocation in software projects. With Memori, developers can easily monitor and analyze memory consumption to improve the performance and stability of their applications. The tool provides detailed insights into memory usage patterns and helps in identifying areas for optimization. Memori is designed to be easy to integrate into existing projects and offers a simple yet powerful interface for managing memory resources effectively.

pointer
Pointer is a lightweight and efficient tool for analyzing and visualizing data structures in C and C++ programs. It provides a user-friendly interface to track memory allocations, pointer references, and data structures, helping developers to identify memory leaks, pointer errors, and optimize memory usage. With Pointer, users can easily navigate through complex data structures, visualize memory layouts, and debug pointer-related issues in their codebase. The tool offers interactive features such as memory snapshots, pointer tracking, and memory visualization, making it a valuable asset for C and C++ developers working on memory-intensive applications.

deepflow
DeepFlow is an open-source project that provides deep observability for complex cloud-native and AI applications. It offers Zero Code data collection with eBPF for metrics, distributed tracing, request logs, and function profiling. DeepFlow is integrated with SmartEncoding to achieve Full Stack correlation and efficient access to all observability data. With DeepFlow, cloud-native and AI applications automatically gain deep observability, removing the burden of developers continually instrumenting code and providing monitoring and diagnostic capabilities covering everything from code to infrastructure for DevOps/SRE teams.

ml-retreat
ML-Retreat is a comprehensive machine learning library designed to simplify and streamline the process of building and deploying machine learning models. It provides a wide range of tools and utilities for data preprocessing, model training, evaluation, and deployment. With ML-Retreat, users can easily experiment with different algorithms, hyperparameters, and feature engineering techniques to optimize their models. The library is built with a focus on scalability, performance, and ease of use, making it suitable for both beginners and experienced machine learning practitioners.

LightLLM
LightLLM is a lightweight library for linear and logistic regression models. It provides a simple and efficient way to train and deploy machine learning models for regression tasks. The library is designed to be easy to use and integrate into existing projects, making it suitable for both beginners and experienced data scientists. With LightLLM, users can quickly build and evaluate regression models using a variety of algorithms and hyperparameters. The library also supports feature engineering and model interpretation, allowing users to gain insights from their data and make informed decisions based on the model predictions.

FLAME
FLAME is a lightweight and efficient deep learning framework designed for edge devices. It provides a simple and user-friendly interface for developing and deploying deep learning models on resource-constrained devices. With FLAME, users can easily build and optimize neural networks for tasks such as image classification, object detection, and natural language processing. The framework supports various neural network architectures and optimization techniques, making it suitable for a wide range of applications in the field of edge computing.

pentest-agent
Pentest Agent is a lightweight and versatile tool designed for conducting penetration testing on network systems. It provides a user-friendly interface for scanning, identifying vulnerabilities, and generating detailed reports. The tool is highly customizable, allowing users to define specific targets and parameters for testing. Pentest Agent is suitable for security professionals and ethical hackers looking to assess the security posture of their systems and networks.

deeppowers
Deeppowers is a powerful Python library for deep learning applications. It provides a wide range of tools and utilities to simplify the process of building and training deep neural networks. With Deeppowers, users can easily create complex neural network architectures, perform efficient training and optimization, and deploy models for various tasks. The library is designed to be user-friendly and flexible, making it suitable for both beginners and experienced deep learning practitioners.

SpecForge
SpecForge is a powerful tool for generating API specifications from code. It helps developers to easily create and maintain accurate API documentation by extracting information directly from the codebase. With SpecForge, users can streamline the process of documenting APIs, ensuring consistency and reducing manual effort. The tool supports various programming languages and frameworks, making it versatile and adaptable to different development environments. By automating the generation of API specifications, SpecForge enhances collaboration between developers and stakeholders, improving overall project efficiency and quality.

WorkflowAI
WorkflowAI is a powerful tool designed to streamline and automate various tasks within the workflow process. It provides a user-friendly interface for creating custom workflows, automating repetitive tasks, and optimizing efficiency. With WorkflowAI, users can easily design, execute, and monitor workflows, allowing for seamless integration of different tools and systems. The tool offers advanced features such as conditional logic, task dependencies, and error handling to ensure smooth workflow execution. Whether you are managing project tasks, processing data, or coordinating team activities, WorkflowAI simplifies the workflow management process and enhances productivity.

lemonai
LemonAI is a versatile machine learning library designed to simplify the process of building and deploying AI models. It provides a wide range of tools and algorithms for data preprocessing, model training, and evaluation. With LemonAI, users can easily experiment with different machine learning techniques and optimize their models for various tasks. The library is well-documented and beginner-friendly, making it suitable for both novice and experienced data scientists. LemonAI aims to streamline the development of AI applications and empower users to create innovative solutions using state-of-the-art machine learning methods.

promptl
Promptl is a versatile command-line tool designed to streamline the process of creating and managing prompts for user input in various programming projects. It offers a simple and efficient way to prompt users for information, validate their input, and handle different scenarios based on their responses. With Promptl, developers can easily integrate interactive prompts into their scripts, applications, and automation workflows, enhancing user experience and improving overall usability. The tool provides a range of customization options and features, making it suitable for a wide range of use cases across different programming languages and environments.

airbrussh
Airbrussh is a Capistrano plugin that enhances the output of Capistrano's deploy command. It provides a more detailed and structured view of the deployment process, including color-coded output, timestamps, and improved formatting. Airbrussh aims to make the deployment logs easier to read and understand, helping developers troubleshoot issues and monitor deployments more effectively. It is a useful tool for teams working with Capistrano to streamline their deployment workflows and improve visibility into the deployment process.

Awesome-Efficient-MoE
Awesome Efficient MoE is a GitHub repository that provides an implementation of Mixture of Experts (MoE) models for efficient deep learning. The repository includes code for training and using MoE models, which are neural network architectures that combine multiple expert networks to improve performance on complex tasks. MoE models are particularly useful for handling diverse data distributions and capturing complex patterns in data. The implementation in this repository is designed to be efficient and scalable, making it suitable for training large-scale MoE models on modern hardware. The code is well-documented and easy to use, making it accessible for researchers and practitioners interested in leveraging MoE models for their deep learning projects.

monoscope
Monoscope is an open-source monitoring and observability platform that uses artificial intelligence to understand and monitor systems automatically. It allows users to ingest and explore logs, traces, and metrics in S3 buckets, query in natural language via LLMs, and create AI agents to detect anomalies. Key capabilities include universal data ingestion, AI-powered understanding, natural language interface, cost-effective storage, and zero configuration. Monoscope is designed to reduce alert fatigue, catch issues before they impact users, and provide visibility across complex systems.

koog
Koog is a Kotlin-based framework for building and running AI agents entirely in idiomatic Kotlin. It allows users to create agents that interact with tools, handle complex workflows, and communicate with users. Key features include pure Kotlin implementation, MCP integration, embedding capabilities, custom tool creation, ready-to-use components, intelligent history compression, powerful streaming API, persistent agent memory, comprehensive tracing, flexible graph workflows, modular feature system, scalable architecture, and multiplatform support.
For similar tasks

memori
Memori is a lightweight and user-friendly memory management tool for developers. It helps in tracking memory usage, detecting memory leaks, and optimizing memory allocation in software projects. With Memori, developers can easily monitor and analyze memory consumption to improve the performance and stability of their applications. The tool provides detailed insights into memory usage patterns and helps in identifying areas for optimization. Memori is designed to be easy to integrate into existing projects and offers a simple yet powerful interface for managing memory resources effectively.

scalene
Scalene is a high-performance CPU, GPU, and memory profiler for Python that provides detailed information and runs faster than many other profilers. It incorporates AI-powered proposed optimizations, allowing users to generate optimization suggestions by clicking on specific lines or regions of code. Scalene separates time spent in Python from native code, highlights hotspots, and identifies memory usage per line. It supports GPU profiling on NVIDIA-based systems and detects memory leaks. Users can generate reduced profiles, profile specific functions using decorators, and suspend/resume profiling for background processes. Scalene is available as a pip or conda package and works on various platforms. It offers features like profiling at the line level, memory trends, copy volume reporting, and leak detection.

bpf-developer-tutorial
This is a development tutorial for eBPF based on CO-RE (Compile Once, Run Everywhere). It provides practical eBPF development practices from beginner to advanced, including basic concepts, code examples, and real-world applications. The tutorial focuses on eBPF examples in observability, networking, security, and more. It aims to help eBPF application developers quickly grasp eBPF development methods and techniques through examples in languages such as C, Go, and Rust. The tutorial is structured with independent eBPF tool examples in each directory, covering topics like kprobes, fentry, opensnoop, uprobe, sigsnoop, execsnoop, exitsnoop, runqlat, hardirqs, and more. The project is based on libbpf and frameworks like libbpf, Cilium, libbpf-rs, and eunomia-bpf for development.
For similar jobs

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

ai-on-gke
This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE). Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. A robust AI/ML platform considers the following layers: Infrastructure orchestration that support GPUs and TPUs for training and serving workloads at scale Flexible integration with distributed computing and data processing frameworks Support for multiple teams on the same infrastructure to maximize utilization of resources

tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.

nvidia_gpu_exporter
Nvidia GPU exporter for prometheus, using `nvidia-smi` binary to gather metrics.

tracecat
Tracecat is an open-source automation platform for security teams. It's designed to be simple but powerful, with a focus on AI features and a practitioner-obsessed UI/UX. Tracecat can be used to automate a variety of tasks, including phishing email investigation, evidence collection, and remediation plan generation.

openinference
OpenInference is a set of conventions and plugins that complement OpenTelemetry to enable tracing of AI applications. It provides a way to capture and analyze the performance and behavior of AI models, including their interactions with other components of the application. OpenInference is designed to be language-agnostic and can be used with any OpenTelemetry-compatible backend. It includes a set of instrumentations for popular machine learning SDKs and frameworks, making it easy to add tracing to your AI applications.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

kong
Kong, or Kong API Gateway, is a cloud-native, platform-agnostic, scalable API Gateway distinguished for its high performance and extensibility via plugins. It also provides advanced AI capabilities with multi-LLM support. By providing functionality for proxying, routing, load balancing, health checking, authentication (and more), Kong serves as the central layer for orchestrating microservices or conventional API traffic with ease. Kong runs natively on Kubernetes thanks to its official Kubernetes Ingress Controller.