memori

Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems

Stars: 1099


Memori is an open-source memory engine for LLMs, AI agents, and multi-agent systems. It records LLM conversations, processes them into structured, categorized memories, and injects relevant context into later calls so you do not have to repeat it. Memories are stored in a SQL database you already use (SQLite, PostgreSQL, or MySQL), and the library works with LLM clients such as OpenAI, Anthropic, and LiteLLM.

README:

memori

Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems

From Postgres to MySQL, Memori plugs into the SQL databases you already use. Simple setup, infinite scale without new infrastructure.


🎯 Philosophy

Second-memory for all your LLM work - Never repeat context again
Dual-mode memory injection - Conscious short-term memory + Auto intelligent search
Flexible database connections - SQLite, PostgreSQL, MySQL support
Pydantic-based intelligence - Structured memory processing with validation
Simple, reliable architecture - Just works out of the box

⚡ Quick Start

Install Memori:

pip install memorisdk

Example with OpenAI

Install OpenAI:

pip install openai

Set OpenAI API Key:

export OPENAI_API_KEY="sk-your-openai-key-here"

Run this Python script:

from memori import Memori
from openai import OpenAI

# Initialize OpenAI client
openai_client = OpenAI()

# Initialize memory
memori = Memori(conscious_ingest=True)
memori.enable()

print("=== First Conversation - Establishing Context ===")
response1 = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user", 
        "content": "I'm working on a Python FastAPI project"
    }]
)

print("Assistant:", response1.choices[0].message.content)
print("\n" + "="*50)
print("=== Second Conversation - Memory Provides Context ===")

response2 = openai_client.chat.completions.create(
    model="gpt-4o-mini", 
    messages=[{
        "role": "user",
        "content": "Help me add user authentication"
    }]
)
print("Assistant:", response2.choices[0].message.content)
print("\n💡 Notice: Memori automatically knows about your FastAPI Python project!")

By default, Memori uses an in-memory SQLite database. You can get a free serverless database instance on the GibsonAI platform.
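The practical consequence of an in-memory default is that stored memories do not survive a restart. The sketch below shows this with Python's standard `sqlite3` module only; it does not use Memori, and the `notes` table and `count_rows_after_reopen` helper are hypothetical names chosen for illustration.

```python
import os
import sqlite3
import tempfile


def count_rows_after_reopen(path: str) -> int:
    """Write one row, close the connection, reopen, and count rows."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT)")
    conn.execute("INSERT INTO notes VALUES ('FastAPI project')")
    conn.commit()
    conn.close()

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT)")
    (count,) = conn.execute("SELECT COUNT(*) FROM notes").fetchone()
    conn.close()
    return count


# ":memory:" creates a fresh database per connection, so the row is gone.
in_memory_count = count_rows_after_reopen(":memory:")

# A file-backed database keeps the row across connections.
with tempfile.TemporaryDirectory() as tmp:
    file_count = count_rows_after_reopen(os.path.join(tmp, "my_memory.db"))

print(in_memory_count, file_count)
```

To keep memories between runs with Memori itself, pass a file or server connection string through `database_connect`, as the configuration examples later in this README do.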

🚀 Ready to explore more?

📖 Examples - Basic usage patterns and code samples
🔌 Framework Integrations - LangChain, Agno & CrewAI examples
🎮 Interactive Demos - Live applications & tutorials

🧠 How It Works

1. Universal Recording

memori.enable()  # Records ALL LLM conversations automatically

2. Intelligent Processing

Entity Extraction: Extracts people, technologies, projects
Smart Categorization: Facts, preferences, skills, rules
Pydantic Validation: Structured, type-safe memory storage

3. Dual Memory Modes

🧠 Conscious Mode - Short-Term Working Memory

conscious_ingest=True  # One-shot short-term memory injection

At Startup: Conscious agent analyzes long-term memory patterns
Memory Promotion: Moves essential conversations to short-term storage
One-Shot Injection: Injects working memory once at conversation start
Like Human Short-Term Memory: Names, current projects, preferences readily available

🔍 Auto Mode - Dynamic Database Search

auto_ingest=True  # Continuous intelligent memory retrieval

Every LLM Call: Retrieval agent analyzes user query intelligently
Full Database Search: Searches through entire memory database
Context-Aware: Injects relevant memories based on current conversation
Performance Optimized: Caching, async processing, background threads

🧠 Memory Modes Explained

Conscious Mode - Short-Term Working Memory

# Mimics human conscious memory - essential info readily available
memori = Memori(
    database_connect="sqlite:///my_memory.db",
    conscious_ingest=True,  # 🧠 Short-term working memory
    openai_api_key="sk-..."
)

How Conscious Mode Works:

At Startup: Conscious agent analyzes long-term memory patterns
Essential Selection: Promotes 5-10 most important conversations to short-term
One-Shot Injection: Injects this working memory once at conversation start
No Repeats: Won't inject again during the same session
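The promote-then-inject-once behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not Memori's implementation: the `Conversation` and `WorkingMemory` classes, the `importance` score, and the system-message wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    summary: str
    importance: float  # hypothetical score; the source does not define one


class WorkingMemory:
    """Promote the top conversations once, then inject them a single time."""

    def __init__(self, long_term, limit=5):
        ranked = sorted(long_term, key=lambda c: c.importance, reverse=True)
        self.short_term = ranked[:limit]
        self._injected = False

    def inject(self, messages):
        """Return messages with working memory prepended on the first call only."""
        if self._injected or not self.short_term:
            return list(messages)
        self._injected = True
        context = "\n".join(f"- {c.summary}" for c in self.short_term)
        system = {"role": "system", "content": "Known context:\n" + context}
        return [system, *messages]


memory = WorkingMemory(
    [Conversation("Uses FastAPI", 0.9), Conversation("Likes tea", 0.2)],
    limit=1,
)
first = memory.inject([{"role": "user", "content": "Help me add user authentication"}])
second = memory.inject([{"role": "user", "content": "And rate limiting?"}])
print(len(first), len(second))
```

The first call carries the extra system message; the second call in the same session passes the messages through unchanged.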

Auto Mode - Dynamic Intelligent Search

# Searches entire database dynamically based on user queries
memori = Memori(
    database_connect="sqlite:///my_memory.db", 
    auto_ingest=True,  # 🔍 Smart database search
    openai_api_key="sk-..."
)

How Auto Mode Works:

Every LLM Call: Retrieval agent analyzes user input
Query Planning: Uses AI to understand what memories are needed
Smart Search: Searches through entire database (short-term + long-term)
Context Injection: Injects 3-5 most relevant memories per call
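The per-call flow above (analyze the query, search, inject a few matches) can be sketched as follows. Memori uses an AI retrieval agent for the search step; this sketch swaps in simple word overlap so it runs without any LLM, and the `tokenize`, `retrieve`, and `with_context` helpers are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, memories, limit=3):
    """Rank memories by shared-word count with the query; drop non-matches."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(m)), m) for m in memories]
    matches = [(score, m) for score, m in scored if score > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in matches[:limit]]


def with_context(query, memories, limit=3):
    """Build the message list for one call, prepending any relevant memories."""
    relevant = retrieve(query, memories, limit)
    messages = [{"role": "user", "content": query}]
    if relevant:
        context = "\n".join(f"- {m}" for m in relevant)
        messages.insert(0, {"role": "system", "content": "Relevant memories:\n" + context})
    return messages


memories = [
    "User is building a Python FastAPI project",
    "User prefers clean, readable code",
    "User always writes tests first",
]
messages = with_context("Add authentication to my FastAPI project", memories)
print(messages[0]["content"])
```

Unlike the one-shot conscious mode, this lookup runs on every call, which is why the README lists caching and background processing as optimizations for it.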

Combined Mode - Best of Both Worlds

# Get both working memory AND dynamic search
memori = Memori(
    conscious_ingest=True,  # Working memory once
    auto_ingest=True,       # Dynamic search every call
    openai_api_key="sk-..."
)

Intelligence Layers:

Memory Agent - Processes every conversation with Pydantic structured outputs
Conscious Agent - Analyzes patterns, promotes long-term → short-term memories
Retrieval Agent - Intelligently searches and selects relevant context

What gets prioritized in Conscious Mode:

👤 Personal Identity: Your name, role, location, basic info
❤️ Preferences & Habits: What you like, work patterns, routines
🛠️ Skills & Tools: Technologies you use, expertise areas
📊 Current Projects: Ongoing work, learning goals
🤝 Relationships: Important people, colleagues, connections
🔄 Repeated References: Information you mention frequently

🗄️ Memory Types

Type	Purpose	Example	Auto-Promoted
Facts	Objective information	"I use PostgreSQL for databases"	✅ High frequency
Preferences	User choices	"I prefer clean, readable code"	✅ Personal identity
Skills	Abilities & knowledge	"Experienced with FastAPI"	✅ Expertise areas
Rules	Constraints & guidelines	"Always write tests first"	✅ Work patterns
Context	Session information	"Working on e-commerce project"	✅ Current projects
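A structured, validated record for the five memory types in the table above might look like the sketch below. Memori uses Pydantic for this; the sketch uses standard-library dataclasses as a stand-in so it runs without extra dependencies, and the `MemoryCategory` and `MemoryRecord` names and fields are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryCategory(str, Enum):
    FACT = "fact"
    PREFERENCE = "preference"
    SKILL = "skill"
    RULE = "rule"
    CONTEXT = "context"


@dataclass(frozen=True)
class MemoryRecord:
    """Hypothetical processed memory; field names are illustrative only."""

    text: str
    category: MemoryCategory
    entities: tuple = ()

    def __post_init__(self):
        if not self.text.strip():
            raise ValueError("memory text must not be empty")
        # Coerce plain strings such as "skill" into the enum, rejecting unknowns.
        object.__setattr__(self, "category", MemoryCategory(self.category))
        object.__setattr__(self, "entities", tuple(self.entities))


record = MemoryRecord(
    text="Experienced with FastAPI",
    category="skill",
    entities=["FastAPI"],
)
print(record.category.value, record.entities)
```

Passing an unknown category such as `"mood"` raises `ValueError`, which is the kind of early rejection that type-safe storage is meant to provide.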

🔧 Configuration

Simple Setup

from memori import Memori

# Conscious mode - Short-term working memory
memori = Memori(
    database_connect="sqlite:///my_memory.db",
    template="basic", 
    conscious_ingest=True,  # One-shot context injection
    openai_api_key="sk-..."
)

# Auto mode - Dynamic database search
memori = Memori(
    database_connect="sqlite:///my_memory.db",
    auto_ingest=True,  # Continuous memory retrieval
    openai_api_key="sk-..."
)

# Combined mode - Best of both worlds
memori = Memori(
    conscious_ingest=True,  # Working memory + 
    auto_ingest=True,       # Dynamic search
    openai_api_key="sk-..."
)

Advanced Configuration

from memori import Memori, ConfigManager

# Load from memori.json or environment
config = ConfigManager()
config.auto_load()

memori = Memori()
memori.enable()

Create memori.json:

{
  "database": {
    "connection_string": "postgresql://user:pass@localhost/memori"
  },
  "agents": {
    "openai_api_key": "sk-...",
    "conscious_ingest": true,
    "auto_ingest": false
  },
  "memory": {
    "namespace": "my_project",
    "retention_policy": "30_days"
  }
}
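If you would rather not store the API key in `memori.json`, one option is to read it from the `OPENAI_API_KEY` environment variable used in the Quick Start. The sketch below shows that idea with the standard library only. It is an assumption about a reasonable precedence order, not a description of how `ConfigManager` resolves settings, and `load_config` is a hypothetical helper.

```python
import json
import os


def load_config(text, env=os.environ):
    """Parse memori.json-style text; prefer the API key from the environment."""
    config = json.loads(text)
    agents = config.setdefault("agents", {})

    # Assumption: an environment value, when present, wins over the file.
    env_key = env.get("OPENAI_API_KEY")
    if env_key:
        agents["openai_api_key"] = env_key

    # Reject values such as "yes" or 1 for the two mode flags.
    for flag in ("conscious_ingest", "auto_ingest"):
        if not isinstance(agents.get(flag, False), bool):
            raise ValueError(f"agents.{flag} must be true or false")
    return config


sample = """
{
  "database": {"connection_string": "sqlite:///my_memory.db"},
  "agents": {"conscious_ingest": true, "auto_ingest": false}
}
"""
config = load_config(sample, env={"OPENAI_API_KEY": "sk-from-env"})
print(config["agents"]["conscious_ingest"], config["agents"]["auto_ingest"])
```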

🔌 Universal Integration

Works with ANY LLM library:

memori.enable()  # Enable universal recording

# OpenAI
from openai import OpenAI
client = OpenAI()
client.chat.completions.create(...)

# LiteLLM
from litellm import completion
completion(model="gpt-4", messages=[...])

# Anthropic  
import anthropic
client = anthropic.Anthropic()
client.messages.create(...)

# All automatically recorded and contextualized!

🛠️ Memory Management

Automatic Background Analysis

# Automatic analysis every 6 hours (when conscious_ingest=True)
memori.enable()  # Starts background conscious agent

# Manual analysis trigger
memori.trigger_conscious_analysis()

# Get essential conversations
essential = memori.get_essential_conversations(limit=5)

Memory Retrieval Tools

from memori.tools import create_memory_tool

# Create memory search tool for your LLM
memory_tool = create_memory_tool(memori)

# Use in function calling
tools = [memory_tool]
completion(model="gpt-4", messages=[...], tools=tools)
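For readers new to function calling, the sketch below shows what a memory search tool and its dispatcher can look like in the OpenAI Chat Completions tool format. The README does not show what `create_memory_tool` returns, so the tool name `search_memory`, its parameters, and the substring search are all hypothetical stand-ins.

```python
import json


def search_memory(query, store, limit=5):
    """Case-insensitive substring search over an in-memory list of memories."""
    needle = query.lower()
    return [m for m in store if needle in m.lower()][:limit]


# Tool definition in the OpenAI Chat Completions function-calling format.
# The name "search_memory" and its parameters are hypothetical.
MEMORY_TOOL = {
    "type": "function",
    "function": {
        "name": "search_memory",
        "description": "Search stored memories for text relevant to a query.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "What to look for."},
            },
            "required": ["query"],
        },
    },
}


def handle_tool_call(name, arguments_json, store):
    """Dispatch a tool call; the model sends arguments as a JSON string."""
    if name != MEMORY_TOOL["function"]["name"]:
        raise ValueError(f"unknown tool: {name}")
    args = json.loads(arguments_json)
    return json.dumps(search_memory(args["query"], store))


store = ["Uses PostgreSQL for databases", "Experienced with FastAPI"]
result = handle_tool_call("search_memory", '{"query": "fastapi"}', store)
print(result)
```

The model decides when to call the tool; your code runs the search and returns the JSON result as a tool message on the next turn.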

Context Control

# Get relevant context for a query
context = memori.retrieve_context("Python testing", limit=5)
# Returns: 3 essential + 2 specific memories

# Search by category
skills = memori.search_memories_by_category("skill", limit=10)

# Get memory statistics
stats = memori.get_memory_stats()

📋 Database Schema

-- Core tables created automatically
chat_history          -- All conversations
short_term_memory     -- Recent context (expires)
long_term_memory      -- Permanent insights
rules_memory          -- User preferences
memory_entities       -- Extracted entities
memory_relationships  -- Entity connections
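To make the split between expiring short-term rows and permanent long-term rows concrete, here is a minimal SQLite sketch. Only the table names come from the list above; every column, including `namespace` and `expires_at`, is an assumption made for illustration and may not match the tables Memori creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical columns; only the table names come from the README.
    CREATE TABLE long_term_memory (
        id INTEGER PRIMARY KEY,
        namespace TEXT NOT NULL,
        summary TEXT NOT NULL
    );
    CREATE TABLE short_term_memory (
        id INTEGER PRIMARY KEY,
        namespace TEXT NOT NULL,
        summary TEXT NOT NULL,
        expires_at REAL NOT NULL  -- Unix timestamp
    );
    """
)

now = 1_000.0  # fixed "current time" so the example is deterministic
conn.executemany(
    "INSERT INTO short_term_memory (namespace, summary, expires_at) VALUES (?, ?, ?)",
    [
        ("my_project", "Working on e-commerce project", now + 60),
        ("my_project", "Debugging a stale cache", now - 60),
    ],
)

# Only rows that have not yet expired are returned.
live = [
    row[0]
    for row in conn.execute(
        "SELECT summary FROM short_term_memory WHERE namespace = ? AND expires_at > ?",
        ("my_project", now),
    )
]
print(live)
```

Filtering on a namespace column in every query is also one simple way to get the per-user isolation that the multi-user examples describe.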

📁 Project Structure

memori/
├── core/           # Main Memori class, database manager
├── agents/         # Memory processing with Pydantic  
├── database/       # SQLite/PostgreSQL/MySQL support
├── integrations/   # LiteLLM, OpenAI, Anthropic
├── config/         # Configuration management
├── utils/          # Helpers, validation, logging
└── tools/          # Memory search tools

Examples

Basic Usage - Simple memory setup with conscious ingestion
Personal Assistant - AI assistant with intelligent memory
Memory Retrieval - Function calling with memory tools
Advanced Config - Production configuration
Interactive Demo - Live conscious ingestion showcase
Simple Multi-User - Basic demonstration of user memory isolation with namespaces
FastAPI Multi-User App - Full-featured REST API with Swagger UI for testing multi-user functionality

Framework Integrations

Memori works seamlessly with popular AI frameworks:

Framework	Description	Example
AgentOps	Track and monitor Memori memory operations with comprehensive observability	Memory operation tracking with AgentOps analytics
Agno	Memory-enhanced agent framework integration with persistent conversations	Simple chat agent with memory search
AWS Strands	Professional development coach with Strands SDK and persistent memory	Career coaching agent with goal tracking
Azure AI Foundry	Azure AI Foundry agents with persistent memory across conversations	Enterprise AI agents with Azure integration
CamelAI	Multi-agent communication framework with automatic memory recording and retrieval	Memory-enhanced chat agents with conversation continuity
CrewAI	Multi-agent system with shared memory across agent interactions	Collaborative agents with memory
Digital Ocean AI	Memory-enhanced customer support using Digital Ocean's AI platform	Customer support assistant with conversation history
LangChain	Enterprise-grade agent framework with advanced memory integration	AI assistant with LangChain tools and memory
OpenAI Agent	Memory-enhanced OpenAI Agent with function calling and user preference tracking	Interactive assistant with memory search and user info storage
Swarms	Multi-agent system framework with persistent memory capabilities	Memory-enhanced Swarms agents with auto/conscious ingestion

Interactive Demos

Explore Memori's capabilities through these interactive demonstrations:

Title	Description	Tools Used	Live Demo
🌟 Personal Diary Assistant	A comprehensive diary assistant with mood tracking, pattern analysis, and personalized recommendations.	Streamlit, LiteLLM, OpenAI, SQLite	Run Demo
🌍 Travel Planner Agent	Intelligent travel planning with CrewAI agents, real-time web search, and memory-based personalization. Plans complete itineraries with budget analysis.	CrewAI, Streamlit, OpenAI, SQLite
🧑‍🔬 Researcher Agent	Advanced AI research assistant with persistent memory, real-time web search, and comprehensive report generation. Builds upon previous research sessions.	Agno, Streamlit, OpenAI, ExaAI, SQLite	Run Demo

🤝 Contributing

See CONTRIBUTING.md for development setup and guidelines.
Community: Discord

📄 License

MIT License - see LICENSE for details.

Made for developers who want their AI agents to remember and learn

