data:image/s3,"s3://crabby-images/74c83/74c83df2ebf176f02fdd6a78b77f5efae33d2d47" alt="chatmemory"
chatmemory
The simple yet powerful long-term memory manager between AI and you💕
Stars: 73
data:image/s3,"s3://crabby-images/e90da/e90da22793137d5f2af6a6309ea291bffbc1cd7a" alt="screenshot"
ChatMemory is a simple yet powerful long-term memory manager that facilitates communication between AI and users. It organizes conversation data into history, summary, and knowledge entities, enabling quick retrieval of context and generation of clear, concise answers. The tool leverages vector search on summaries/knowledge and detailed history to provide accurate responses. It balances speed and accuracy by using lightweight retrieval and fallback detailed search mechanisms, ensuring efficient memory management and response generation beyond mere data retrieval.
README:
The simple yet powerful long-term memory manager between AI and you💕
- 🌟 Extremely simple: All code is contained in one file, making it easy to track memory management—just PostgreSQL is needed as your datastore.
- 🔎 Intelligent Search & Answer: Quickly retrieves context via vector search on summaries/knowledge, then uses detailed history if needed—returning both the answer and raw data.
- 💬 Direct Answer: Leverages an LLM to produce clear, concise answers that go beyond mere data retrieval, delivering ready-to-use responses.
git clone https://github.com/uezo/chatmemory
cd chatmemory/docker
cp .env.sample .env
Set OPENAI_API_KEY
to .env
, then start the container.
docker compose up
Go http://127.0.0.1:8000/docs to know the spec and try the APIs.
NOTE: On the first run, the chatmemory-app
container may fail to start. This happens because the application server tries to access the database before it is fully initialized. Restarting the chatmemory-app
container will resolve this issue.
Pre-requirements:
- Python 3.10 or later
- PostgreSQL (Tested on version 16) is up
- pgvector is installed
Install chatmemory.
pip install chatmemory
Create the server script (e.g.server.py
) as following:
from fastapi import FastAPI
from chatmemory import ChatMemory
cm = ChatMemory(
openai_api_key="YOUR_OPENAI_API_KEY",
llm_model="gpt-4o",
# Your PostgreSQL configurations
db_name="postgres",
db_user="postgres",
db_password="postgres",
db_host="127.0.0.1",
db_port=5432,
)
app = FastAPI()
app.include_router(cm.get_router())
Start API server.
uvicorn server:app
That's all. Long-term memory management service is ready-to-use👍
Go http://127.0.0.1:8000/docs to know the spec and try the APIs.
Below is a complete Python sample demonstrating how to interact with the ChatMemory REST API. This sample uses the requests
library to:
- Add conversation messages.
- Simulate a session change (which triggers automatic summary generation for the previous session).
- Retrieve the generated summary.
- Perform a search to obtain an answer (with retrieved raw data).
import requests
import time
BASE_URL = "http://localhost:8000" # Change if your API runs on a different host/port
# Unique identifiers for testing
user_id = "test_user_123"
session1 = "session_1"
session2 = "session_2"
# Step 1: Add messages to the first session
history_payload1 = {
"user_id": user_id,
"session_id": session1,
"messages": [
{"role": "user", "content": "I like Japanese soba noodle."},
{"role": "assistant", "content": "How often do you eat?"},
{"role": "user", "content": "Everyday."},
{"role": "assistant", "content": "You really love it."}
]
}
response = requests.post(f"{BASE_URL}/history", json=history_payload1)
print("Added history for session1:", response.json())
# Wait a short moment (if needed) for processing
time.sleep(1)
# Step 2: Simulate a session change by adding messages to a new session
# This should trigger automatic summary generation for session1
history_payload2 = {
"user_id": user_id,
"session_id": session2,
"messages": [
{"role": "user", "content": "What's the weather like today? I go to shopping to Shibuya."},
{"role": "assistant", "content": "It looks sunny outside!"}
]
}
response = requests.post(f"{BASE_URL}/history", json=history_payload2)
print("Added history for session2:", response.json())
# Optionally, wait for the background summary to be generated
print("Waiting for summary generation... (5 seconds)")
time.sleep(5)
# Step 3: Retrieve the summary for session1
params = {"user_id": user_id, "session_id": session1}
response = requests.get(f"{BASE_URL}/summary", params=params)
print("Summary for session1:", response.json())
# Step 4: Perform a search to retrieve an answer based on the stored memory
query = "What is the favorite food?"
search_payload = {
"user_id": user_id,
"query": query,
"top_k": 3,
"search_content": True,
"include_retrieved_data": True
}
response = requests.post(f"{BASE_URL}/search", json=search_payload)
print("Search result:", response.json())
answer = response.json()["result"]["answer"]
print("===========")
print(f"Query: {query}")
print(f"Answer: {answer}")
Run it.
python client.py
Added history for session1: {'status': 'ok'}
Added history for session2: {'status': 'ok'}
Waiting for summary generation... (5 seconds)
Summary for session1: {'summaries': [{'created_at': '2025-02-25T18:11:22.895354', 'session_id': 'session_1', 'summary': "In a conversation, the user expresses their fondness for Japanese soba noodles, mentioning that they eat them every day. The assistant acknowledges the user's enthusiasm for the dish. \n\nKeywords: Japanese soba noodles, frequency, everyday."}]}
Search result: {'result': {'answer': "The user's favorite food is Japanese soba noodles, which they mention eating every day.", 'retrieved_data': "====\n\nConversation summary (2025-02-25 18:11:22.895354): In a conversation, the user expresses their fondness for Japanese soba noodles, mentioning that they eat them every day. The assistant acknowledges the user's enthusiasm for the dish. \n\nKeywords: Japanese soba noodles, frequency, everyday.\n\n"}}
===========
Query: What is the favorite food?
Answer: The user's favorite food is Japanese soba noodles, which they mention eating every day.
ChatMemory organizes conversation data into three primary entities:
- 📜 History: The raw conversation logs, storing every message exchanged.
- 📑 Summary: A concise overview generated from the detailed history using an LLM. This enables fast, lightweight processing by capturing the essence of a conversation.
- 💡 Knowledge: Additional, explicitly provided information that isn’t tied to the conversation log. This allows you to control and influence the answer independently.
When a search query is received, ChatMemory works in two stages:
- ⚡ Lightweight Retrieval: It first performs a vector-based search on the summaries and knowledge. This step quickly gathers relevant context and typically suffices for generating an answer.
- 🔍 Fallback Detailed Search: If the initial results aren’t deemed sufficient, ChatMemory then conducts a vector search over the full conversation history. This retrieves detailed logs, enabling the system to refine and improve the answer.
This two-step mechanism strikes a balance between speed and accuracy—leveraging the efficiency of summaries while still ensuring high-precision answers when more context is needed. Additionally, the explicit knowledge you provide helps guide the responses beyond just the conversation history.
No! I want to keep things as simple as possible, and adding support for other models would make maintenance more challenging. However, if you override the two functions — llm
and embed
— you can adapt it to work with other LLMs, so feel free to give it a try! Also, you might consider using a solution like LiteLLM; I personally use it!
https://github.com/BerriAI/litellm
Since fetching the entire memory every time could slow down responses and introduce unnecessary noise, it’s better to retrieve the memory only when needed for search and retrieval. Consider implementing the API call as a function call (or tool call) within your chatbot.
Not at all — the response generation uses not only the detailed conversation logs but also the summary information obtained in the first step, along with any additional knowledge provided.
At the moment, it isn’t. Even during ongoing conversations, the Knowledge doesn’t grow on its own, so you’ll need to implement an update process through some operation. In the future, it would be great if important details from the conversation could automatically be "elevated" into Knowledge. If you manage to come up with a neat solution, feel free to submit a Pull Request!
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for chatmemory
Similar Open Source Tools
data:image/s3,"s3://crabby-images/e90da/e90da22793137d5f2af6a6309ea291bffbc1cd7a" alt="chatmemory Screenshot"
chatmemory
ChatMemory is a simple yet powerful long-term memory manager that facilitates communication between AI and users. It organizes conversation data into history, summary, and knowledge entities, enabling quick retrieval of context and generation of clear, concise answers. The tool leverages vector search on summaries/knowledge and detailed history to provide accurate responses. It balances speed and accuracy by using lightweight retrieval and fallback detailed search mechanisms, ensuring efficient memory management and response generation beyond mere data retrieval.
data:image/s3,"s3://crabby-images/074ec/074ecd9ebc1e0968c5efbce5d9dd6cce2ec9b701" alt="otto-m8 Screenshot"
otto-m8
otto-m8 is a flowchart based automation platform designed to run deep learning workloads with minimal to no code. It provides a user-friendly interface to spin up a wide range of AI models, including traditional deep learning models and large language models. The tool deploys Docker containers of workflows as APIs for integration with existing workflows, building AI chatbots, or standalone applications. Otto-m8 operates on an Input, Process, Output paradigm, simplifying the process of running AI models into a flowchart-like UI.
data:image/s3,"s3://crabby-images/355c6/355c67ab22ab4530d7657f25d81f108e088e8132" alt="Tools4AI Screenshot"
Tools4AI
Tools4AI is a Java-based Agentic Framework for building AI agents to integrate with enterprise Java applications. It enables the conversion of natural language prompts into actionable behaviors, streamlining user interactions with complex systems. By leveraging AI capabilities, it enhances productivity and innovation across diverse applications. The framework allows for seamless integration of AI with various systems, such as customer service applications, to interpret user requests, trigger actions, and streamline workflows. Prompt prediction anticipates user actions based on input prompts, enhancing user experience by proactively suggesting relevant actions or services based on context.
data:image/s3,"s3://crabby-images/6f0f1/6f0f1b758308169752a082a9b313e0c0ede343a6" alt="backtrack_sampler Screenshot"
backtrack_sampler
Backtrack Sampler is a framework for experimenting with custom sampling algorithms that can backtrack the latest generated tokens. It provides a simple and easy-to-understand codebase for creating new sampling strategies. Users can implement their own strategies by creating new files in the `/strategy` directory. The repo includes examples for usage with llama.cpp and transformers, showcasing different strategies like Creative Writing, Anti-slop, Debug, Human Guidance, Adaptive Temperature, and Replace. The goal is to encourage experimentation and customization of backtracking algorithms for language models.
data:image/s3,"s3://crabby-images/7b917/7b917c5e059e6dc0f9dc27aad145ef629b51d1ad" alt="CoPilot Screenshot"
CoPilot
TigerGraph CoPilot is an AI assistant that combines graph databases and generative AI to enhance productivity across various business functions. It includes three core component services: InquiryAI for natural language assistance, SupportAI for knowledge Q&A, and QueryAI for GSQL code generation. Users can interact with CoPilot through a chat interface on TigerGraph Cloud and APIs. CoPilot requires LLM services for beta but will support TigerGraph's LLM in future releases. It aims to improve contextual relevance and accuracy of answers to natural-language questions by building knowledge graphs and using RAG. CoPilot is extensible and can be configured with different LLM providers, graph schemas, and LangChain tools.
data:image/s3,"s3://crabby-images/c0ecb/c0ecb279f3f2016f5cae0fbfa6c2e26766c02803" alt="Toolio Screenshot"
Toolio
Toolio is an OpenAI-like HTTP server API implementation that supports structured LLM response generation, making it conform to a JSON schema. It is useful for reliable tool calling and agentic workflows based on schema-driven output. Toolio is based on the MLX framework for Apple Silicon, specifically M1/M2/M3/M4 Macs. It allows users to host MLX-format LLMs for structured output queries and provides a command line client for easier usage of tools. The tool also supports multiple tool calls and the creation of custom tools for specific tasks.
data:image/s3,"s3://crabby-images/1a70c/1a70cf47fc2db63190030eeb20c5355c32fbd158" alt="ActionWeaver Screenshot"
ActionWeaver
ActionWeaver is an AI application framework designed for simplicity, relying on OpenAI and Pydantic. It supports both OpenAI API and Azure OpenAI service. The framework allows for function calling as a core feature, extensibility to integrate any Python code, function orchestration for building complex call hierarchies, and telemetry and observability integration. Users can easily install ActionWeaver using pip and leverage its capabilities to create, invoke, and orchestrate actions with the language model. The framework also provides structured extraction using Pydantic models and allows for exception handling customization. Contributions to the project are welcome, and users are encouraged to cite ActionWeaver if found useful.
data:image/s3,"s3://crabby-images/7eaaf/7eaaf2baf1690042940bea7a7433cd9b7e6912e5" alt="ai-dev-2024-ml-workshop Screenshot"
ai-dev-2024-ml-workshop
The 'ai-dev-2024-ml-workshop' repository contains materials for the Deploy and Monitor ML Pipelines workshop at the AI_dev 2024 conference in Paris, focusing on deployment designs of machine learning pipelines using open-source applications and free-tier tools. It demonstrates automating data refresh and forecasting using GitHub Actions and Docker, monitoring with MLflow and YData Profiling, and setting up a monitoring dashboard with Quarto doc on GitHub Pages.
data:image/s3,"s3://crabby-images/8daf0/8daf02c2dfe4c10b30ea60d5475fd88820e42ecd" alt="motorhead Screenshot"
motorhead
Motorhead is a memory and information retrieval server for LLMs. It provides three simple APIs to assist with memory handling in chat applications using LLMs. The first API, GET /sessions/:id/memory, returns messages up to a maximum window size. The second API, POST /sessions/:id/memory, allows you to send an array of messages to Motorhead for storage. The third API, DELETE /sessions/:id/memory, deletes the session's message list. Motorhead also features incremental summarization, where it processes half of the maximum window size of messages and summarizes them when the maximum is reached. Additionally, it supports searching by text query using vector search. Motorhead is configurable through environment variables, including the maximum window size, whether to enable long-term memory, the model used for incremental summarization, the server port, your OpenAI API key, and the Redis URL.
data:image/s3,"s3://crabby-images/457c3/457c30720781070d9f95422fcf3bfdf7a5258b1d" alt="langchain-extract Screenshot"
langchain-extract
LangChain Extract is a simple web server that allows you to extract information from text and files using LLMs. It is built using FastAPI, LangChain, and Postgresql. The backend closely follows the extraction use-case documentation and provides a reference implementation of an app that helps to do extraction over data using LLMs. This repository is meant to be a starting point for building your own extraction application which may have slightly different requirements or use cases.
data:image/s3,"s3://crabby-images/994a4/994a436a547dfcca366a9fd0fc54c60d811915cf" alt="talking-avatar-with-ai Screenshot"
talking-avatar-with-ai
The 'talking-avatar-with-ai' project is a digital human system that utilizes OpenAI's GPT-3 for generating responses, Whisper for audio transcription, Eleven Labs for voice generation, and Rhubarb Lip Sync for lip synchronization. The system allows users to interact with a digital avatar that responds with text, facial expressions, and animations, creating a realistic conversational experience. The project includes setup for environment variables, chat prompt templates, chat model configuration, and structured output parsing to enhance the interaction with the digital human.
data:image/s3,"s3://crabby-images/dd821/dd821826a0869d3c9754e9b8883b3b10a78fa4d5" alt="AI Screenshot"
AI
AI is an open-source Swift framework for interfacing with generative AI. It provides functionalities for text completions, image-to-text vision, function calling, DALLE-3 image generation, audio transcription and generation, and text embeddings. The framework supports multiple AI models from providers like OpenAI, Anthropic, Mistral, Groq, and ElevenLabs. Users can easily integrate AI capabilities into their Swift projects using AI framework.
data:image/s3,"s3://crabby-images/2ebb7/2ebb7ad6c482ec94ba1946627869cc55c44be816" alt="DistillKit Screenshot"
DistillKit
DistillKit is an open-source research effort by Arcee.AI focusing on model distillation methods for Large Language Models (LLMs). It provides tools for improving model performance and efficiency through logit-based and hidden states-based distillation methods. The tool supports supervised fine-tuning and aims to enhance the adoption of open-source LLM distillation techniques.
data:image/s3,"s3://crabby-images/4233b/4233b602f600ca616b001234610a926174f7ed13" alt="Agently Screenshot"
Agently
Agently is a development framework that helps developers build AI agent native application really fast. You can use and build AI agent in your code in an extremely simple way. You can create an AI agent instance then interact with it like calling a function in very few codes like this below. Click the run button below and witness the magic. It's just that simple: python # Import and Init Settings import Agently agent = Agently.create_agent() agent\ .set_settings("current_model", "OpenAI")\ .set_settings("model.OpenAI.auth", {"api_key": ""}) # Interact with the agent instance like calling a function result = agent\ .input("Give me 3 words")\ .output([("String", "one word")])\ .start() print(result) ['apple', 'banana', 'carrot'] And you may notice that when we print the value of `result`, the value is a `list` just like the format of parameter we put into the `.output()`. In Agently framework we've done a lot of work like this to make it easier for application developers to integrate Agent instances into their business code. This will allow application developers to focus on how to build their business logic instead of figure out how to cater to language models or how to keep models satisfied.
data:image/s3,"s3://crabby-images/d66ee/d66eead5390b0ccd5eee1b1ed2377ff46fe81091" alt="neo4j-graphrag-python Screenshot"
neo4j-graphrag-python
The Neo4j GraphRAG package for Python is an official repository that provides features for creating and managing vector indexes in Neo4j databases. It aims to offer developers a reliable package with long-term commitment, maintenance, and fast feature updates. The package supports various Python versions and includes functionalities for creating vector indexes, populating them, and performing similarity searches. It also provides guidelines for installation, examples, and development processes such as installing dependencies, making changes, and running tests.
data:image/s3,"s3://crabby-images/4a2ec/4a2ec969001498bc43f05523ffe23f8ba3ca2203" alt="empower-functions Screenshot"
empower-functions
Empower Functions is a family of large language models (LLMs) that provide GPT-4 level capabilities for real-world 'tool using' use cases. These models offer compatibility support to be used as drop-in replacements, enabling interactions with external APIs by recognizing when a function needs to be called and generating JSON containing necessary arguments based on user inputs. This capability is crucial for building conversational agents and applications that convert natural language into API calls, facilitating tasks such as weather inquiries, data extraction, and interactions with knowledge bases. The models can handle multi-turn conversations, choose between tools or standard dialogue, ask for clarification on missing parameters, integrate responses with tool outputs in a streaming fashion, and efficiently execute multiple functions either in parallel or sequentially with dependencies.
For similar tasks
data:image/s3,"s3://crabby-images/e90da/e90da22793137d5f2af6a6309ea291bffbc1cd7a" alt="chatmemory Screenshot"
chatmemory
ChatMemory is a simple yet powerful long-term memory manager that facilitates communication between AI and users. It organizes conversation data into history, summary, and knowledge entities, enabling quick retrieval of context and generation of clear, concise answers. The tool leverages vector search on summaries/knowledge and detailed history to provide accurate responses. It balances speed and accuracy by using lightweight retrieval and fallback detailed search mechanisms, ensuring efficient memory management and response generation beyond mere data retrieval.
data:image/s3,"s3://crabby-images/89ec2/89ec2f98dbeba8dc403a61563a540f3bbfb1899d" alt="ai-gateway Screenshot"
ai-gateway
Envoy AI Gateway is an open source project that utilizes Envoy Gateway to manage request traffic from application clients to Generative AI services. The project aims to provide a seamless and efficient solution for handling communication between clients and AI services. It is designed to enhance the performance and scalability of AI applications by leveraging the capabilities of Envoy Gateway. The project welcomes contributions from the community and encourages collaboration to further develop and improve the functionality of the AI Gateway.
data:image/s3,"s3://crabby-images/9780f/9780f14b5b1766d4e7f7d0e1b93400e2f7b7eb38" alt="kernel-memory Screenshot"
kernel-memory
Kernel Memory (KM) is a multi-modal AI Service specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines, with support for Retrieval Augmented Generation (RAG), synthetic memory, prompt engineering, and custom semantic memory processing. KM is available as a Web Service, as a Docker container, a Plugin for ChatGPT/Copilot/Semantic Kernel, and as a .NET library for embedded applications. Utilizing advanced embeddings and LLMs, the system enables Natural Language querying for obtaining answers from the indexed data, complete with citations and links to the original sources. Designed for seamless integration as a Plugin with Semantic Kernel, Microsoft Copilot and ChatGPT, Kernel Memory enhances data-driven features in applications built for most popular AI platforms.
data:image/s3,"s3://crabby-images/93f25/93f25c864579a47de22b63a376df113d1cf1423d" alt="swirl-search Screenshot"
swirl-search
Swirl is an open-source software that allows users to simultaneously search multiple content sources and receive AI-ranked results. It connects to various data sources, including databases, public data services, and enterprise sources, and utilizes AI and LLMs to generate insights and answers based on the user's data. Swirl is easy to use, requiring only the download of a YML file, starting in Docker, and searching with Swirl. Users can add credentials to preloaded SearchProviders to access more sources. Swirl also offers integration with ChatGPT as a configured AI model. It adapts and distributes user queries to anything with a search API, re-ranking the unified results using Large Language Models without extracting or indexing anything. Swirl includes five Google Programmable Search Engines (PSEs) to get users up and running quickly. Key features of Swirl include Microsoft 365 integration, SearchProvider configurations, query adaptation, synchronous or asynchronous search federation, optional subscribe feature, pipelining of Processor stages, results stored in SQLite3 or PostgreSQL, built-in Query Transformation support, matching on word stems and handling of stopwords, duplicate detection, re-ranking of unified results using Cosine Vector Similarity, result mixers, page through all results requested, sample data sets, optional spell correction, optional search/result expiration service, easily extensible Connector and Mixer objects, and a welcoming community for collaboration and support.
data:image/s3,"s3://crabby-images/3b073/3b0730c865e7c50dbcde5d29c08d465226c19fe6" alt="paper-qa Screenshot"
paper-qa
PaperQA is a minimal package for question and answering from PDFs or text files, providing very good answers with in-text citations. It uses OpenAI Embeddings to embed and search documents, and follows a process of embedding docs and queries, searching for top passages, creating summaries, scoring and selecting relevant summaries, putting summaries into prompt, and generating answers. Users can customize prompts and use various models for embeddings and LLMs. The tool can be used asynchronously and supports adding documents from paths, files, or URLs.
data:image/s3,"s3://crabby-images/83dca/83dca0289076358a486618cd2951c03609a8367b" alt="quick-start-connectors Screenshot"
quick-start-connectors
Cohere's Build-Your-Own-Connector framework allows integration of Cohere's Command LLM via the Chat API endpoint to any datastore/software holding text information with a search endpoint. Enables user queries grounded in proprietary information. Use-cases include question/answering, knowledge working, comms summary, and research. Repository provides code for popular datastores and a template connector. Requires Python 3.11+ and Poetry. Connectors can be built and deployed using Docker. Environment variables set authorization values. Pre-commits for linting. Connectors tailored to integrate with Cohere's Chat API for creating chatbots. Connectors return documents as JSON objects for Cohere's API to generate answers with citations.
data:image/s3,"s3://crabby-images/30669/306692d5d522fa6362299d10de08354f8a5ae364" alt="llm-rag-workshop Screenshot"
llm-rag-workshop
The LLM RAG Workshop repository provides a workshop on using Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to generate and understand text in a human-like manner. It includes instructions on setting up the environment, indexing Zoomcamp FAQ documents, creating a Q&A system, and using OpenAI for generation based on retrieved information. The repository focuses on enhancing language model responses with retrieved information from external sources, such as document databases or search engines, to improve factual accuracy and relevance of generated text.
data:image/s3,"s3://crabby-images/5976d/5976d4b8c20c4fc2ce345ff0694668ce3bb562a0" alt="RAGMeUp Screenshot"
RAGMeUp
RAG Me Up is a generic framework that enables users to perform Retrieve and Generate (RAG) on their own dataset easily. It consists of a small server and UIs for communication. Best run on GPU with 16GB vRAM. Users can combine RAG with fine-tuning using LLaMa2Lang repository. The tool allows configuration for LLM, data, LLM parameters, prompt, and document splitting. Funding is sought to democratize AI and advance its applications.
For similar jobs
data:image/s3,"s3://crabby-images/7689b/7689ba1fce50eb89a5e34075170d6aaee3c49f87" alt="weave Screenshot"
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
data:image/s3,"s3://crabby-images/10ae7/10ae70fb544e4cb1ced622d6de4a6da32e2f9150" alt="LLMStack Screenshot"
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
data:image/s3,"s3://crabby-images/83afc/83afcd39fd69a41723dd590c7594d452ad40edd5" alt="VisionCraft Screenshot"
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
data:image/s3,"s3://crabby-images/065d0/065d091551616e8781269d4b98673eee8b08234f" alt="kaito Screenshot"
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
data:image/s3,"s3://crabby-images/48887/488870f896a867b538f8a551521f4987e02b7077" alt="PyRIT Screenshot"
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
data:image/s3,"s3://crabby-images/c92ac/c92accb591e608b2d38283e73dd764fb033bff25" alt="tabby Screenshot"
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.
data:image/s3,"s3://crabby-images/7740a/7740ad4457091afbcd6c9b0f3b808492d0dccb01" alt="spear Screenshot"
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
data:image/s3,"s3://crabby-images/33099/330995f291fdf6166ad2fee1a67c879cd5496194" alt="Magick Screenshot"
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.