matrixone
MySQL-compatible HTAP database with Git for Data, vector search, and fulltext search. Cloud-native, AI-ready
Stars: 1885
MatrixOne is the industry's first database to bring Git-style version control to data, combined with MySQL compatibility, AI-native capabilities, and cloud-native architecture. It is an HTAP (Hybrid Transactional/Analytical Processing) database with a hyper-converged HSTAP engine that seamlessly handles transactional, analytical, full-text search, and vector search workloads in a single unified system—no data movement, no ETL, no compromises. Manage your database like code with instant snapshots, time travel, branch & merge, instant rollback, and a complete audit trail. Built for the AI era, MatrixOne offers storage-compute separation, elastic scaling, and Kubernetes-native deployment. It serves as one database for everything, replacing multiple databases and ETL jobs with native OLTP, OLAP, full-text search, and vector search capabilities.
README:
- What is MatrixOne
- Get Started in 60 Seconds
- Tutorials & Demos
- Installation & Deployment
- Architecture
- Python SDK
- Contributing
- License
MatrixOne is the industry's first database to bring Git-style version control to data, combined with MySQL compatibility, AI-native capabilities, and cloud-native architecture.
At its core, MatrixOne is an HTAP (Hybrid Transactional/Analytical Processing) database with a hyper-converged HSTAP engine that seamlessly handles transactional (OLTP), analytical (OLAP), full-text search, and vector search workloads in a single unified system—no data movement, no ETL, no compromises.
Just as Git revolutionized code management, MatrixOne revolutionizes data management. Manage your database like code:
- 📸 Instant Snapshots - Zero-copy snapshots in milliseconds, no storage explosion
- ⏰ Time Travel - Query data as it existed at any point in history
- 🔀 Branch & Merge - Test migrations and transformations in isolated branches
- ↩️ Instant Rollback - Restore to any previous state without full backups
- 🔍 Complete Audit Trail - Track every data change with immutable history
Why it matters: Data mistakes are expensive. Git for Data gives you the safety net and flexibility developers have enjoyed with Git—now for your most critical asset: your data.
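As a sketch of how these operations surface in SQL (the statement shapes follow MatrixOne's snapshot and time-travel features, but treat the exact syntax as illustrative and verify it against the docs for your version; `sp_before_migration` and `demo` are placeholder names):

```sql
-- Instant snapshot: take a zero-copy snapshot of a database
CREATE SNAPSHOT sp_before_migration FOR DATABASE demo;

-- Time travel: query a table as it existed at the snapshot
SELECT * FROM demo.articles {SNAPSHOT = 'sp_before_migration'};

-- Instant rollback: restore the database to the snapshot state
RESTORE DATABASE demo FROM SNAPSHOT sp_before_migration;

-- List and clean up snapshots
SHOW SNAPSHOTS;
DROP SNAPSHOT sp_before_migration;
```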
| 🗄️ MySQL-Compatible | 🤖 AI-Native | ☁️ Cloud-Native |
|---|---|---|
| Drop-in replacement for MySQL. Use existing tools, ORMs, and applications without code changes. Seamless migration path. | Built-in vector search (IVF/HNSW) and full-text search. Build RAG apps and semantic search directly—no external vector databases needed. | Storage-compute separation. Deploy anywhere. Elastic scaling. Kubernetes-native. Zero-downtime operations. |
The typical modern data stack:
🗄️ MySQL for transactions → 📊 ClickHouse for analytics → 🔍 Elasticsearch for search → 🤖 Pinecone for AI
The problem: 4 databases · Multiple ETL jobs · Hours of data lag · Sync nightmares
MatrixOne replaces all of them:
🎯 One database with native OLTP, OLAP, full-text search, and vector search. Real-time. ACID compliant. No ETL.
Start MatrixOne with Docker and create a database:

```shell
docker run -d -p 6001:6001 --name matrixone matrixorigin/matrixone:latest
mysql -h127.0.0.1 -P6001 -p111 -uroot -e "create database demo"
```

Install the Python SDK:

```shell
pip install matrixone-python-sdk
```

Vector search:
```python
from matrixone import Client
from matrixone.orm import declarative_base
from sqlalchemy import Column, Integer, String, Text
from matrixone.sqlalchemy_ext import create_vector_column

# Create client and connect
client = Client()
client.connect(database='demo')

# Define model using MatrixOne ORM
Base = declarative_base()

class Article(Base):
    __tablename__ = 'articles'
    id = Column(Integer, primary_key=True, autoincrement=True)
    title = Column(String(200), nullable=False)
    content = Column(Text, nullable=False)
    embedding = create_vector_column(8, "f32")

# Create table using client API
client.create_table(Article)

# Insert some data using client API
articles = [
    {'title': 'Machine Learning Guide',
     'content': 'Comprehensive machine learning tutorial...',
     'embedding': [0.1, 0.2, 0.3, 0.15, 0.25, 0.35, 0.12, 0.22]},
    {'title': 'Python Programming',
     'content': 'Learn Python programming basics',
     'embedding': [0.2, 0.3, 0.4, 0.25, 0.35, 0.45, 0.22, 0.32]},
]
client.batch_insert(Article, articles)

# Build an IVF index on the embedding column
client.vector_ops.create_ivf(
    Article,
    name='idx_embedding',
    column='embedding',
    lists=100,
    op_type='vector_l2_ops'
)

# Query by L2 distance to the query vector
query_vector = [0.2, 0.3, 0.4, 0.25, 0.35, 0.45, 0.22, 0.32]
results = client.query(
    Article.title,
    Article.content,
    Article.embedding.l2_distance(query_vector).label("distance"),
).filter(Article.embedding.l2_distance(query_vector) < 0.1).execute()

for row in results.rows:
    print(f"Title: {row[0]}, Content: {row[1][:50]}...")

# Cleanup
client.drop_table(Article)
client.disconnect()
```
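The `vector_l2_ops` / `l2_distance` metric used above is plain Euclidean distance, which you can sanity-check without a server. The vectors below are the sample embeddings from the snippet; with the `< 0.1` threshold, only the exact-match row would be returned:

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance, the metric behind vector_l2_ops."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [0.2, 0.3, 0.4, 0.25, 0.35, 0.45, 0.22, 0.32]
ml    = [0.1, 0.2, 0.3, 0.15, 0.25, 0.35, 0.12, 0.22]  # 'Machine Learning Guide'
py    = [0.2, 0.3, 0.4, 0.25, 0.35, 0.45, 0.22, 0.32]  # 'Python Programming'

print(l2_distance(query, ml))  # ≈ 0.2828, filtered out by < 0.1
print(l2_distance(query, py))  # 0.0 (identical vectors), passes the filter
```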
Fulltext search (reusing the client and `Article` model from above):

```python
from matrixone.sqlalchemy_ext import boolean_match

# Create a fulltext index using the SDK
client.fulltext_index.create(
    Article, name='ftidx_content', columns=['title', 'content']
)

# Boolean search with must/must_not operators
results = client.query(
    Article.title,
    Article.content,
    boolean_match('title', 'content')
        .must('machine')
        .must('learning')
        .must_not('basics')
).execute()

# results is a ResultSet object
for row in results.rows:
    print(f"Title: {row[0]}, Content: {row[1][:50]}...")
```

That's it! 🎉 You're now running a production-ready database with Git-like snapshots, vector search, and full ACID compliance.
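The `must`/`must_not` operators mirror MySQL boolean-mode fulltext modifiers (`+term` to require, `-term` to exclude). As a rough illustration of what that query selects, here is a plain-Python equivalent over the two sample rows (a simplification: the real engine tokenizes and ranks rather than doing substring checks):

```python
docs = [
    ('Machine Learning Guide', 'Comprehensive machine learning tutorial...'),
    ('Python Programming', 'Learn Python programming basics'),
]

def matches(title, content, must=(), must_not=()):
    """Simplified boolean-mode match over title + content."""
    text = f"{title} {content}".lower()
    return all(w in text for w in must) and not any(w in text for w in must_not)

hits = [t for t, c in docs
        if matches(t, c, must=('machine', 'learning'), must_not=('basics',))]
print(hits)  # ['Machine Learning Guide']
```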
💡 Want more control? Check out the Installation & Deployment section below for production-grade installation options.
Ready to dive deeper? Explore our comprehensive collection of hands-on tutorials and real-world demos:
| Tutorial | Language/Framework | Description |
|---|---|---|
| Java CRUD Demo | Java | Java application development |
| SpringBoot and JPA CRUD Demo | Java | SpringBoot with Hibernate/JPA |
| PyMySQL CRUD Demo | Python | Basic database operations with Python |
| SQLAlchemy CRUD Demo | Python | Python with SQLAlchemy ORM |
| Django CRUD Demo | Python | Django web framework |
| Golang CRUD Demo | Go | Go application development |
| Gorm CRUD Demo | Go | Go with Gorm ORM |
| C# CRUD Demo | C# | .NET application development |
| TypeScript CRUD Demo | TypeScript | TypeScript application development |
| Tutorial | Use Case | Related MatrixOne Features |
|---|---|---|
| Pinecone-Compatible Vector Search | AI & Search | vector search, Pinecone-compatible API |
| IVF Index Health Monitoring | AI & Search | vector search, IVF index |
| HNSW Vector Index | AI & Search | vector search, HNSW index |
| Fulltext Natural Search | AI & Search | fulltext search, natural language |
| Fulltext Boolean Search | AI & Search | fulltext search, boolean operators |
| Fulltext JSON Search | AI & Search | fulltext search, JSON data |
| Hybrid Search | AI & Search | hybrid search, vector + fulltext + SQL |
| RAG Application Demo | AI & Search | RAG, vector search, fulltext search |
| Picture(Text)-to-Picture Search | AI & Search | multimodal search, image similarity |
| Dify Integration Demo | AI & Search | AI platform integration |
| HTAP Application Demo | Performance | HTAP, real-time analytics |
| Instant Clone for Multi-Team Development | Performance | instant clone, Git for Data |
| Safe Production Upgrade with Instant Rollback | Performance | snapshot, rollback, Git for Data |
MatrixOne supports multiple installation methods. Choose the one that best fits your needs:
Run a complete distributed cluster locally with multiple CN nodes, load balancing, and easy configuration management.
```shell
# Quick start
make dev-build && make dev-up

# Connect via proxy (load balanced)
mysql -h 127.0.0.1 -P 6001 -u root -p111

# Configure a specific service (interactive editor)
make dev-edit-cn1     # Edit CN1 config
make dev-restart-cn1  # Restart only CN1 (fast!)
```

📖 Complete Development Guide → - Comprehensive guide covering standalone setup, multi-CN clusters, monitoring, metrics, configuration, and all `make dev-*` commands
One-command deployment and lifecycle management with the official mo_ctl tool. Handles installation, upgrades, backups, and health monitoring automatically.
📖 Complete mo_ctl Installation Guide →
Build MatrixOne from source for development, customization, or contributing. Requires Go 1.22, GCC/Clang, Git, and Make.
📖 Complete Build from Source Guide →
Docker standalone, Kubernetes, binary packages, and more deployment options.
MatrixOne's architecture is shown below:
For more details, check out the MatrixOne Architecture Design.
MatrixOne provides a comprehensive Python SDK for database operations, vector search, fulltext search, and advanced features like snapshots, PITR, and account management.
Key Features: High-performance async/await support, vector similarity search with IVF/HNSW indexing, fulltext search, metadata analysis, and complete type safety.
📖 Python SDK README - Full features, installation, and usage guide
📦 Installation: pip install matrixone-python-sdk
Contributions to MatrixOne are welcome from everyone.
See Contribution Guide for details on submitting patches and the contribution workflow.
MatrixOne is licensed under the Apache License, Version 2.0.