
redb-open
Distributed data mesh for real-time access, migration, and replication across diverse databases, built for AI, security, and scale.

reDB Node is a distributed, policy-driven data mesh platform that enables True Data Portability across various databases, warehouses, clouds, and environments. It unifies data access, data mobility, and schema transformation into one open platform. Built for developers, architects, and AI systems, reDB addresses the challenges of fragmented data ecosystems in modern enterprises by providing multi-database interoperability, automated schema versioning, zero-downtime migration, real-time developer data environments with obfuscation, quantum-resistant encryption, and policy-based access control. The project aims to build a foundation for future-proof data infrastructure.
README:
reDB is a distributed, policy-driven data mesh that enables True Data Portability across any mix of databases, warehouses, clouds, and environments. Built for developers, architects, and AI systems, reDB unifies data access, data mobility, and schema transformation into one open platform.
Modern enterprises operate in fragmented data ecosystems: multi-cloud, hybrid, on-prem, and filled with incompatible database technologies (relational, document, key-value, vector, graph, and more). These silos limit agility, complicate migrations, and throttle AI.
reDB solves this with a unified approach to enterprise data infrastructure:
- reDB Mesh: A decentralized network that securely connects all your databases (cloud, on-prem, or hybrid) without brittle pipelines or manual tunnels.
- Unified Schema Model: Translate and normalize structures across relational, NoSQL, and graph databases into a single, interoperable model.
- Zero-Downtime Migration & Replication: Replicate and migrate live data across environments reliably, securely, and in real time.
- Policy-Driven Data Obfuscation: Automatically protect sensitive data with contextual masking and privacy rules at the access layer.
- AI-Ready Access: Through the Model Context Protocol (MCP), reDB gives AI agents and tools frictionless access to data with full compliance and schema context.
This project is for:
- Data platform teams who need real-time migrations and multi-database interoperability
- AI/ML engineers who need contextual access to distributed data
- Developers who want production-like data access with built-in privacy
- Enterprises undergoing cloud transitions, database modernization, or data compliance transformations
Key capabilities that this project aims to provide:
- Multi-database interoperability across SQL, NoSQL, vector, and graph
- Automated schema versioning and transformation
- Zero-downtime, bidirectional replication
- Real-time developer data environments with obfuscation
- Quantum-resistant encryption and policy-based access control
- Distributed MCP server for scalable AI/IDE integration
We want to build the foundation for future-proof data infrastructure.
This project uses Make for building and managing the application.
- Go 1.23+
- PostgreSQL 17+
- Redis Server
- Protocol Buffers Compiler (protoc)
# Clone the repository
git clone https://github.com/redbco/redb-open.git
cd redb-open
# Install development tools
make dev-tools
# Build for local development
make local
# Run tests
make test
- make all - Clean, generate proto files, build, and test
- make build - Build all services (cross-compile for Linux by default)
- make local - Build for local development (host OS)
- make dev - Development build (clean, proto, build, test)
- make clean - Remove build artifacts
- make test - Run all tests
- make proto - Generate Protocol Buffer code
- make lint - Run linter
- make dev-tools - Install development tools
- make build-all - Build for multiple platforms (Linux/macOS, amd64/arm64)
- make install - Install binaries (Linux only)
- make version - Show version information
The build process creates binaries in the bin/ directory for local builds and the build/ directory for multi-platform builds.
# Clone and build
git clone https://github.com/redbco/redb-open.git
cd redb-open
make build
# Install PostgreSQL 17 as prerequisite
sudo apt install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt update
sudo apt -y install postgresql
# Create an admin user that the application can use for initialization
sudo -u postgres psql
CREATE USER your_admin_user WITH ENCRYPTED PASSWORD 'your_admin_password' CREATEDB CREATEROLE LOGIN;
exit
# Install Redis Server as a prerequisite
sudo apt install redis-server
# Initialize the reDB installation
./bin/redb-node --initialize
# If prompted, provide the PostgreSQL details
Enter PostgreSQL username [postgres]: your_admin_user
Enter PostgreSQL password: *your_admin_password*
Enter PostgreSQL host [localhost]:
Enter PostgreSQL port [5432]:
# Select "y" to create the default tenant and user - required for a fresh install
Would you like to create a default tenant and user? (Y/n): y
# Enter the details of the default tenant and user
Enter tenant name: tenant_name
Enter admin user email: [email protected]
Enter admin user password: *your_new_login_password*
Confirm password: *your_new_login_password*
# Initialization is complete, ready to start the application
./bin/redb-node
# The application can be run in the background as a service
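On systemd-based Linux distributions, one way to run the node in the background is a unit file along these lines. This is a sketch with assumed paths and user (`/opt/redb-open`, `redb`); adjust WorkingDirectory, ExecStart, and the dependency units to match your installation:

```ini
[Unit]
Description=reDB Node
After=network-online.target postgresql.service redis-server.service
Wants=network-online.target

[Service]
Type=simple
# Assumed install location and service user; adjust for your system.
User=redb
WorkingDirectory=/opt/redb-open
ExecStart=/opt/redb-open/bin/redb-node
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Copy the file to /etc/systemd/system/redb-node.service, then enable it with `systemctl enable --now redb-node`.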
# Logging in
redb@redb-demo:~$ ./bin/redb-cli auth login
Username (email): [email protected]
Password:
Hostname (default: localhost:8080):
Tenant URL: demo
Successfully logged in as [email protected]
Session: reDB CLI (ID: session_1752410814767386264_1nkJhJM4)
Select workspace (press Enter to skip):
No workspace selected. Use 'redb-cli select workspace <name>' to select one later.
redb@redb-demo:~$
# Creating your first workspace
redb@redb-demo:~$ ./bin/redb-cli workspaces add
Workspace Name: demo
Description (optional): reDB demo workspace
Successfully created workspace 'demo' (ID: ws_0000019803D4CBCEBCA9C6AB2D)
redb@redb-demo:~$
# Selecting the current workspace
redb@redb-demo:~$ ./bin/redb-cli select workspace demo
Selected workspace: demo (ID: ws_0000019803D4CBCEBCA9C6AB2D)
redb@redb-demo:~$
The reDB Node consists of 12 microservices orchestrated by a supervisor service, providing a comprehensive platform for managing heterogeneous database environments.
- Supervisor: Central service orchestrator managing lifecycle, health monitoring, and configuration distribution for all microservices.
- Security: Authentication and authorization hub providing JWT tokens, session management, RBAC, and multi-tenant security.
- Core: Central business logic hub managing tenants, workspaces, databases, repositories, mappings, and policies.
- Unified Model: Database abstraction layer with 16+ database adapters, schema translation, and cross-database type conversion.
- Anchor: Database connectivity service managing direct connections, schema monitoring, and data replication across all supported databases.
- Transformation: Data processing service providing internal transformation functions (e.g., formatting, hashing, encoding) and schema-aware mutations.
- Integration: Manages external integrations such as LLMs, RAG systems, and third-party processors; provides CRUD for integration definitions and an execution endpoint to invoke integrations over gRPC.
- Mesh: Distributed coordination service handling inter-node communication, consensus management, and message routing via WebSocket.
- Client API: Primary REST API providing 50+ endpoints for resource management, serving CLI and web clients.
- Webhook: Sends events to external systems via webhooks.
- MCP Server: Model Context Protocol server enabling AI/LLM integration with database resources, tools, and prompt templates.
- CLI: Command-line interface for system management, database operations, and administrative tasks.
Multi-tenant web dashboard providing comprehensive operational management across three architectural levels:
- Tenant Level: Organization-wide operations including workspace management, mesh infrastructure monitoring, user access control, and integration management (RAG, LLM, Webhooks)
- Workspace Level: Environment-specific operations including database instance monitoring, schema repository management, data relationships, job tracking, and performance analytics
- Mesh Level: Network infrastructure management including satellite nodes, anchor nodes, regional distribution, and topology visualization
Key Features:
- Operational Dashboards: Real-time monitoring with health indicators, performance metrics, and activity tracking
- Dual-Sidebar Navigation: Icon-based tenant navigation with contextual aside menus for workspace and mesh operations
- Multi-Environment Support: Production, staging, development, and analytics workspace management
- Schema Version Control: Git-like repository management for database schemas with branching and merging
- Data Relationship Monitoring: Active replication and migration tracking with performance analytics
- User Profile Management: Complete account management with security settings, preferences, and activity history
- Theme Support: Dark/light mode with system preference detection
Technology Stack: Next.js 15, React 19, TypeScript 5, Tailwind CSS
The Anchor Service supports 27+ database types across 9 paradigms through specialized adapters:
Relational - Core Characteristics: Row-based storage, normalized structure, SQL queries, ACID transactions
- PostgreSQL (postgres)
- MySQL (mysql)
- Microsoft SQL Server (mssql)
- Oracle Database (oracle)
- MariaDB (mariadb)
- IBM Db2 (db2)
- CockroachDB (cockroachdb) - Distributed SQL but fundamentally relational
- Snowflake (snowflake) - Cloud data warehouse but SQL-based relational
- DuckDB (duckdb) - Analytical but fundamentally relational with SQL
Document - Core Characteristics: Document-based storage, flexible schema, nested structures
- MongoDB (mongodb)
- Azure CosmosDB (cosmosdb) - Multi-model but primarily document-oriented
Graph - Core Characteristics: Graph-based storage, relationships as first-class citizens, traversal queries
- Neo4j (neo4j)
- EdgeDB (edgedb) - Object-relational but with graph capabilities and object modeling
Vector - Core Characteristics: High-dimensional vector storage, similarity search, embedding-focused
- Chroma (chroma)
- Milvus (milvus) - Includes Zilliz (managed milvus)
- Pinecone (pinecone)
- LanceDB (lancedb)
- Weaviate (weaviate)
Columnar - Core Characteristics: Column-oriented storage, optimized for analytical queries, time-series focus
- ClickHouse (clickhouse)
- Apache Cassandra (cassandra) - Wide-column store, partition-key based
Key-Value - Core Characteristics: Simple key-value pairs, in-memory focus, limited query capabilities
- Redis (redis)
Search - Core Characteristics: Inverted index storage, full-text search optimization, document scoring
- Elasticsearch (elasticsearch)
Wide-Column - Core Characteristics: Flexible schema, partition-based, NoSQL query patterns
- Amazon DynamoDB (dynamodb)
Object Storage - Core Characteristics: File/blob storage, hierarchical key structure, metadata-based organization
- Amazon S3 (s3)
- Google Cloud Storage (gcs)
- Azure Blob Storage (azure_blob)
- MinIO (minio)
Category | Count | Databases |
---|---|---|
RELATIONAL | 9 | postgres, mysql, mssql, oracle, mariadb, db2, cockroachdb, snowflake, duckdb |
DOCUMENT | 2 | mongodb, cosmosdb |
GRAPH | 2 | neo4j, edgedb |
VECTOR | 5 | chroma, milvus, pinecone, lancedb, weaviate |
COLUMNAR | 2 | clickhouse, cassandra |
KEY_VALUE | 1 | redis |
SEARCH | 1 | elasticsearch |
WIDE_COLUMN | 1 | dynamodb |
OBJECT_STORAGE | 4 | s3, gcs, azure_blob, minio |
Total: 27 databases across 9 paradigms
The CLI provides commands organized into functional categories:
- Authentication: auth login, auth logout, auth profile, auth status, auth password
- Tenants & Users: tenants list, tenants show, tenants add, tenants modify, tenants delete
- Workspaces & Environments: workspaces list, workspaces show, workspaces add, workspaces modify, workspaces delete
- Regions: regions list, regions show, regions add, regions modify, regions delete
- Instances: instances connect, instances list, instances show, instances modify
- Databases: databases connect, databases list, databases wipe, databases clone table-data
- Schema Management: Database schema inspection and modification
- Repositories: repos list, repos show, repos add, repos modify
- Branches: branches show, branches attach, branches detach
- Commits: commits show, schema version management
- Mappings: mappings list, mappings add table-mapping, column-to-column relationship definitions
- Relationships: Replication and migration relationship management
- Transformations: Data transformation and obfuscation functions
- Mesh Operations: mesh seed, mesh join, mesh show topology, node management
- Satellites & Anchors: Specialized node type management
- Routes: Network topology and routing configuration
- MCP Servers: Model Context Protocol server management
- MCP Resources: AI-accessible data resource configuration
- MCP Tools: AI tool and function definitions
The pkg/ directory contains reusable components shared across all microservices:
- pkg/config/ - Centralized configuration management and validation
- pkg/database/ - Database connection utilities (PostgreSQL, Redis)
- pkg/encryption/ - Cryptographic operations and secure key management
- pkg/grpc/ - gRPC client/server utilities and middleware
- pkg/health/ - Health check framework and service monitoring
- pkg/keyring/ - Secure key storage and cryptographic key management
- pkg/logger/ - Structured logging framework used across all services
- pkg/models/ - Common data models and shared structures
- pkg/service/ - BaseService framework for standardized microservice lifecycle
- pkg/syslog/ - System logging integration and configuration
redb-open/
├── cmd/                  # Command-line applications
│   ├── cli/              # CLI client (200+ commands)
│   └── supervisor/       # Service orchestrator
├── services/             # Core microservices
│   ├── anchor/           # Database connectivity (16+ adapters)
│   ├── clientapi/        # Primary REST API (50+ endpoints)
│   ├── core/             # Central business logic hub
│   ├── mcpserver/        # AI/LLM integration (MCP protocol)
│   ├── mesh/             # Distributed coordination and consensus
│   ├── queryapi/         # Database query execution interface
│   ├── security/         # Authentication and authorization
│   ├── serviceapi/       # Administrative and service management
│   ├── transformation/   # Internal data processing (no external integrations)
│   ├── integration/      # External integrations (LLMs, RAG, custom)
│   ├── unifiedmodel/     # Database abstraction and schema translation
│   └── webhook/          # External system integration
├── pkg/                  # Shared libraries and utilities
│   ├── config/           # Configuration management
│   ├── database/         # Database connection utilities
│   ├── encryption/       # Cryptographic operations
│   ├── grpc/             # gRPC client/server utilities
│   ├── health/           # Health monitoring framework
│   ├── keyring/          # Secure key management
│   ├── logger/           # Structured logging
│   ├── models/           # Common data models
│   ├── service/          # BaseService lifecycle framework
│   └── syslog/           # System logging integration
├── api/proto/            # Protocol Buffer definitions
└── scripts/              # Database schemas and deployment
All services use a standardized gRPC communication pattern eliminating port conflicts and ensuring proper service registration through the BaseService framework.
The BaseService framework (pkg/service/) provides standardized service initialization, health monitoring, graceful shutdown, and dependency management across all services.
The Unified Model Service provides a common interface for multiple database types, enabling cross-database operations, schema translation, and type conversion without vendor lock-in.
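As a toy illustration of what cross-database type conversion involves, the sketch below maps column types from one dialect to another via a lookup table. The mapping table, function name, and type choices are invented for this example; they are not reDB's unified model:

```python
# Invented source->target type mappings for illustration only.
TYPE_MAP = {
    ("postgres", "mongodb"): {
        "integer": "int",
        "text": "string",
        "jsonb": "object",
        "timestamp": "date",
    },
}

def translate_column(col, source, target):
    """Translate one column definition from a source to a target database dialect."""
    mapping = TYPE_MAP[(source, target)]
    # Fall back to "string" for types with no direct equivalent.
    return {"name": col["name"], "type": mapping.get(col["type"], "string")}

schema = [{"name": "id", "type": "integer"}, {"name": "profile", "type": "jsonb"}]
print([translate_column(c, "postgres", "mongodb") for c in schema])
```

A production translator additionally has to handle nested structures, constraints, and lossy conversions, which is what makes a unified schema model non-trivial.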
The Mesh Service enables multi-node deployments with peer-to-peer communication, consensus algorithms, and distributed state synchronization for high availability.
All requests are authenticated through the Security Service using JWT tokens, session management, and role-based access control (RBAC) with multi-tenant isolation.
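To illustrate what JWT bearer validation involves, here is a self-contained HS256 sign/verify sketch. This is not reDB's implementation, and real deployments should use a vetted JWT library rather than hand-rolled code:

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(claims: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_jwt(token: str, secret: bytes) -> dict:
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    # Constant-time comparison to avoid timing side channels.
    if not hmac.compare_digest(expected, sig_b64):
        raise ValueError("bad signature")
    pad = "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64 + pad))
    if claims.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return claims

secret = b"demo-secret"
token = sign_jwt({"sub": "[email protected]", "tenant": "demo", "exp": time.time() + 3600}, secret)
print(verify_jwt(token, secret)["sub"])
```

On top of signature and expiry checks, an RBAC layer then maps the verified claims (subject, tenant, roles) to permitted operations.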
The Webhook Service provides reliable event delivery to external systems with retry logic and delivery guarantees for real-time notifications.
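Retry logic with delivery guarantees usually means an at-least-once loop with backoff. A minimal sketch of that pattern, with an invented simulated endpoint (this is not the Webhook Service's actual code):

```python
import time

def deliver_with_retry(send, payload, max_attempts=4, base_delay=0.01):
    """At-least-once delivery loop with exponential backoff (illustrative)."""
    for attempt in range(max_attempts):
        try:
            return send(payload)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky endpoint: fails twice, then accepts the event.
calls = {"n": 0}
def flaky_endpoint(payload):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("endpoint unavailable")
    return "delivered"

print(deliver_with_retry(flaky_endpoint, {"event": "schema.changed"}))
```

Because delivery is at-least-once, receivers should treat webhook events as potentially duplicated and deduplicate by event ID.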
Built-in Model Context Protocol (MCP) server enables seamless AI/LLM integration with database resources, tools, and prompt templates.
We welcome contributions from the open source community! Please see our Contributing Guidelines for details on how to:
- Set up your development environment
- Submit bug reports and feature requests
- Contribute code and documentation
- Follow our coding standards
- Participate in our community
This project is currently in Phase 1: Single Maintainer governance. This means:
- 1 approval required for pull requests
- Basic CI/CD checks (build, test, lint, security)
- Maintainer bypass available for emergencies
- Simple CODEOWNERS structure
As the community grows, governance will evolve through phases. See CONTRIBUTING.md for the complete governance evolution plan.
This project is dual-licensed:
- Open Source: GNU Affero General Public License v3.0 (AGPL-3.0)
- Commercial: Available under a commercial license for proprietary use
The AGPL-3.0 license requires that:
- Any modifications to the software must be made available to users
- If you run a modified version on a server and let other users communicate with it there, you must make the modified source code available to them
- The source code must be accessible to all users who interact with the software over a network
For commercial licensing options, please see LICENSE-COMMERCIAL.md or contact us directly.
- Install Prerequisites: Go 1.23+, PostgreSQL 17, Redis Server
- Build the System: make local or make build
- Initialize: ./bin/redb-node --initialize
- Start Services: ./bin/redb-node
- Access CLI: ./bin/redb-cli auth login
For detailed installation instructions, see the Installation Instructions section above.
reDB Node provides a comprehensive open source platform for managing heterogeneous database environments with advanced features including schema version control, cross-database replication, data transformation pipelines, distributed mesh networking, and AI-powered database operations.
- Documentation: Project Wiki
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Community: Discord (https://discord.gg/K3UkDYXG77)
reDB Node is an open source project maintained by the community. We believe in the power of open source to drive innovation in database management and distributed systems.
This repository contains reference architectures and test cases for distributed model training with Amazon SageMaker Hyperpod, AWS ParallelCluster, AWS Batch, and Amazon EKS. The test cases cover different types and sizes of models as well as different frameworks and parallel optimizations (Pytorch DDP/FSDP, MegatronLM, NemoMegatron...).