cordum
Cordum (cordum.io) is a platform-only control plane for autonomous AI agents and external workers. It uses NATS for the bus, Redis for state and payload pointers, and CAP v2 wire contracts for jobs, results, and heartbeats. Workers and product packs live outside this repo.
Cordum is a control plane for AI agents designed to close the Trust Gap by providing safety, observability, and control features. It allows teams to deploy autonomous agents with built-in governance mechanisms, including safety policies, workflow orchestration, job routing, observability, and human-in-the-loop approvals. The tool aims to address the challenges of deploying AI agents in production by offering visibility, safety rails, audit trails, and approval mechanisms for sensitive operations.
README:
AI Agent Governance Control Plane
Deploy autonomous agents with built-in safety, observability, and control.
AI agents are powerful. They're also unpredictable.
Teams deploying agents in production face the Trust Gap: the distance between what an agent can do and what you're confident letting it do unsupervised.
Without governance, you're flying blind:
- No visibility into what agents are doing
- No safety rails before dangerous actions
- No audit trail when things go wrong
- No way to require human approval for sensitive operations
Cordum is a control plane for AI agents that closes the Trust Gap.
```
┌─────────────────────────────────────────────────────────────────┐
│                             Cordum                              │
│                                                                 │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────────┐  │
│  │   API    │──▶│ Scheduler│──▶│  Safety  │──▶│ Worker Pools │  │
│  │ Gateway  │   │          │   │  Kernel  │   │              │  │
│  └──────────┘   └──────────┘   └──────────┘   └──────────────┘  │
│       │              │              │                │          │
│       ▼              ▼              ▼                ▼          │
│  [Dashboard]    [Workflows]    [Policies]      [Your Agents]    │
└─────────────────────────────────────────────────────────────────┘
```
What Cordum does:
- Safety Kernel — Policy checks (allow/deny/throttle/human-approve) before any job runs
- Workflow Engine — Orchestrate multi-step agent workflows with retries, approvals, and timeouts
- Job Routing — Distribute work across agent pools with capability-based routing
- Observability — Full audit trail, traces, and real-time dashboard
- Human-in-the-Loop — Require approval for sensitive operations
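A minimal Go sketch of the kind of decision the Safety Kernel makes before any job runs. This is illustrative only; the type names and rule ordering here are assumptions, not Cordum's actual policy engine.

```go
package main

import "fmt"

// Decision mirrors the four outcomes a safety check can return.
type Decision int

const (
	Allow Decision = iota
	Deny
	Throttle
	RequireApproval
)

type Job struct {
	Type      string
	Sensitive bool
}

// Policy is a toy rule set: a denylist of job types plus an in-flight cap.
type Policy struct {
	DeniedTypes map[string]bool
	MaxInFlight int
}

// Evaluate applies the rules in order: hard denials first, then load
// throttling, then human-approval gating; everything else is allowed.
func (p Policy) Evaluate(j Job, inFlight int) Decision {
	if p.DeniedTypes[j.Type] {
		return Deny
	}
	if inFlight >= p.MaxInFlight {
		return Throttle
	}
	if j.Sensitive {
		return RequireApproval
	}
	return Allow
}

func main() {
	p := Policy{
		DeniedTypes: map[string]bool{"job.drop_database": true},
		MaxInFlight: 10,
	}
	fmt.Println(p.Evaluate(Job{Type: "job.summarize"}, 3) == Allow)
	fmt.Println(p.Evaluate(Job{Type: "job.drop_database"}, 3) == Deny)
	fmt.Println(p.Evaluate(Job{Type: "job.deploy", Sensitive: true}, 3) == RequireApproval)
}
```

The point of the ordering is that a denial can never be bypassed by approval: a denied job type is rejected before the sensitivity check even runs.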
Prerequisites: Docker, Docker Compose, Go 1.24+

```shell
# Clone the repo
git clone https://github.com/cordum-io/cordum.git
cd cordum

# Set an API key
export CORDUM_API_KEY="$(openssl rand -hex 32)"

# Start everything (auto-generates TLS certs on first run)
go run ./cmd/cordumctl up

# Open dashboard
open http://localhost:8082
```

Or use the quickstart script:

```shell
export CORDUM_API_KEY="$(openssl rand -hex 32)"
./tools/scripts/quickstart.sh
```

That's it. You have a running Cordum instance with API, scheduler, safety kernel, dashboard, and TLS enabled by default. System configuration is auto-bootstrapped on first startup.
Cordum uses CAP (Cordum Agent Protocol) for all agent communication:
- Submit — Client submits a job via API
- Safety Check — Scheduler asks Safety Kernel: allow, deny, throttle, or require approval?
- Dispatch — Approved jobs route to the right worker pool via NATS
- Execute — Your agent runs the job (using MCP, LangChain, whatever)
- Result — Agent returns result; Cordum updates state and notifies client
```
Client ──▶ API ──▶ Scheduler ──▶ Safety Kernel ──▶ NATS ──▶ Agent Pool
                       │                                        │
                       ▼                                        ▼
                 [Redis State]                           [Your Agents]
```
Key design choices:
- Payloads stay off the bus — `context_ptr` and `result_ptr` reference Redis/S3, keeping the message bus lean
- Protocol-first — CAP is an independent spec; Cordum is the reference implementation
- Workers are external — Cordum is the control plane; your agents run wherever you want
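The pointer pattern above can be sketched in a few lines of Go. A real deployment would use Redis or S3; here a plain map stands in for the store, and the type names are illustrative, not CAP's actual message schema.

```go
package main

import "fmt"

// Store stands in for Redis/S3: large payloads live here, keyed by pointer.
type Store map[string][]byte

// JobMsg is what travels over the bus: metadata plus a pointer, never the payload.
type JobMsg struct {
	JobID      string
	Type       string
	ContextPtr string // key into the store, e.g. "ctx/job-123"
}

func main() {
	store := Store{}

	// Producer writes the large context blob to the store...
	store["ctx/job-123"] = []byte(`{"prompt":"summarize this large document..."}`)

	// ...and publishes only a small message carrying the pointer.
	msg := JobMsg{JobID: "job-123", Type: "job.summarize", ContextPtr: "ctx/job-123"}

	// Worker dereferences the pointer to fetch the payload out of band.
	payload := store[msg.ContextPtr]
	fmt.Printf("bus message carries a %d-byte pointer; payload is %d bytes\n",
		len(msg.ContextPtr), len(payload))
}
```

Because the bus only ever sees small fixed-shape messages, payload size has no effect on bus throughput; results flow back the same way via `result_ptr`.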
| Feature | Description |
|---|---|
| Safety Policies | Define rules for what agents can/can't do. Enforce before execution. |
| Output Safety | Evaluate completed job outputs and allow, redact, or quarantine unsafe results. |
| Human Approval | Flag sensitive jobs for manual review before they run. |
| Workflows | Multi-step DAGs with fan-out, retries, delays, and conditions. |
| Pool Routing | Route jobs by capability, region, or custom tags. |
| Heartbeats | Know which agents are alive, their capacity, and load. |
| Audit Trail | Every job, decision, and result logged and queryable. |
| Dashboard | Real-time UI for workflows, jobs, approvals, and policies. |
| Multi-tenant | API keys with RBAC for teams and environments. |
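Pool routing by capability, as described in the table above, reduces to a tag-matching problem. The sketch below is a simplified illustration under assumed types, not Cordum's scheduler; a real implementation would also weigh heartbeat-reported capacity and load.

```go
package main

import "fmt"

// Pool is a worker pool advertising tags (capabilities, region, custom labels).
type Pool struct {
	Name string
	Tags map[string]bool
}

// Route returns the first pool that advertises every tag the job requires.
func Route(required []string, pools []Pool) (string, bool) {
	for _, p := range pools {
		ok := true
		for _, tag := range required {
			if !p.Tags[tag] {
				ok = false
				break
			}
		}
		if ok {
			return p.Name, true
		}
	}
	return "", false
}

func main() {
	pools := []Pool{
		{Name: "gpu-us-east", Tags: map[string]bool{"gpu": true, "us-east": true}},
		{Name: "cpu-eu", Tags: map[string]bool{"cpu": true, "eu": true}},
	}
	name, ok := Route([]string{"gpu", "us-east"}, pools)
	fmt.Println(name, ok)
}
```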
cordum/
├── cmd/ # Service entrypoints + CLI
│ ├── cordum-api-gateway/ # API gateway (HTTP/WS + gRPC)
│ ├── cordum-scheduler/ # Scheduler + safety gating
│ ├── cordum-safety-kernel/ # Policy evaluation
│ ├── cordum-workflow-engine/ # Workflow orchestration
│ ├── cordum-context-engine/ # Optional context/memory service
│ └── cordumctl/ # CLI
├── core/ # Core libraries
│ ├── controlplane/ # Gateway, scheduler, safety kernel
│ ├── context/ # Context engine implementation
│ ├── infra/ # Config, storage, bus, metrics
│ ├── protocol/ # API protos + CAP aliases
│ └── workflow/ # Workflow engine
├── dashboard/ # React UI
├── sdk/ # SDK + worker runtime
├── cordum-helm/ # Helm chart
├── deploy/k8s/ # Kubernetes manifests
└── docs/ # Documentation
| Doc | Description |
|---|---|
| System Overview | Architecture and data flow |
| Core Reference | Deep technical details |
| Docker Guide | Running with Compose |
| Agent Protocol | CAP bus + pointer semantics |
| MCP Server | MCP stdio + HTTP/SSE integration |
| Pack Format | How to package agent capabilities |
| Local E2E | Full local walkthrough |
Cordum implements CAP (Cordum Agent Protocol) — an open protocol for distributed AI agent orchestration.
CAP vs MCP:
- MCP = tool-calling protocol for a single model
- CAP = job protocol for distributed agent clusters
They're complementary. Use CAP for orchestration, MCP inside your agents for tools.
Read more: MCP vs CAP: Why Your AI Agents Need Both Protocols
Cordum includes an MCP server framework with:
- Standalone stdio mode via `cmd/cordum-mcp` (for Claude Desktop/Code local integration)
- Gateway HTTP/SSE mode via `/mcp/message` and `/mcp/sse` (when `mcp.enabled=true`)

See docs/mcp-server.md for setup, auth headers, and client configuration examples.
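For the stdio mode, a Claude Desktop entry might look like the following. The binary name follows `cmd/cordum-mcp` from the repo layout; the install path and environment variable passing are assumptions, so treat docs/mcp-server.md as authoritative.

```json
{
  "mcpServers": {
    "cordum": {
      "command": "/path/to/cordum-mcp",
      "env": {
        "CORDUM_API_KEY": "<your-api-key>"
      }
    }
  }
}
```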
The Go SDK makes it easy to build CAP-compatible workers:
```go
package main

import (
	"log"

	"github.com/cordum/cordum/sdk/runtime"
)

type Input struct {
	Prompt string `json:"prompt"`
}

type Output struct {
	Summary string `json:"summary"`
}

func main() {
	agent := &runtime.Agent{Retries: 2}
	runtime.Register(agent, "job.summarize", func(ctx runtime.Context, input Input) (Output, error) {
		// Your agent logic here
		return Output{Summary: input.Prompt}, nil
	})
	if err := agent.Start(); err != nil {
		log.Fatal(err)
	}
	select {}
}
```

SDKs: Go (stable) | Python | Node
- Discord: Join the conversation
- GitHub Discussions: Ask questions
- Twitter/X: @coraboratedai
- Email: [email protected]
Cordum Enterprise adds:
- SSO/SAML integration
- Advanced RBAC
- SIEM export
- Priority support
Contact us for pricing.
Cordum follows a transparent governance model with a protocol stability pledge, maintainer structure, and clear decision-making process. See GOVERNANCE.md for details including:
- Protocol Stability: CAP v2 wire format frozen until February 2027
- Security: SECURITY.md for vulnerability reporting
- Versioning: Semantic versioning with deprecation policy
See ROADMAP.md for the full feature roadmap, completed milestones, and planned work.
See CHANGELOG.md for a detailed log of all changes by version.
We welcome contributions! See CONTRIBUTING.md for guidelines.
Licensed under Business Source License 1.1 (BUSL-1.1).
Free for self-hosted and internal use. Not permitted for competing hosted/managed offerings. See LICENSE for details and Change Date.
Ready to govern your AI agents?
cordum.io · CAP Protocol · Discord
⭐ Star this repo if Cordum helps you deploy agents safely