
model-compose
Declarative AI Model and Workflow Orchestrator
Stars: 73

model-compose is an open-source, declarative workflow orchestrator inspired by docker-compose. It lets you define and run AI model pipelines using simple YAML files. Effortlessly connect external AI services or run local AI models within powerful, composable workflows. Features include declarative design, multi-workflow support, modular components, flexible I/O routing, streaming mode support, and more. It supports running workflows locally or serving them remotely, Docker deployment, environment variable support, and provides a CLI interface for managing AI workflows.
README:
model-compose is an open-source, declarative workflow orchestrator inspired by docker-compose. It lets you define and run AI model pipelines using simple YAML files — no custom code required. Effortlessly connect external AI services or run local AI models, all within powerful, composable workflows.
- Declarative by Design: Define complete AI workflows using simple YAML files—no complex scripting required.
- Compose Anything: Combine multiple AI models, APIs, and tools into a single, unified pipeline.
- Built for Orchestration: Orchestrate multi-step model interactions with ease. Transform individual API calls into maintainable, end-to-end systems.
- Multi-Workflow Support: Define multiple named workflows in one project. Run them by name or set a default for quick execution.
- Modular Components: Break down logic into reusable components and jobs. Easily plug, swap, and extend them across workflows.
- Flexible I/O Routing: Connect inputs and outputs between jobs using clean, scoped variables—no glue code needed.
- Streaming Mode Support: Stream real-time outputs from models and APIs, enabling interactive applications and faster feedback loops.
- Run Locally, Serve Remotely: Execute workflows from the CLI or expose them as HTTP or MCP endpoints with an optional Web UI.
- Docker Deployment: Build and deploy your workflow controller as a Docker container for consistent and portable execution environments.
- Environment Variable Support: Easily inject secrets and configuration via .env files or environment variables to keep your YAML clean and secure.
Install from PyPI:
pip install model-compose
Or install from source:
git clone https://github.com/hanyeol/model-compose.git
cd model-compose
pip install -e .
Requires: Python 3.9 or higher
model-compose provides a docker-compose-inspired CLI to launch and manage your AI workflows.
Use the up command to launch the workflow controller, which hosts your workflows as HTTP or MCP endpoints and optionally provides a Web UI.
model-compose up
By default, this command will:
- Look for a file named model-compose.yml in the current working directory
- Automatically load environment variables from a .env file in the same directory, if it exists
- Start the workflow controller (default: http://localhost:8080)
- Optionally launch the Web UI (default: http://localhost:8081, if configured)
To run in the background (detached mode):
model-compose up -d
You can specify one or more configuration files using -f:
model-compose -f base.yml -f override.yml up
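For example, an override file might change only the settings that differ from the base file. This is a hypothetical sketch that assumes docker-compose-style merging, where later files override earlier ones:

```yaml
# override.yml (illustrative): assumes later -f files take precedence over earlier ones
controller:
  type: http-server
  port: 9090   # override the port defined in base.yml
```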
If needed, you can override or extend environment variables with:
model-compose up --env-file .env
or
model-compose up --env OPENAI_API_KEY=... --env ELEVENLABS_API_KEY=...
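A .env file is just KEY=VALUE lines; the keys below mirror the flags above, with placeholder values:

```
# .env (values are placeholders, not real keys)
OPENAI_API_KEY=your-openai-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
```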
💡 Once the controller is running, you can trigger workflows via the REST API or, if using MCP, via JSON-RPC. You can also access them through the Web UI.
To gracefully stop and remove the workflow controller and all associated services:
model-compose down
Run a workflow directly from the CLI without starting the controller:
model-compose run <workflow-name> --input '{"key": "value"}'
This is useful for testing, automation, or scripting.
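With the quick-start project shown later in this README, that might look like the following (the empty input payload is just a placeholder, since the generate-quote workflow as defined takes no inputs):

```bash
model-compose run generate-quote --input '{}'
```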
| Command | Description |
|---|---|
| model-compose up | Launch the workflow controller and load defined workflows |
| model-compose down | Gracefully stop and remove the controller and all related services |
| model-compose start | Start the controller if it has been previously configured |
| model-compose stop | Temporarily pause the currently running controller |
The model-compose.yml file is the central configuration file that defines how your workflows are composed and executed.
It includes:
- Controller: configures the HTTP/MCP server, API endpoints, and optional Web UI
- Components: reusable definitions for calling APIs, running local AI models, or executing commands
- Workflows: named sets of jobs that define the flow of data
- Jobs: steps that execute specific components, with support for inputs, outputs, and dependencies
- Listeners: optional callback listeners that handle asynchronous responses from external services
- Gateways: optional tunneling services that expose your local controller to the public internet
By default, model-compose automatically looks for a file named model-compose.yml in the current working directory when running commands like up or run.
controller:
  type: http-server
  port: 8080
  base_path: /api
  webui:
    port: 8081

components:
  - id: chatgpt
    type: http-client
    base_url: https://api.openai.com/v1
    path: /chat/completions
    method: POST
    headers:
      Authorization: Bearer ${env.OPENAI_API_KEY}
      Content-Type: application/json
    body:
      model: gpt-4o
      messages:
        - role: user
          content: "Write an inspiring quote."
    output:
      quote: ${response.choices[0].message.content}

workflows:
  - id: generate-quote
    default: true
    jobs:
      - id: get-quote
        component: chatgpt
This minimal example defines a simple workflow that calls the OpenAI ChatGPT API to generate an inspiring quote.
- The controller section starts an HTTP server on port 8080 and enables a Web UI on port 8081.
- The components section defines a reusable HTTP client named chatgpt that makes a POST request to the OpenAI Chat Completions API. It uses an environment variable OPENAI_API_KEY for authentication and extracts the quote from the API response.
- The workflows section defines a single workflow called generate-quote. It contains one job, get-quote, which uses the chatgpt component to fetch a quote from the API.
- Since default: true is set, the workflow is selected by default if no workflow name is specified during execution.
You can easily expand this example by adding more components (e.g., text-to-speech, image generation) and connecting them through additional jobs.
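As a rough sketch of that idea, the fragment below adds a hypothetical text-to-speech component and a second job. The TTS endpoint details and the exact syntax for routing one job's output into another are assumptions; they are not specified in this README, so check the examples directory for the real syntax.

```yaml
components:
  - id: chatgpt
    type: http-client
    # ... same as above ...
  - id: text-to-speech
    type: http-client
    base_url: https://api.elevenlabs.io/v1   # hypothetical TTS provider
    # ... request details for your provider ...

workflows:
  - id: quote-to-speech
    jobs:
      - id: get-quote
        component: chatgpt
      - id: speak-quote
        component: text-to-speech
        # routing get-quote's output into this job is assumed here;
        # see the project examples for the exact I/O routing syntax
```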
A listener receives asynchronous callbacks from external services. For example:

listener:
  type: http-callback
  port: 8090
  base_path: /callbacks
  callbacks:
    - path: /chat-ai
      method: POST
      item: ${body.data}
      identify_by: ${item.task_id}
      result: ${item.choices[0].message.content}
This listener sets up an HTTP callback endpoint at http://localhost:8090/callbacks/chat-ai to handle asynchronous responses from an external service that behaves like ChatGPT but supports delayed or push-based results. This is useful when integrating with services that notify results via webhook-style callbacks.
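Given the mappings above (item: ${body.data}, identify_by: ${item.task_id}, result: ${item.choices[0].message.content}), a callback payload this listener could handle might look like the following. This is purely illustrative, since the actual shape is defined by the external service:

```json
{
  "data": {
    "task_id": "abc-123",
    "choices": [
      { "message": { "content": "Believe you can and you're halfway there." } }
    ]
  }
}
```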
A gateway exposes your local controller or listeners to the public internet through a tunnel. For example:

gateway:
  type: http-tunnel
  driver: ngrok
  port: 8090
This gateway configuration exposes the local listener defined above to the public internet using an HTTP tunnel powered by ngrok. It forwards incoming traffic from a secure, public URL (e.g., https://abc123.ngrok.io) directly to your local callback endpoint at http://localhost:8090. This is essential when integrating with third-party services that need to push data back to your workflow via webhooks or asynchronous callbacks.
📁 For more example model-compose.yml configurations, check the examples directory in the source code.
model-compose optionally provides a lightweight Web UI to help you visually trigger workflows, inspect inputs and outputs, and monitor execution logs.
To enable the Web UI, simply add the webui section under your controller in the model-compose.yml file:
controller:
  type: http-server
  port: 8080
  webui:
    port: 8081
Once enabled, the Web UI will be available at http://localhost:8081.
You can fully customize the Web UI experience by specifying a different driver or serving your own frontend.
By default, model-compose uses Gradio as the interactive UI. However, if you prefer to use your own static frontend (e.g., a custom React/Vite app), you can switch to the static driver.
Here’s how you can do it:
controller:
  type: http-server
  port: 8080
  webui:
    driver: static
    static_dir: webui
    port: 8081
Your frontend should be a prebuilt static site (e.g., built with vite build, next export, or react-scripts build) and placed in the specified static_dir. Make sure index.html exists in that directory.
project/
├── model-compose.yml
├── webui/
│ ├── index.html
│ ├── assets/
│ └── ...
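One way to produce this layout from a Vite project is sketched below; the build output directory (dist) and the copy step depend on your frontend tooling, so adjust as needed:

```bash
# Build the frontend, then place its static output where static_dir points
npm run build             # e.g., runs "vite build", which emits dist/ by default
mkdir -p webui
cp -r dist/. webui/       # webui/ must end up containing index.html and assets/
```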
- Ensure the static_dir path is relative to the project root or an absolute path.
- You can use environment variables inside model-compose.yml to make this path configurable.
Support for additional drivers (e.g., dynamic) may be added in future versions.
Once configured, the Web UI will be available at http://localhost:8081.
You can also expose your workflows via the Model Context Protocol (MCP) server to enable remote execution, automation, or system integration using a lightweight JSON-RPC interface.
controller:
  type: mcp-server
  port: 8080
  base_path: /mcp

components:
  - id: chatgpt
    type: http-client
    base_url: https://api.openai.com/v1
    path: /chat/completions
    method: POST
    headers:
      Authorization: Bearer ${env.OPENAI_API_KEY}
      Content-Type: application/json
    body:
      model: gpt-4o
      messages:
        - role: user
          content: "Write an inspiring quote."
    output:
      quote: ${response.choices[0].message.content}

workflows:
  - id: generate-quote
    default: true
    jobs:
      - id: get-quote
        component: chatgpt
This configuration launches the controller as an MCP server, which listens on port 8080 and exposes your workflows over a JSON-RPC API.
Once running, you can invoke workflows remotely using a standard MCP request:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "callTool",
  "params": {
    "name": "generate-quote",
    "arguments": {}
  }
}
You can send this request via any HTTP client to:
POST http://localhost:8080/mcp
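For example, with curl (assuming the port and base_path configured above; the exact response shape depends on the MCP server):

```bash
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "callTool", "params": {"name": "generate-quote", "arguments": {}}}'
```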
You can run the workflow controller inside a Docker container by specifying the runtime option in your model-compose.yml file.
controller:
  type: http-server
  port: 8080
  runtime: docker
This configuration will launch the controller inside a lightweight Docker container automatically managed by model-compose. It uses default settings such as container name, image, and volume mappings.
You can fully configure the Docker runtime by using an object under the runtime key:
controller:
  type: http-server
  port: 8080
  runtime:
    type: docker
    image: 192.168.0.23/custom-image:latest
    container_name: my-controller
    volumes:
      - ./models:/models
      - ./cache:/cache
    ports:
      - "5000:8080"
      - "5001:8081"
    env:
      MODEL_STORAGE_PATH: /models
    command: [ "python", "-m", "mindor.cli.compose", "up" ]
    ...
This gives you full control over:
- container_name: Custom name for the container
- image: Docker image to use
- volumes: Bind mounts for sharing files between host and container
- ports: Port mappings for host ↔ container communication
- env: Environment variables to inject
- command: Override the default entrypoint
- and many more
All of these are optional, allowing you to start simple and customize only what you need.
Running the controller in Docker lets you:
- Run your controller in a clean, isolated environment
- Avoid dependency conflicts with your host Python setup
- Easily deploy your project to remote servers or CI pipelines
- Share reproducible workflows with others
We welcome all contributions! Whether it's fixing bugs, improving docs, or adding examples — every bit helps.
# Setup for development
git clone https://github.com/hanyeol/model-compose.git
cd model-compose
pip install -e .[dev]
MIT License © 2025 Hanyeol Cho.
Have questions, ideas, or feedback? Open an issue or start a discussion on GitHub Discussions.
Similar Open Source Tools

model-compose
model-compose is an open-source, declarative workflow orchestrator inspired by docker-compose. It lets you define and run AI model pipelines using simple YAML files. Effortlessly connect external AI services or run local AI models within powerful, composable workflows. Features include declarative design, multi-workflow support, modular components, flexible I/O routing, streaming mode support, and more. It supports running workflows locally or serving them remotely, Docker deployment, environment variable support, and provides a CLI interface for managing AI workflows.

action_mcp
Action MCP is a powerful tool for managing and automating your cloud infrastructure. It provides a user-friendly interface to easily create, update, and delete resources on popular cloud platforms. With Action MCP, you can streamline your deployment process, reduce manual errors, and improve overall efficiency. The tool supports various cloud providers and offers a wide range of features to meet your infrastructure management needs. Whether you are a developer, system administrator, or DevOps engineer, Action MCP can help you simplify and optimize your cloud operations.

dbt-llm-agent
dbt-llm-agent is an LLM-powered agent designed for interacting with dbt projects. It offers features such as question answering, documentation generation, agentic model interpretation, Postgres integration with pgvector, dbt model selection, question tracking, and upcoming Slack integration. The agent utilizes dbt project parsing, PostgreSQL with pgvector, model selection syntax, large language models like GPT-4, and question tracking to provide its functionalities. Users can set up the agent by checking Python version, cloning the repository, installing dependencies, setting up PostgreSQL with pgvector, configuring environment variables, and initializing the database schema. The agent can be initialized in Cloud Mode, Local Mode, or Source Code Mode to load project metadata. Once set up, users can work with model documentation, ask questions, provide feedback, list models, get detailed model information, and contribute to the project.

director
Director is a context infrastructure tool for AI agents that simplifies managing MCP servers, prompts, and configurations by packaging them into portable workspaces accessible through a single endpoint. It allows users to define context workspaces once and share them across different AI clients, enabling seamless collaboration, instant context switching, and secure isolation of untrusted servers without cloud dependencies or API keys. Director offers features like workspaces, universal portability, local-first architecture, sandboxing, smart filtering, unified OAuth, observability, multiple interfaces, and compatibility with all MCP clients and servers.

RA.Aid
RA.Aid is an AI software development agent powered by `aider` and advanced reasoning models like `o1`. It combines `aider`'s code editing capabilities with LangChain's agent-based task execution framework to provide an intelligent assistant for research, planning, and implementation of multi-step development tasks. It handles complex programming tasks by breaking them down into manageable steps, running shell commands automatically, and leveraging expert reasoning models like OpenAI's o1. RA.Aid is designed for everyday software development, offering features such as multi-step task planning, automated command execution, and the ability to handle complex programming tasks beyond single-shot code edits.

code2prompt
Code2Prompt is a powerful command-line tool that generates comprehensive prompts from codebases, designed to streamline interactions between developers and Large Language Models (LLMs) for code analysis, documentation, and improvement tasks. It bridges the gap between codebases and LLMs by converting projects into AI-friendly prompts, enabling users to leverage AI for various software development tasks. The tool offers features like holistic codebase representation, intelligent source tree generation, customizable prompt templates, smart token management, Gitignore integration, flexible file handling, clipboard-ready output, multiple output options, and enhanced code readability.

pipecat-flows
Pipecat Flows is a framework designed for building structured conversations in AI applications. It allows users to create both predefined conversation paths and dynamically generated flows, handling state management and LLM interactions. The framework includes a Python module for building conversation flows and a visual editor for designing and exporting flow configurations. Pipecat Flows is suitable for scenarios such as customer service scripts, intake forms, personalized experiences, and complex decision trees.

well-architected-iac-analyzer
Well-Architected Infrastructure as Code (IaC) Analyzer is a project demonstrating how generative AI can evaluate infrastructure code for alignment with best practices. It features a modern web application allowing users to upload IaC documents, complete IaC projects, or architecture diagrams for assessment. The tool provides insights into infrastructure code alignment with AWS best practices, offers suggestions for improving cloud architecture designs, and can generate IaC templates from architecture diagrams. Users can analyze CloudFormation, Terraform, or AWS CDK templates, architecture diagrams in PNG or JPEG format, and complete IaC projects with supporting documents. Real-time analysis against Well-Architected best practices, integration with AWS Well-Architected Tool, and export of analysis results and recommendations are included.

BuildCLI
BuildCLI is a command-line interface (CLI) tool designed for managing and automating common tasks in Java project development. It simplifies the development process by allowing users to create, compile, manage dependencies, run projects, generate documentation, manage configuration profiles, dockerize projects, integrate CI/CD tools, and generate structured changelogs. The tool aims to enhance productivity and streamline Java project management by providing a range of functionalities accessible directly from the terminal.

steel-browser
Steel is an open-source browser API designed for AI agents and applications, simplifying the process of building live web agents and browser automation tools. It serves as a core building block for a production-ready, containerized browser sandbox with features like stealth capabilities, text-to-markdown session management, UI for session viewing/debugging, and full browser control through popular automation frameworks. Steel allows users to control, run, and manage a production-ready browser environment via a REST API, offering features such as full browser control, session management, proxy support, extension support, debugging tools, anti-detection mechanisms, resource management, and various browser tools. It aims to streamline complex browsing tasks programmatically, enabling users to focus on their AI applications while Steel handles the underlying complexity.

cursor-tools
cursor-tools is a CLI tool designed to enhance AI agents with advanced skills, such as web search, repository context, documentation generation, GitHub integration, Xcode tools, and browser automation. It provides features like Perplexity for web search, Gemini 2.0 for codebase context, and Stagehand for browser operations. The tool requires API keys for Perplexity AI and Google Gemini, and supports global installation for system-wide access. It offers various commands for different tasks and integrates with Cursor Composer for AI agent usage.

openai-kotlin
OpenAI Kotlin API client is a Kotlin client for OpenAI's API with multiplatform and coroutines capabilities. It allows users to interact with OpenAI's API using Kotlin programming language. The client supports various features such as models, chat, images, embeddings, files, fine-tuning, moderations, audio, assistants, threads, messages, and runs. It also provides guides on getting started, chat & function call, file source guide, and assistants. Sample apps are available for reference, and troubleshooting guides are provided for common issues. The project is open-source and licensed under the MIT license, allowing contributions from the community.

pastemax
PasteMax is a modern file viewer application designed for developers to easily navigate, search, and copy code from repositories. It provides features such as file tree navigation, token counting, search capabilities, selection management, sorting options, dark mode, binary file detection, and smart file exclusion. Built with Electron, React, and TypeScript, PasteMax is ideal for pasting code into ChatGPT or other language models. Users can download the application or build it from source, and customize file exclusions. Troubleshooting steps are provided for common issues, and contributions to the project are welcome under the MIT License.

agenticSeek
AgenticSeek is a voice-enabled AI assistant powered by DeepSeek R1 agents, offering a fully local alternative to cloud-based AI services. It allows users to interact with their filesystem, code in multiple languages, and perform various tasks autonomously. The tool is equipped with memory to remember user preferences and past conversations, and it can divide tasks among multiple agents for efficient execution. AgenticSeek prioritizes privacy by running entirely on the user's hardware without sending data to the cloud.

fraim
Fraim is an AI-powered toolkit designed for security engineers to enhance their workflows by leveraging AI capabilities. It offers solutions to find, detect, fix, and flag vulnerabilities throughout the development lifecycle. The toolkit includes features like Risk Flagger for identifying risks in code changes, Code Security Analysis for context-aware vulnerability detection, and Infrastructure as Code Analysis for spotting misconfigurations in cloud environments. Fraim can be run as a CLI tool or integrated into Github Actions, making it a versatile solution for security teams and organizations looking to enhance their security practices with AI technology.

Fabric
Fabric is an open-source framework designed to augment humans using AI by organizing prompts by real-world tasks. It addresses the integration problem of AI by creating and organizing prompts for various tasks. Users can create, collect, and organize AI solutions in a single place for use in their favorite tools. Fabric also serves as a command-line interface for those focused on the terminal. It offers a wide range of features and capabilities, including support for multiple AI providers, internationalization, speech-to-text, AI reasoning, model management, web search, text-to-speech, desktop notifications, and more. The project aims to help humans flourish by leveraging AI technology to solve human problems and enhance creativity.
For similar tasks

model-compose
model-compose is an open-source, declarative workflow orchestrator inspired by docker-compose. It lets you define and run AI model pipelines using simple YAML files. Effortlessly connect external AI services or run local AI models within powerful, composable workflows. Features include declarative design, multi-workflow support, modular components, flexible I/O routing, streaming mode support, and more. It supports running workflows locally or serving them remotely, Docker deployment, environment variable support, and provides a CLI interface for managing AI workflows.

llama_deploy
llama_deploy is an async-first framework for deploying, scaling, and productionizing agentic multi-service systems based on workflows from llama_index. It allows building workflows in llama_index and deploying them seamlessly with minimal changes to code. The system includes services endlessly processing tasks, a control plane managing state and services, an orchestrator deciding task handling, and fault tolerance mechanisms. It is designed for high-concurrency scenarios, enabling real-time and high-throughput applications.

fastagency
FastAgency is an open-source framework designed to accelerate the transition from prototype to production for multi-agent AI workflows. It provides a unified programming interface for deploying agentic workflows written in AG2 agentic framework in both development and productional settings. With features like seamless external API integration, a Tester Class for continuous integration, and a Command-Line Interface (CLI) for orchestration, FastAgency streamlines the deployment process, saving time and effort while maintaining flexibility and performance. Whether orchestrating complex AI agents or integrating external APIs, FastAgency helps users quickly transition from concept to production, reducing development cycles and optimizing multi-agent systems.

cua
Cua is a tool for creating and running high-performance macOS and Linux virtual machines on Apple Silicon, with built-in support for AI agents. It provides libraries like Lume for running VMs with near-native performance, Computer for interacting with sandboxes, and Agent for running agentic workflows. Users can refer to the documentation for onboarding, explore demos showcasing AI-Gradio and GitHub issue fixing, and utilize accessory libraries like Core, PyLume, Computer Server, and SOM. Contributions are welcome, and the tool is open-sourced under the MIT License.
For similar jobs

ludwig
Ludwig is a declarative deep learning framework designed for scale and efficiency. It is a low-code framework that allows users to build custom AI models like LLMs and other deep neural networks with ease. Ludwig offers features such as optimized scale and efficiency, expert level control, modularity, and extensibility. It is engineered for production with prebuilt Docker containers, support for running with Ray on Kubernetes, and the ability to export models to Torchscript and Triton. Ludwig is hosted by the Linux Foundation AI & Data.

wenda
Wenda is a platform for large-scale language model invocation designed to efficiently generate content for specific environments, considering the limitations of personal and small business computing resources, as well as knowledge security and privacy issues. The platform integrates capabilities such as knowledge base integration, multiple large language models for offline deployment, auto scripts for additional functionality, and other practical capabilities like conversation history management and multi-user simultaneous usage.

LLMonFHIR
LLMonFHIR is an iOS application that utilizes large language models (LLMs) to interpret and provide context around patient data in the Fast Healthcare Interoperability Resources (FHIR) format. It connects to the OpenAI GPT API to analyze FHIR resources, supports multiple languages, and allows users to interact with their health data stored in the Apple Health app. The app aims to simplify complex health records, provide insights, and facilitate deeper understanding through a conversational interface. However, it is an experimental app for informational purposes only and should not be used as a substitute for professional medical advice. Users are advised to verify information provided by AI models and consult healthcare professionals for personalized advice.

Chinese-Mixtral-8x7B
Chinese-Mixtral-8x7B is an open-source project based on Mistral's Mixtral-8x7B model for incremental pre-training of Chinese vocabulary, aiming to advance research on MoE models in the Chinese natural language processing community. The expanded vocabulary significantly improves the model's encoding and decoding efficiency for Chinese, and the model is pre-trained incrementally on a large-scale open-source corpus, enabling it with powerful Chinese generation and comprehension capabilities. The project includes a large model with expanded Chinese vocabulary and incremental pre-training code.

AI-Horde-Worker
AI-Horde-Worker is a repository containing the original reference implementation for a worker that turns your graphics card(s) into a worker for the AI Horde. It allows users to generate or alchemize images for others. The repository provides instructions for setting up the worker on Windows and Linux, updating the worker code, running with multiple GPUs, and stopping the worker. Users can configure the worker using a WebUI to connect to the horde with their username and API key. The repository also includes information on model usage and running the Docker container with specified environment variables.

openshield
OpenShield is a firewall designed for AI models to protect against various attacks such as prompt injection, insecure output handling, training data poisoning, model denial of service, supply chain vulnerabilities, sensitive information disclosure, insecure plugin design, excessive agency granting, overreliance, and model theft. It provides rate limiting, content filtering, and keyword filtering for AI models. The tool acts as a transparent proxy between AI models and clients, allowing users to set custom rate limits for OpenAI endpoints and perform tokenizer calculations for OpenAI models. OpenShield also supports Python and LLM based rules, with upcoming features including rate limiting per user and model, prompts manager, content filtering, keyword filtering based on LLM/Vector models, OpenMeter integration, and VectorDB integration. The tool requires an OpenAI API key, Postgres, and Redis for operation.

VoAPI
VoAPI is a new high-value/high-performance AI model interface management and distribution system. It is a closed-source tool for personal learning use only, not for commercial purposes. Users must comply with upstream AI model service providers and legal regulations. The system offers a visually appealing interface, independent development documentation page support, service monitoring page configuration support, and third-party login support. It also optimizes interface elements, user registration time support, data operation button positioning, and more.

VoAPI
VoAPI is a new high-value/high-performance AI model interface management and distribution system. It is a closed-source tool for personal learning use only, not for commercial purposes. Users must comply with upstream AI model service providers and legal regulations. The system offers a visually appealing interface with features such as independent development documentation page support, service monitoring page configuration support, and third-party login support. Users can manage user registration time, optimize interface elements, and support features like online recharge, model pricing display, and sensitive word filtering. VoAPI also provides support for various AI models and platforms, with the ability to configure homepage templates, model information, and manufacturer information.