model-compose
Declarative AI Model and Workflow Orchestrator
model-compose is an open-source, declarative workflow orchestrator inspired by docker-compose. It lets you define and run AI model pipelines using simple YAML files — no custom code required. Effortlessly connect external AI services or run local AI models, all within powerful, composable workflows.
- Declarative by Design: Define complete AI workflows using simple YAML files—no complex scripting required.
 - Compose Anything: Combine multiple AI models, APIs, and tools into a single, unified pipeline.
 - Built for Orchestration: Orchestrate multi-step model interactions with ease. Transform individual API calls into maintainable, end-to-end systems.
 - Multi-Workflow Support: Define multiple named workflows in one project. Run them by name or set a default for quick execution.
 - Modular Components: Break down logic into reusable components and jobs. Easily plug, swap, and extend them across workflows.
 - Flexible I/O Routing: Connect inputs and outputs between jobs using clean, scoped variables—no glue code needed.
 - Streaming Mode Support: Stream real-time outputs from models and APIs, enabling interactive applications and faster feedback loops.
 - Run Locally, Serve Remotely: Execute workflows from the CLI or expose them as HTTP or MCP endpoints with an optional Web UI.
 - Docker Deployment: Build and deploy your workflow controller as a Docker container for consistent and portable execution environments.
 - Environment Variable Support: Easily inject secrets and configuration via .env files or environment variables to keep your YAML clean and secure.
Install with pip:
pip install model-compose
Or install from source:
git clone https://github.com/hanyeol/model-compose.git
cd model-compose
pip install -e .
Requires: Python 3.9 or higher
model-compose provides a command-line interface to launch and manage your AI workflows, inspired by docker-compose.
Use the up command to launch the workflow controller, which hosts your workflows as HTTP or MCP endpoints and optionally provides a Web UI.
model-compose up
By default, this command will:
- Look for a file named model-compose.yml in the current working directory
- Automatically load environment variables from a .env file in the same directory, if it exists
- Start the workflow controller (default: http://localhost:8080)
- Optionally launch the Web UI (default: http://localhost:8081, if configured)
To run in the background (detached mode):
model-compose up -d
You can specify one or more configuration files using -f:
model-compose -f base.yml -f override.yml up
If needed, you can override or extend environment variables with:
model-compose up --env-file .env
or
model-compose up --env OPENAI_API_KEY=... --env ELEVENLABS_API_KEY=...
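For example, a minimal .env file placed next to your model-compose.yml might contain placeholder entries like these (replace the values with your actual keys):
OPENAI_API_KEY=your-openai-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key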
💡 Once the controller is running, you can trigger workflows via the REST API or, if using MCP, via JSON-RPC. You can also access them through the Web UI.
To gracefully stop and remove the workflow controller and all associated services:
model-compose down
Run a workflow directly from the CLI without starting the controller:
model-compose run <workflow-name> --input '{"key": "value"}'
This is useful for testing, automation, or scripting.
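For example, assuming the generate-quote workflow from the configuration shown later in this README (whose prompt is hard-coded in the component body, so no inputs are needed), a direct run might look like:
model-compose run generate-quote --input '{}'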
| Command | Description |
|---|---|
| model-compose up | Launch the workflow controller and load defined workflows |
| model-compose down | Gracefully stop and remove the controller and all related services |
| model-compose start | Start the controller if it has been previously configured |
| model-compose stop | Temporarily pause the currently running controller |
The model-compose.yml file is the central configuration file that defines how your workflows are composed and executed.
It includes:
- Controller: configures the HTTP/MCP server, API endpoints, and optional Web UI
 - Components: reusable definitions for calling APIs, running local AI models, or executing commands
 - Workflows: named sets of jobs that define the flow of data
 - Jobs: steps that execute specific components, with support for inputs, outputs, and dependencies
 - Listeners: optional callback listeners that handle asynchronous responses from external services
 - Gateways: optional tunneling services that expose your local controller to the public internet
 
By default, model-compose automatically looks for a file named model-compose.yml in the current working directory when running commands like up or run.
controller:
  type: http-server
  port: 8080
  base_path: /api
  webui:
    port: 8081
components:
  - id: chatgpt
    type: http-client
    base_url: https://api.openai.com/v1
    path: /chat/completions
    method: POST
    headers:
      Authorization: Bearer ${env.OPENAI_API_KEY}
      Content-Type: application/json
    body:
      model: gpt-4o
      messages:
        - role: user
          content: "Write an inspiring quote."
    output:
      quote: ${response.choices[0].message.content}
workflows:
  - id: generate-quote
    default: true
    jobs:
      - id: get-quote
        component: chatgpt
This minimal example defines a simple workflow that calls the OpenAI ChatGPT API to generate an inspiring quote.
- The controller section starts an HTTP server on port 8080 and enables a Web UI on port 8081.
- The components section defines a reusable HTTP client named chatgpt that makes a POST request to the OpenAI Chat Completions API. It uses the OPENAI_API_KEY environment variable for authentication and extracts the quote from the API response.
- The workflows section defines a single workflow called generate-quote. It contains one job, get-quote, which uses the chatgpt component to fetch a quote from the API.
- Since default: true is set, this workflow runs by default when no workflow name is specified during execution.
You can easily expand this example by adding more components (e.g., text-to-speech, image generation) and connecting them through additional jobs.
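A rough sketch of such an expansion appears below. This is illustrative only: the text-to-speech service URL is a placeholder, TTS_API_KEY is a hypothetical environment variable, and the ${jobs....} expression used to route one job's output into the next job's input is an assumed notation; consult the examples directory for the actual I/O routing syntax.
components:
  - id: chatgpt
    # ... same component as in the example above ...
  - id: text-to-speech
    type: http-client
    base_url: https://api.example.com        # placeholder TTS service, not a real endpoint
    path: /v1/text-to-speech
    method: POST
    headers:
      Authorization: Bearer ${env.TTS_API_KEY}   # hypothetical variable name
      Content-Type: application/json
    body:
      text: ${input.text}
    output:
      audio: ${response.audio_url}               # assumed response field
workflows:
  - id: quote-to-speech
    default: true
    jobs:
      - id: get-quote
        component: chatgpt
      - id: speak-quote
        component: text-to-speech
        input:
          text: ${jobs.get-quote.output.quote}   # hypothetical job-output reference
model-compose can also handle asynchronous responses from external services through a listener, configured as follows: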
listener:
  type: http-callback
  port: 8090
  base_path: /callbacks
  callbacks:
    - path: /chat-ai
      method: POST
      item: ${body.data}
      identify_by: ${item.task_id}
      result: ${item.choices[0].message.content}
This listener sets up an HTTP callback endpoint at http://localhost:8090/callbacks/chat-ai to handle asynchronous responses from an external service that behaves like ChatGPT but supports delayed or push-based results. This is useful when integrating with services that notify results via webhook-style callbacks.
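Based on the field paths in this listener (item: ${body.data}, identify_by: ${item.task_id}, result: ${item.choices[0].message.content}), the external service would be expected to POST a payload shaped roughly like the one below; the concrete values are purely illustrative:
curl -X POST http://localhost:8090/callbacks/chat-ai \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "task_id": "abc-123",
      "choices": [
        { "message": { "content": "Believe you can and you are halfway there." } }
      ]
    }
  }'
To make this callback endpoint reachable from the public internet, you can place a gateway in front of it: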
gateway:
  type: http-tunnel
  driver: ngrok
  port: 8090
This gateway configuration exposes the local listener defined above to the public internet using an HTTP tunnel powered by ngrok. It forwards incoming traffic from a secure, public URL (e.g., https://abc123.ngrok.io) directly to your local callback endpoint at http://localhost:8090. This is essential when integrating with third-party services that need to push data back to your workflow via webhooks or asynchronous callbacks.
📁 For more example model-compose.yml configurations, check the examples directory in the source code.
model-compose optionally provides a lightweight Web UI to help you visually trigger workflows, inspect inputs and outputs, and monitor execution logs.
To enable the Web UI, simply add the webui section under your controller in the model-compose.yml file:
controller:
  type: http-server
  port: 8080
  webui:
    port: 8081
Once enabled, the Web UI will be available at:
http://localhost:8081
You can fully customize the Web UI experience by specifying a different driver or serving your own frontend.
By default, model-compose uses Gradio as the interactive UI. However, if you prefer to use your own static frontend (e.g., a custom React/Vite app), you can switch to the static driver.
Here’s how you can do it:
controller:
  type: http-server
  port: 8080
  webui:
    driver: static
    static_dir: webui
    port: 8081
Your frontend should be a prebuilt static site (e.g., using vite build, next export, or react-scripts build) and placed in the specified static_dir.
Make sure index.html exists in that directory.
project/
├── model-compose.yml
├── webui/
│   ├── index.html
│   ├── assets/
│   └── ...
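One possible way to produce that layout, assuming a Vite-based frontend whose build output lands in dist/, is to build the app and copy the result into the configured static_dir:
npm run build            # for a Vite project this runs vite build and writes the site to dist/
mkdir -p webui
cp -r dist/* webui/      # copy the built assets, including index.html, into static_dir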
- Ensure the static_dir path is relative to the project root or an absolute path.
- You can use environment variables inside model-compose.yml to make this path configurable, as sketched below.
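For instance, since ${env.*} references appear elsewhere in model-compose.yml (as in the Authorization header above), the static directory could plausibly be made configurable like this; WEBUI_DIR is a hypothetical variable name, and whether static_dir accepts such references should be verified against the examples directory:
controller:
  type: http-server
  port: 8080
  webui:
    driver: static
    static_dir: ${env.WEBUI_DIR}   # hypothetical env var pointing at your built frontend
    port: 8081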
Support for additional drivers (e.g., dynamic) may be added in future versions.
Once configured, the Web UI will be available at:
http://localhost:8081
You can also expose your workflows via the Model Context Protocol (MCP) server to enable remote execution, automation, or system integration using a lightweight JSON-RPC interface.
controller:
  type: mcp-server
  port: 8080
  base_path: /mcp
components:
  - id: chatgpt
    type: http-client
    base_url: https://api.openai.com/v1
    path: /chat/completions
    method: POST
    headers:
      Authorization: Bearer ${env.OPENAI_API_KEY}
      Content-Type: application/json
    body:
      model: gpt-4o
      messages:
        - role: user
          content: "Write an inspiring quote."
    output:
      quote: ${response.choices[0].message.content}
workflows:
  - id: generate-quote
    default: true
    jobs:
      - id: get-quote
        component: chatgpt
This configuration launches the controller as an MCP server, which listens on port 8080 and exposes your workflows over a JSON-RPC API.
Once running, you can invoke workflows remotely using a standard MCP request:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "callTool",
  "params": {
    "name": "generate-quote",
    "arguments": {}
  }
}
You can send this request via any HTTP client to:
POST http://localhost:8080/mcp
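For example, using curl (Content-Type: application/json is the standard header for JSON-RPC over HTTP):
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "callTool",
    "params": {
      "name": "generate-quote",
      "arguments": {}
    }
  }'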
You can run the workflow controller inside a Docker container by specifying the runtime option in your model-compose.yml file.
controller:
  type: http-server
  port: 8080
  runtime: docker
This configuration will launch the controller inside a lightweight Docker container automatically managed by model-compose. It uses default settings such as container name, image, and volume mappings.
You can fully configure the Docker runtime by using an object under the runtime key:
controller:
  type: http-server
  port: 8080
  runtime:
    type: docker
    image: 192.168.0.23/custom-image:latest
    container_name: my-controller
    volumes:
      - ./models:/models
      - ./cache:/cache
    ports:
      - "5000:8080"
      - "5001:8081"
    env:
      MODEL_STORAGE_PATH: /models
    command: [ "python", "-m", "mindor.cli.compose", "up" ]
    ...
This gives you full control over:
- container_name: Custom name for the container
 - image: Docker image to use
 - volumes: Bind mounts for sharing files between host and container
 - ports: Port mappings for host ↔ container communication
 - env: Environment variables to inject
 - command: Override the default entrypoint
 - and many more
 
All of these are optional, allowing you to start simple and customize only what you need.
Running the controller with the Docker runtime brings several benefits:
- Run your controller in a clean, isolated environment
 - Avoid dependency conflicts with your host Python setup
 - Easily deploy your project to remote servers or CI pipelines
 - Share reproducible workflows with others
 
We welcome all contributions! Whether it's fixing bugs, improving docs, or adding examples — every bit helps.
# Setup for development
git clone https://github.com/hanyeol/model-compose.git
cd model-compose
pip install -e .[dev]
MIT License © 2025 Hanyeol Cho.
Have questions, ideas, or feedback? Open an issue or start a discussion on GitHub Discussions.