Pixelle-MCP

An Open-Source Multimodal AIGC Solution based on ComfyUI + MCP + LLM https://pixelle.ai

Stars: 615


Pixelle-MCP is an open-source multimodal AIGC framework that converts ComfyUI workflows into MCP (Model Context Protocol) tools without writing code, so that LLM clients can call them. It bundles an MCP server, a Chainlit-based web chat interface, and a file service in a single application, and it supports text, image, sound/speech, and video generation. It can be installed from PyPI, run through its CLI, or deployed with Docker, and it works with multiple LLM providers, including OpenAI, Ollama, Gemini, DeepSeek, Claude, and Qwen.

README:

🎨 Pixelle MCP - Omnimodal Agent Framework


✨ An AIGC solution based on the MCP protocol, seamlessly converting ComfyUI workflows into MCP tools with zero code, empowering LLM and ComfyUI integration.

https://github.com/user-attachments/assets/65422cef-96f9-44fe-a82b-6a124674c417

📋 Recent Updates

✅ 2025-09-03: Refactored the architecture from three services into a unified application; added CLI tool support; published to PyPI
✅ 2025-08-12: Integrated the LiteLLM framework, adding multi-model support for Gemini, DeepSeek, Claude, Qwen, and more

🚀 Features

✅ 🔄 Full-modal Support: Supports TISV (Text, Image, Sound/Speech, Video) full-modal conversion and generation
✅ 🧩 ComfyUI Ecosystem: Built on ComfyUI, inheriting all capabilities from the open ComfyUI ecosystem
✅ 🔧 Zero-code Development: Defines and implements the Workflow-as-MCP Tool solution, enabling zero-code development and dynamic addition of new MCP Tools
✅ 🗄️ MCP Server: Based on the MCP protocol, supporting integration with any MCP client (including but not limited to Cursor, Claude Desktop, etc.)
✅ 🌐 Web Interface: Developed based on the Chainlit framework, inheriting Chainlit's UI controls and supporting integration with more MCP Servers
✅ 📦 One-click Deployment: Supports PyPI installation, CLI commands, Docker and other deployment methods, ready to use out of the box
✅ ⚙️ Simplified Configuration: Configured through environment variables, keeping setup simple and intuitive
✅ 🤖 Multi-LLM Support: Supports multiple mainstream LLMs, including OpenAI, Ollama, Gemini, DeepSeek, Claude, Qwen, and more

📁 Project Architecture

Pixelle MCP adopts a unified architecture design, integrating MCP server, web interface, and file services into one application, providing:

🌐 Web Interface: Chainlit-based chat interface supporting multimodal interaction
🔌 MCP Endpoint: For external MCP clients (such as Cursor, Claude Desktop) to connect
📁 File Service: Handles file upload, download, and storage
🛠️ Workflow Engine: Automatically converts ComfyUI workflows into MCP tools

🏃‍♂️ Quick Start

Choose the deployment method that best suits your needs, from simple to complex:

🎯 Method 1: One-click Experience

💡 Zero configuration startup, perfect for quick experience and testing

🚀 Temporary Run

# Start with one command, no system installation required
uvx pixelle@latest

📚 View uvx CLI Reference →

📦 Persistent Installation

# Install to system
pip install -U pixelle

# Start service
pixelle

📚 View pip CLI Reference →

After startup, the configuration wizard launches automatically and guides you through the ComfyUI connection and LLM configuration.

🛠️ Method 2: Local Development Deployment

💡 Supports custom workflows and secondary development

📥 1. Get Source Code

git clone https://github.com/AIDC-AI/Pixelle-MCP.git
cd Pixelle-MCP

🚀 2. Start Service

# Interactive mode (recommended)
uv run pixelle

📚 View Complete CLI Reference →

🔧 3. Add Custom Workflows (Optional)

# Copy example workflows to data directory (run this in your desired project directory)
cp -r workflows/* ./data/custom_workflows/

⚠️ Important: Make sure to test workflows in ComfyUI first to ensure they run properly, otherwise execution will fail.

🐳 Method 3: Docker Deployment

💡 Suitable for production environments and containerized deployment

📋 1. Prepare Configuration

git clone https://github.com/AIDC-AI/Pixelle-MCP.git
cd Pixelle-MCP

# Create environment configuration file
cp .env.example .env
# Edit .env file to configure your ComfyUI address and LLM settings

🚀 2. Start Container

# Start all services in background
docker compose up -d

# View logs
docker compose logs -f

🌐 Access Services

Whichever method you use, the services are available at the following addresses after startup:

🌐 Web Interface: http://localhost:9004
The default username and password are both dev; they can be changed after startup
🔌 MCP Endpoint: http://localhost:9004/pixelle/mcp
MCP clients such as Cursor and Claude Desktop connect to this address

💡 Port Configuration: Default port is 9004, can be customized via environment variable PORT=your_port.
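As a rough illustration of how the PORT variable affects the two addresses above, the following sketch derives both URLs from an environment mapping. The function name `service_urls` is hypothetical and not part of Pixelle MCP; only the default port 9004 and the `/pixelle/mcp` path come from this README.

```python
import os

DEFAULT_PORT = 9004  # default port stated in this README


def service_urls(env=None):
    """Return the web UI and MCP endpoint URLs for a local instance.

    `env` is any mapping of environment variables; it defaults to os.environ.
    This helper is illustrative only and is not part of the pixelle package.
    """
    env = os.environ if env is None else env
    port = int(env.get("PORT", DEFAULT_PORT))
    base = f"http://localhost:{port}"
    return {"web": base, "mcp": f"{base}/pixelle/mcp"}


# With no PORT set, the defaults from this README apply.
default_urls = service_urls({})

# With PORT=8080, both addresses move to the new port.
custom_urls = service_urls({"PORT": "8080"})
```

The MCP URL is the value an MCP client would be pointed at; how each client stores that address depends on the client.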

⚙️ Initial Configuration

On first startup, the system will automatically detect configuration status:

🔧 ComfyUI Connection: Ensure ComfyUI service is running at http://localhost:8188
🤖 LLM Configuration: Configure at least one LLM provider (OpenAI, Ollama, etc.)
📁 Workflow Directory: System will automatically create necessary directory structure

🆘 Need Help? Join community groups for support (see Community section below)

🛠️ Add Your Own MCP Tool

⚡ One workflow = One MCP Tool

🎯 1. Add the Simplest MCP Tool

📝 Build a workflow in ComfyUI for image Gaussian blur (Get it here), then set the LoadImage node's title to $image.image!
📤 Export it as an API format file and rename it to i_blur.json. You can export it yourself or use our pre-exported version (Get it here)
📋 Copy the exported workflow file (it must be in API format), submit it in the web interface, and ask the LLM to add it as a Tool
✨ After you send it, the LLM automatically converts the workflow into an MCP Tool
🎨 Now refresh the page and send any image to have the LLM apply Gaussian blur to it
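To give a feel for what an API format file looks like, here is a hand-written sketch of a three-node blur workflow. It is not the project's pre-exported i_blur.json: the node ids, the `ImageBlur` settings, the input file name, and the helper `tool_name_from_file` are all assumptions made for illustration. The rule that the tool name is the file name without its extension is likewise an assumed reading of the naming note in this README.

```python
import json
from pathlib import Path

# Hypothetical workflow in ComfyUI API format: a mapping from node id to
# node definition. Values below are illustrative, not the official example.
workflow = {
    "1": {
        "class_type": "LoadImage",
        "inputs": {"image": "example.png"},
        # The node title carries the parameter definition.
        "_meta": {"title": "$image.image!"},
    },
    "2": {
        "class_type": "ImageBlur",
        # ["1", 0] is a link to output 0 of node "1".
        "inputs": {"image": ["1", 0], "blur_radius": 5, "sigma": 1.0},
        "_meta": {"title": "Image Blur"},
    },
    "3": {
        "class_type": "SaveImage",
        "inputs": {"images": ["2", 0], "filename_prefix": "blur"},
        "_meta": {"title": "Save Image"},
    },
}


def tool_name_from_file(path):
    """Assumed rule: the tool name is the file name without its extension."""
    return Path(path).stem


# Text that would be saved as i_blur.json.
serialized = json.dumps(workflow, indent=2)
tool_name = tool_name_from_file("i_blur.json")
```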

🔌 2. Add a Complex MCP Tool

The steps are the same as above, only the workflow part differs (Download workflow: UI format and API format)

🔧 ComfyUI Workflow Custom Specification

🎨 Workflow Format

The system supports ComfyUI workflows. Just design your workflow in the canvas and export it as API format. Use special syntax in node titles to define parameters and outputs.

📝 Parameter Definition Specification

In the ComfyUI canvas, double-click the node title to edit, and use the following DSL syntax to define parameters:

$<param_name>.[~]<field_name>[!][:<description>]

🔍 Syntax Explanation:

param_name: The parameter name for the generated MCP tool function
~: Optional. Marks the value as a URL that the system downloads and uploads to ComfyUI, returning the relative path
field_name: The corresponding input field in the node
!: Indicates this parameter is required
description: Description of the parameter

💡 Example:

Required parameter example:

Set LoadImage node title to: $image.image!:Input image URL
Meaning: Creates a required parameter named image, mapped to the node's image field

URL upload processing example:

Set any node title to: $image.~image!:Input image URL
Meaning: Creates a required parameter named image; the system automatically downloads the URL, uploads the file to ComfyUI, and returns the relative path

📝 Note: LoadImage, VHS_LoadAudioUpload, VHS_LoadVideo, and similar nodes handle this automatically, so the ~ marker is not needed

Optional parameter example:

Set EmptyLatentImage node title to: $width.width:Image width, default 512
Meaning: Creates an optional parameter named width, mapped to the node's width field, default value is 512
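The title syntax above can be read with a single regular expression. The sketch below is one possible parser, not the project's actual implementation; `ParamSpec`, `TITLE_PATTERN`, and `parse_title` are hypothetical names, and the pattern assumes parameter and field names consist of word characters only.

```python
import re
from dataclasses import dataclass
from typing import Optional

# $<param_name>.[~]<field_name>[!][:<description>]
TITLE_PATTERN = re.compile(
    r"^\$(?P<name>\w+)\."      # "$" + parameter name + "."
    r"(?P<upload>~)?"          # optional URL-upload marker
    r"(?P<field>\w+)"          # node input field
    r"(?P<required>!)?"        # optional required marker
    r"(?::(?P<description>.*))?$"  # optional ":" + description
)


@dataclass
class ParamSpec:
    name: str
    field: str
    required: bool
    url_upload: bool
    description: str


def parse_title(title: str) -> Optional[ParamSpec]:
    """Parse a node title; return None if it does not follow the syntax.

    Output markers such as "$output.result" share the "$" prefix, so a real
    parser would need to handle them separately before calling this.
    """
    match = TITLE_PATTERN.match(title.strip())
    if match is None:
        return None
    return ParamSpec(
        name=match["name"],
        field=match["field"],
        required=match["required"] is not None,
        url_upload=match["upload"] is not None,
        description=(match["description"] or "").strip(),
    )


required_spec = parse_title("$image.image!:Input image URL")
upload_spec = parse_title("$image.~image!:Input image URL")
optional_spec = parse_title("$width.width:Image width, default 512")
```

Ordinary titles such as `Save Image` do not match the pattern and yield `None`, which mirrors the idea that only specially titled nodes become parameters.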

🎯 Type Inference Rules

The system automatically infers parameter types based on the current value of the node field:

🔢 int: Integer values (e.g. 512, 1024)
📊 float: Floating-point values (e.g. 1.5, 3.14)
✅ bool: Boolean values (e.g. true, false)
📝 str: String values (default type)
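The inference rules above map naturally onto the Python types produced when a workflow JSON file is loaded. The following is a minimal sketch under that assumption; `infer_param_type` is a hypothetical helper and the sample inputs are invented.

```python
import json


def infer_param_type(value):
    """Map a node field's current value to one of: int, float, bool, str."""
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    # Everything else falls back to the default type.
    return "str"


# Invented node inputs, as they would appear after json.loads().
inputs = json.loads('{"width": 512, "cfg": 1.5, "tiled": true, "text": "a cat"}')
inferred = {field: infer_param_type(value) for field, value in inputs.items()}
```

Checking `bool` before `int` matters here: without that ordering, `true` would be reported as an integer.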

📤 Output Definition Specification

🤖 Method 1: Auto-detect Output Nodes

The system will automatically detect the following common output nodes:

🖼️ SaveImage - Image save node
🎬 SaveVideo - Video save node
🔊 SaveAudio - Audio save node
📹 VHS_SaveVideo - VHS video save node
🎵 VHS_SaveAudio - VHS audio save node

🎯 Method 2: Manual Output Marking

Typically used when a workflow has multiple outputs. Use $output.var_name in any node title to mark an output:

Set node title to: $output.result
The system will use this node's output as the tool's return value
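Both output methods can be sketched as a single pass over an API format workflow. This is an illustration rather than the project's code: `find_outputs` is a hypothetical function, it assumes node titles are stored under `_meta.title`, and it simply reports both kinds of output without deciding how the real system combines them.

```python
# Class names listed under "Auto-detect Output Nodes" in this README.
AUTO_OUTPUT_CLASSES = {
    "SaveImage",
    "SaveVideo",
    "SaveAudio",
    "VHS_SaveVideo",
    "VHS_SaveAudio",
}
OUTPUT_PREFIX = "$output."


def find_outputs(workflow):
    """Return (marked, auto) for an API format workflow.

    marked: {var_name: node_id} for nodes titled "$output.<var_name>"
    auto:   [node_id, ...] for untitled nodes of a known output class
    """
    marked = {}
    auto = []
    for node_id, node in workflow.items():
        title = node.get("_meta", {}).get("title", "")
        if title.startswith(OUTPUT_PREFIX):
            marked[title[len(OUTPUT_PREFIX):]] = node_id
        elif node.get("class_type") in AUTO_OUTPUT_CLASSES:
            auto.append(node_id)
    return marked, auto


# Invented example: one auto-detected node and one manually marked node.
sample = {
    "7": {"class_type": "SaveImage", "inputs": {}, "_meta": {"title": "Save Image"}},
    "8": {"class_type": "PreviewImage", "inputs": {}, "_meta": {"title": "$output.result"}},
    "9": {"class_type": "KSampler", "inputs": {}, "_meta": {"title": "KSampler"}},
}
marked_outputs, auto_outputs = find_outputs(sample)
```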

📄 Tool Description Configuration (Optional)

You can add a node titled MCP in the workflow to provide a tool description:

Add a String (Multiline) or similar text node (must have a single string property, and the node field should be one of: value, text, string)
Set the node title to: MCP
Enter a detailed tool description in the value field
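Reading the description back out of an exported workflow could look like the sketch below. It assumes the title is stored under `_meta.title` in the API format export; `find_tool_description` is a hypothetical helper, and the sample node's class name and text are illustrative.

```python
# Field names the README allows for the description text.
DESCRIPTION_FIELDS = ("value", "text", "string")


def find_tool_description(workflow):
    """Return the text of the node titled "MCP", or None if there is none."""
    for node in workflow.values():
        if node.get("_meta", {}).get("title") != "MCP":
            continue
        inputs = node.get("inputs", {})
        for field in DESCRIPTION_FIELDS:
            text = inputs.get(field)
            if isinstance(text, str):
                return text
    return None


# Invented example workflow with a description node.
sample = {
    "1": {
        "class_type": "LoadImage",
        "inputs": {"image": "example.png"},
        "_meta": {"title": "$image.image!"},
    },
    "2": {
        "class_type": "PrimitiveStringMultiline",  # illustrative class name
        "inputs": {"value": "Apply Gaussian blur to an input image."},
        "_meta": {"title": "MCP"},
    },
}
description = find_tool_description(sample)
```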

⚠️ Important Notes

🔒 Parameter Validation: Optional parameters (without !) must have default values set in the node
🔗 Node Connections: Fields already connected to other nodes will not be parsed as parameters
🏷️ Tool Naming: Exported file name will be used as the tool name, use meaningful English names
📋 Detailed Descriptions: Provide detailed parameter descriptions for better user experience
🎯 Export Format: Must export as API format, do not export as UI format
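Two of the notes above, the export format and connected fields, can be checked before a workflow is submitted. The sketch below is a heuristic under stated assumptions, not the project's validation logic: it assumes a UI format export has a top-level `nodes` key, that an API format export maps node ids to objects with a `class_type`, and that a connection is stored as a `[node_id, output_index]` pair. All function names are hypothetical.

```python
def is_api_format(workflow):
    """Heuristic check that a loaded workflow is an API format export."""
    if not isinstance(workflow, dict) or not workflow or "nodes" in workflow:
        return False
    return all(
        isinstance(node, dict) and "class_type" in node
        for node in workflow.values()
    )


def is_connected(value):
    """True if an input value is a link of the form [node_id, output_index]."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], int)
    )


def literal_fields(node):
    """Inputs that hold literal values and could therefore become parameters."""
    return {
        name: value
        for name, value in node.get("inputs", {}).items()
        if not is_connected(value)
    }


# Invented examples of each export style.
api_export = {
    "2": {
        "class_type": "ImageBlur",
        "inputs": {"image": ["1", 0], "blur_radius": 5, "sigma": 1.0},
    }
}
ui_export = {"nodes": [], "links": [], "version": 0.4}
candidates = literal_fields(api_export["2"])
```

Here the `image` input is a link to another node, so only `blur_radius` and `sigma` remain as candidate parameters.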

💬 Community

Scan the QR codes below to join our communities for latest updates and technical support:

Discord Community	WeChat Group

🤝 How to Contribute

We welcome all forms of contribution! Whether you're a developer, designer, or user, you can participate in the project in the following ways:

🐛 Report Issues

📋 Submit bug reports on the Issues page
🔍 Please search for similar issues before submitting
📝 Describe the reproduction steps and environment in detail

💡 Feature Suggestions

🚀 Submit feature requests in Issues
💭 Describe the feature you want and its use case
🎯 Explain how it improves user experience

🔧 Code Contributions

📋 Contribution Process

🍴 Fork this repo to your GitHub account
🌿 Create a feature branch: git checkout -b feature/your-feature-name
💻 Develop and add corresponding tests
📝 Commit changes: git commit -m "feat: add your feature"
📤 Push to your repo: git push origin feature/your-feature-name
🔄 Create a Pull Request to the main repo

🎨 Code Style

🐍 Python code follows PEP 8 style guide
📖 Add appropriate documentation and comments for new features

🧩 Contribute Workflows

📦 Share your ComfyUI workflows with the community
🛠️ Submit tested workflow files
📚 Add usage instructions and examples for workflows

🙏 Acknowledgements

❤️ Sincere thanks to the following organizations, projects, and teams for supporting the development and implementation of this project.

🧩 ComfyUI
💬 Chainlit
🔌 MCP
🎬 WanVideo
⚡ Flux
🤖 LiteLLM

License

This project is released under the MIT License (LICENSE, SPDX-License-Identifier: MIT).

