runpod-worker-comfy

ComfyUI as a serverless API on RunPod

Stars: 412

Visit

runpod-worker-comfy is a serverless API tool that allows users to run any ComfyUI workflow to generate an image. Users can provide input images as base64-encoded strings, and the generated image can be returned as a base64-encoded string or uploaded to AWS S3. The tool is built on Ubuntu + NVIDIA CUDA and provides features like built-in checkpoints and VAE models. Users can configure environment variables to upload images to AWS S3 and interact with the RunPod API to generate images. The tool also supports local testing and deployment to Docker hub using Github Actions.

README:

runpod-worker-comfy

ComfyUI as a serverless API on RunPod

Quickstart
Features
Config
- Upload image to AWS S3
Use the Docker image on RunPod
API specification
- JSON Request Body
- Fields
  - "input.images"
Interact with your RunPod API
- Health status
- Generate an image
  - Example request for SDXL with cURL
How to get the workflow from ComfyUI?
Bring Your Own Models and Nodes
- Network Volume
- Custom Docker Image
Local testing
Automatically deploy to Docker hub with GitHub Actions
Acknowledgments

Quickstart

🐳 Choose one of the five available images for your serverless endpoint:
- timpietruskyblibla/runpod-worker-comfy:3.5.0-base: doesn't contain anything, just a clean ComfyUI
- timpietruskyblibla/runpod-worker-comfy:3.5.0-flux1-schnell: contains the checkpoint, text encoders and VAE for FLUX.1 schnell
- timpietruskyblibla/runpod-worker-comfy:3.5.0-flux1-dev: contains the checkpoint, text encoders and VAE for FLUX.1 dev
- timpietruskyblibla/runpod-worker-comfy:3.5.0-sdxl: contains the checkpoint and VAE for Stable Diffusion XL
- timpietruskyblibla/runpod-worker-comfy:3.5.0-sd3: contains the checkpoint for Stable Diffusion 3 medium
ℹ️ Use the Docker image on RunPod
🧪 Pick an example workflow & send it to your deployed endpoint

Features

Run any ComfyUI workflow to generate an image
Provide input images as base64-encoded string
The generated image is either:
- Returned as base64-encoded string (default)
- Uploaded to AWS S3 (if AWS S3 is configured)
There are a few different Docker images to choose from:
- timpietruskyblibla/runpod-worker-comfy:3.5.0-flux1-schnell: contains the flux1-schnell.safetensors checkpoint, the clip_l.safetensors + t5xxl_fp8_e4m3fn.safetensors text encoders and ae.safetensors VAE for FLUX.1-schnell
- timpietruskyblibla/runpod-worker-comfy:3.5.0-flux1-dev: contains the flux1-dev.safetensors checkpoint, the clip_l.safetensors + t5xxl_fp8_e4m3fn.safetensors text encoders and ae.safetensors VAE for FLUX.1-dev
- timpietruskyblibla/runpod-worker-comfy:3.5.0-sdxl: contains the checkpoints and VAE for Stable Diffusion XL
  - Checkpoint: sd_xl_base_1.0.safetensors
  - VAEs:
    - sdxl_vae.safetensors
    - sdxl-vae-fp16-fix
- timpietruskyblibla/runpod-worker-comfy:3.5.0-sd3: contains the sd3_medium_incl_clips_t5xxlfp8.safetensors checkpoint for Stable Diffusion 3 medium
Bring your own models
Based on Ubuntu + NVIDIA CUDA

Config

Environment Variable	Description	Default
`REFRESH_WORKER`	When you want to stop the worker after each finished job to have a clean state, see official documentation.	`false`
`COMFY_POLLING_INTERVAL_MS`	Time to wait between poll attempts in milliseconds.	`250`
`COMFY_POLLING_MAX_RETRIES`	Maximum number of poll attempts. This should be increased the longer your workflow is running.	`500`
`SERVE_API_LOCALLY`	Enable local API server for development and testing. See Local Testing for more details.	disabled

Upload image to AWS S3

This is only needed if you want to upload the generated picture to AWS S3. If you don't configure this, your image will be exported as base64-encoded string.

Create a bucket in region of your choice in AWS S3 (BUCKET_ENDPOINT_URL)
Create an IAM that has access rights to AWS S3
Create an Access-Key (BUCKET_ACCESS_KEY_ID & BUCKET_SECRET_ACCESS_KEY) for that IAM
Configure these environment variables for your RunPod worker:

Environment Variable	Description	Example
`BUCKET_ENDPOINT_URL`	The endpoint URL of your S3 bucket.	`https://<bucket>.s3.<region>.amazonaws.com`
`BUCKET_ACCESS_KEY_ID`	Your AWS access key ID for accessing the S3 bucket.	`AKIAIOSFODNN7EXAMPLE`
`BUCKET_SECRET_ACCESS_KEY`	Your AWS secret access key for accessing the S3 bucket.	`wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY`

Use the Docker image on RunPod

Create your template (optional)

Create a new template by clicking on New Template
In the dialog, configure:
- Template Name: runpod-worker-comfy (it can be anything you want)
- Template Type: serverless (change template type to "serverless")
- Container Image: <dockerhub_username>/<repository_name>:tag, in this case: timpietruskyblibla/runpod-worker-comfy:3.5.0-sd3 (or -base for a clean image or -sdxl for Stable Diffusion XL or -flex1-schnell for FLUX.1 schnell)
- Container Registry Credentials: You can leave everything as it is, as this repo is public
- Container Disk: 20 GB
- (optional) Environment Variables: Configure S3
  - Note: You can also not configure it, the images will then stay in the worker. In order to have them stored permanently, we have to add the network volume
Click on Save Template

Create your endpoint

Navigate to Serverless > Endpoints and click on New Endpoint
In the dialog, configure:
- Endpoint Name: comfy
- Worker configuration: Select a GPU that can run the model you have chosen (see GPU recommendations)
- Active Workers: 0 (whatever makes sense for you)
- Max Workers: 3 (whatever makes sense for you)
- GPUs/Worker: 1
- Idle Timeout: 5 (you can leave the default)
- Flash Boot: enabled (doesn't cost more, but provides faster boot of our worker, which is good)
- Select Template: runpod-worker-comfy (or whatever name you gave your template)
- (optional) Advanced: If you are using a Network Volume, select it under Select Network Volume. Otherwise leave the defaults.
Click deploy
Your endpoint will be created, you can click on it to see the dashboard

GPU recommendations

Model	Image	Minimum VRAM Required	Container Size
Stable Diffusion XL	`sdxl`	8 GB	15 GB
Stable Diffusion 3 Medium	`sd3`	5 GB	20 GB
FLUX.1 Schnell	`flux1-schnell`	24 GB	30 GB
FLUX.1 dev	`flux1-dev`	24 GB	30 GB

API specification

The following describes which fields exist when doing requests to the API. We only describe the fields that are sent via input as those are needed by the worker itself. For a full list of fields, please take a look at the official documentation.

JSON Request Body

{
  "input": {
    "workflow": {},
    "images": [
      {
        "name": "example_image_name.png",
        "image": "base64_encoded_string"
      }
    ]
  }
}

Fields

Field Path	Type	Required	Description
`input`	Object	Yes	The top-level object containing the request data.
`input.workflow`	Object	Yes	Contains the ComfyUI workflow configuration.
`input.images`	Array	No	An array of images. Each image will be added into the "input"-folder of ComfyUI and can then be used in the workflow by using it's `name`

"input.images"

An array of images, where each image should have a different name.

🚨 The request body for a RunPod endpoint is 10 MB for /run and 20 MB for /runsync, so make sure that your input images are not super huge as this will be blocked by RunPod otherwise, see the official documentation

Field Name	Type	Required	Description
`name`	String	Yes	The name of the image. Please use the same name in your workflow to reference the image.
`image`	String	Yes	A base64 encoded string of the image.

Interact with your RunPod API

Generate an API Key:
- In the User Settings, click on API Keys and then on the API Key button.
- Save the generated key somewhere safe, as you will not be able to see it again when you navigate away from the page.
Use the API Key:
- Use cURL or any other tool to access the API using the API key and your Endpoint ID:
  - Replace <api_key> with your key.
Use your Endpoint:
- Replace <endpoint_id> with the ID of the endpoint. (You can find the endpoint ID by clicking on your endpoint; it is written underneath the name of the endpoint at the top and also part of the URLs shown at the bottom of the first box.)

Health status

curl -H "Authorization: Bearer <api_key>" https://api.runpod.ai/v2/<endpoint_id>/health

Generate an image

You can either create a new job async by using /run or a sync by using /runsync. The example here is using a sync job and waits until the response is delivered.

The API expects a JSON in this form, where workflow is the workflow from ComfyUI, exported as JSON and images is optional.

Please also take a look at the test_input.json to see how the API input should look like.

Example request for SDXL with cURL

curl -X POST -H "Authorization: Bearer <api_key>" -H "Content-Type: application/json" -d '{"input":{"workflow":{"3":{"inputs":{"seed":1337,"steps":20,"cfg":8,"sampler_name":"euler","scheduler":"normal","denoise":1,"model":["4",0],"positive":["6",0],"negative":["7",0],"latent_image":["5",0]},"class_type":"KSampler"},"4":{"inputs":{"ckpt_name":"sd_xl_base_1.0.safetensors"},"class_type":"CheckpointLoaderSimple"},"5":{"inputs":{"width":512,"height":512,"batch_size":1},"class_type":"EmptyLatentImage"},"6":{"inputs":{"text":"beautiful scenery nature glass bottle landscape, purple galaxy bottle,","clip":["4",1]},"class_type":"CLIPTextEncode"},"7":{"inputs":{"text":"text, watermark","clip":["4",1]},"class_type":"CLIPTextEncode"},"8":{"inputs":{"samples":["3",0],"vae":["4",2]},"class_type":"VAEDecode"},"9":{"inputs":{"filename_prefix":"ComfyUI","images":["8",0]},"class_type":"SaveImage"}}}}' https://api.runpod.ai/v2/<endpoint_id>/runsync

Example response with AWS S3 bucket configuration

{
  "delayTime": 2188,
  "executionTime": 2297,
  "id": "sync-c0cd1eb2-068f-4ecf-a99a-55770fc77391-e1",
  "output": {
    "message": "https://bucket.s3.region.amazonaws.com/10-23/sync-c0cd1eb2-068f-4ecf-a99a-55770fc77391-e1/c67ad621.png",
    "status": "success"
  },
  "status": "COMPLETED"
}

Example response as base64-encoded image

{
  "delayTime": 2188,
  "executionTime": 2297,
  "id": "sync-c0cd1eb2-068f-4ecf-a99a-55770fc77391-e1",
  "output": { "message": "base64encodedimage", "status": "success" },
  "status": "COMPLETED"
}

How to get the workflow from ComfyUI?

Open ComfyUI in the browser
Open the Settings (gear icon in the top right of the menu)
In the dialog that appears configure:
- Enable Dev mode Options: enable
- Close the Settings
In the menu, click on the Save (API Format) button, which will download a file named workflow_api.json

You can now take the content of this file and put it into your workflow when interacting with the API.

Bring Your Own Models and Nodes

Network Volume

Using a Network Volume allows you to store and access custom models:

Create a Network Volume:
- Follow the RunPod Network Volumes guide to create a volume.

Populate the Volume:

Create a temporary GPU instance:
- Navigate to Manage > Storage, click Deploy under the volume, and deploy any GPU or CPU instance.
- Navigate to Manage > Pods. Under the new pod, click Connect to open a shell (either via Jupyter notebook or SSH).

Populate the volume with your models:

cd /workspace
for i in checkpoints clip clip_vision configs controlnet embeddings loras upscale_models vae; do mkdir -p models/$i; done
wget -O models/checkpoints/sd_xl_turbo_1.0_fp16.safetensors https://huggingface.co/stabilityai/sdxl-turbo/resolve/main/sd_xl_turbo_1.0_fp16.safetensors

Delete the Temporary GPU Instance:
- Once populated, terminate the temporary GPU instance.
Configure Your Endpoint:
- Use the Network Volume in your endpoint configuration:
  - Either create a new endpoint or update an existing one.
  - In the endpoint configuration, under Advanced > Select Network Volume, select your Network Volume.

Note: The folders in the Network Volume are automatically available to ComfyUI when the network volume is configured and attached.

Custom Docker Image

If you prefer to include your models and custom nodes directly in the Docker image, follow these steps:

Fork the Repository:
- Fork this repository to your own GitHub account.

Adding Custom Models

To include additional models in your Docker image, edit the Dockerfile and add the download commands:

RUN wget -O models/checkpoints/sd_xl_base_1.0.safetensors https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

Adding Custom Nodes

To include custom nodes in your Docker image:

Export a snapshot from ComfyUI Manager that includes all your desired custom nodes
1. Open "Manager > Snapshot Manager"
2. Create a new snapshot by clicking on "Save snapshot"
3. Get the *_snapshot.json from your ComfyUI: ComfyUI/custom_nodes/ComfyUI-Manager/snapshots
Save the snapshot file in the root directory of the project
The snapshot will be automatically restored during the Docker build process, see Building the Image

[!NOTE]

Some custom nodes may download additional models during installation, which can significantly increase the image size

Having many custom nodes may increase ComfyUI's initialization time

Building the Image

Build your customized Docker image locally:

# Build the base image
docker build -t <your_dockerhub_username>/runpod-worker-comfy:dev-base --target base --platform linux/amd64 .

# Build the SDXL image
docker build --build-arg MODEL_TYPE=sdxl -t <your_dockerhub_username>/runpod-worker-comfy:dev-sdxl --platform linux/amd64 .

# Build the SD3 image
docker build --build-arg MODEL_TYPE=sd3 --build-arg HUGGINGFACE_ACCESS_TOKEN=<your-huggingface-token> -t <your_dockerhub_username>/runpod-worker-comfy:dev-sd3 --platform linux/amd64 .

[!NOTE]
Ensure to specify --platform linux/amd64 to avoid errors on RunPod, see issue #13

Local testing

Both tests will use the data from test_input.json, so make your changes in there to test this properly.

Setup

Make sure you have Python >= 3.10
Create a virtual environment:
```
python -m venv venv
```
Activate the virtual environment:
- Windows:
```
.\venv\Scripts\activate
```
- Mac / Linux:
```
source ./venv/bin/activate
```
Install the dependencies:
```
pip install -r requirements.txt
```

Setup for Windows

Install WSL2 and a Linux distro (like Ubuntu) following this guide. You can skip the "Install and use a GUI package" part.
After installing Ubuntu, open the terminal and log in:
```
wsl -d Ubuntu
```
Update the packages:
```
sudo apt update
```
Install Docker in Ubuntu:
- Follow the official Docker installation guide.
- Install docker-compose:
```
sudo apt-get install docker-compose
```
- Install the NVIDIA Toolkit in Ubuntu: Follow this guide and create the nvidia runtime.
Enable GPU acceleration on Ubuntu on WSL2: Follow this guide.
- If you already have your GPU driver installed on Windows, you can skip the "Install the appropriate Windows vGPU driver for WSL" step.
Add your user to the docker group to use Docker without sudo:
```
sudo usermod -aG docker $USER
```

Once these steps are completed, you can either run the Docker image directly on Windows using Docker Desktop or switch to Ubuntu in the terminal to run the Docker image via WSL

wsl -d Ubuntu

[!NOTE]

Windows: Accessing the API or ComfyUI might not work when you run the Docker Image via WSL, so it is recommended to run the Docker Image directly on Windows using Docker Desktop

Testing the RunPod handler

Run all tests: python -m unittest discover
If you want to run a specific test: python -m unittest tests.test_rp_handler.TestRunpodWorkerComfy.test_bucket_endpoint_not_configured

You can also start the handler itself to have the local server running: python src/rp_handler.py To get this to work you will also need to start "ComfyUI", otherwise the handler will not work.

Local API

For enhanced local development, you can start an API server that simulates the RunPod worker environment. This feature is particularly useful for debugging and testing your integrations locally.

Set the SERVE_API_LOCALLY environment variable to true to activate the local API server when running your Docker container. This is already the default value in the docker-compose.yml, so you can get it running by executing:

docker-compose up

[!NOTE]

This will only work on computer with an NVIDIA GPU for now, as it requires CUDA. Please open an issue if you want to use it on a CPU / Mac

Access the local Worker API

With the local API server running, it's accessible at: localhost:8000
When you open this in your browser, you can also see the API documentation and can interact with the API directly

[!NOTE]

Windows: Accessing the API or ComfyUI might not work when you run the Docker Image via WSL, so it is recommended to run the Docker Image directly on Windows using Docker Desktop

Access local ComfyUI

With the local API server running, you can access ComfyUI at: localhost:8188

[!NOTE]

Windows: Accessing the API or ComfyUI might not work when you run the Docker Image via WSL, so it is recommended to run the Docker Image directly on Windows using Docker Desktop

Automatically deploy to Docker hub with GitHub Actions

The repo contains two workflows that publish the image to Docker hub using GitHub Actions:

dev.yml: Creates the image and pushes it to Docker hub with the dev tag on every push to the main branch
release.yml: Creates the image and pushes it to Docker hub with the latest and the release tag. It will only be triggered when you create a release on GitHub

If you want to use this, you should add these secrets to your repository:

Configuration Variable	Description	Example Value
`DOCKERHUB_USERNAME`	Your Docker Hub username.	`your-username`
`DOCKERHUB_TOKEN`	Your Docker Hub token for authentication.	`your-token`
`HUGGINGFACE_ACCESS_TOKEN`	Your READ access token from Hugging Face	`your-access-token`

And also make sure to add these variables to your repository:

Variable Name	Description	Example Value
`DOCKERHUB_REPO`	The repository on Docker Hub where the image will be pushed.	`timpietruskyblibla`
`DOCKERHUB_IMG`	The name of the image to be pushed to Docker Hub.	`runpod-worker-comfy`

Acknowledgments

Thanks to all contributors for your awesome work
Thanks to Justin Merrell from RunPod for worker-1111, which was used to get inspired on how to create this worker
Thanks to Ashley Kleynhans for runpod-worker-a1111, which was used to get inspired on how to create this worker
Thanks to comfyanonymous for creating ComfyUI, which provides such an awesome API to interact with Stable Diffusion and beyond

For Tasks:

Click tags to check more tools for each tasks

generate image upload image interact with api local testing deploy to docker hub

For Jobs:

image processing specialist ai workflow engineer cloud computing developer machine learning researcher devops engineer

Alternative AI tools for runpod-worker-comfy

Similar Open Source Tools

runpod-worker-comfy

github

: 412

chatgpt-cli

ChatGPT CLI provides a powerful command-line interface for seamless interaction with ChatGPT models via OpenAI and Azure. It features streaming capabilities, extensive configuration options, and supports various modes like streaming, query, and interactive mode. Users can manage thread-based context, sliding window history, and provide custom context from any source. The CLI also offers model and thread listing, advanced configuration options, and supports GPT-4, GPT-3.5-turbo, and Perplexity's models. Installation is available via Homebrew or direct download, and users can configure settings through default values, a config.yaml file, or environment variables.

github

: 661

BodhiApp

Bodhi App runs Open Source Large Language Models locally, exposing LLM inference capabilities as OpenAI API compatible REST APIs. It leverages llama.cpp for GGUF format models and huggingface.co ecosystem for model downloads. Users can run fine-tuned models for chat completions, create custom aliases, and convert Huggingface models to GGUF format. The CLI offers commands for environment configuration, model management, pulling files, serving API, and more.

github

: 67

raycast_api_proxy

The Raycast AI Proxy is a tool that acts as a proxy for the Raycast AI application, allowing users to utilize the application without subscribing. It intercepts and forwards Raycast requests to various AI APIs, then reformats the responses for Raycast. The tool supports multiple AI providers and allows for custom model configurations. Users can generate self-signed certificates, add them to the system keychain, and modify DNS settings to redirect requests to the proxy. The tool is designed to work with providers like OpenAI, Azure OpenAI, Google, and more, enabling tasks such as AI chat completions, translations, and image generation.

github

: 317

code2prompt

Code2Prompt is a powerful command-line tool that generates comprehensive prompts from codebases, designed to streamline interactions between developers and Large Language Models (LLMs) for code analysis, documentation, and improvement tasks. It bridges the gap between codebases and LLMs by converting projects into AI-friendly prompts, enabling users to leverage AI for various software development tasks. The tool offers features like holistic codebase representation, intelligent source tree generation, customizable prompt templates, smart token management, Gitignore integration, flexible file handling, clipboard-ready output, multiple output options, and enhanced code readability.

github

: 734

Construction-Hazard-Detection

Construction-Hazard-Detection is an AI-driven tool focused on improving safety at construction sites by utilizing the YOLOv8 model for object detection. The system identifies potential hazards like overhead heavy loads and steel pipes, providing real-time analysis and warnings. Users can configure the system via a YAML file and run it using Docker. The primary dataset used for training is the Construction Site Safety Image Dataset enriched with additional annotations. The system logs are accessible within the Docker container for debugging, and notifications are sent through the LINE messaging API when hazards are detected.

github

: 153

rag-chatbot

The RAG ChatBot project combines Lama.cpp, Chroma, and Streamlit to build a Conversation-aware Chatbot and a Retrieval-augmented generation (RAG) ChatBot. The RAG Chatbot works by taking a collection of Markdown files as input and provides answers based on the context provided by those files. It utilizes a Memory Builder component to load Markdown pages, divide them into sections, calculate embeddings, and save them in an embedding database. The chatbot retrieves relevant sections from the database, rewrites questions for optimal retrieval, and generates answers using a local language model. It also remembers previous interactions for more accurate responses. Various strategies are implemented to deal with context overflows, including creating and refining context, hierarchical summarization, and async hierarchical summarization.

github

: 194

llm2sh

llm2sh is a command-line utility that leverages Large Language Models (LLMs) to translate plain-language requests into shell commands. It provides a convenient way to interact with your system using natural language. The tool supports multiple LLMs for command generation, offers a customizable configuration file, YOLO mode for running commands without confirmation, and is easily extensible with new LLMs and system prompts. Users can set up API keys for OpenAI, Claude, Groq, and Cerebras to use the tool effectively. llm2sh does not store user data or command history, and it does not record or send telemetry by itself, but the LLM APIs may collect and store requests and responses for their purposes.

github

: 188

mcp-graphql

github

: 62

stable-diffusion-webui

Stable Diffusion WebUI Docker Image allows users to run Automatic1111 WebUI in a docker container locally or in the cloud. The images do not bundle models or third-party configurations, requiring users to use a provisioning script for container configuration. It supports NVIDIA CUDA, AMD ROCm, and CPU platforms, with additional environment variables for customization and pre-configured templates for Vast.ai and Runpod.io. The service is password protected by default, with options for version pinning, startup flags, and service management using supervisorctl.

github

: 98

playword

PlayWord is a tool designed to supercharge web test automation experience with AI. It provides core features such as enabling browser operations and validations using natural language inputs, as well as monitoring interface to record and dry-run test steps. PlayWord supports multiple AI services including Anthropic, Google, and OpenAI, allowing users to select the appropriate provider based on their requirements. The tool also offers features like assertion handling, frame handling, custom variables, test recordings, and an Observer module to track user interactions on web pages. With PlayWord, users can interact with web pages using natural language commands, reducing the need to worry about element locators and providing AI-powered adaptation to UI changes.

github

: 52

xFasterTransformer

xFasterTransformer is an optimized solution for Large Language Models (LLMs) on the X86 platform, providing high performance and scalability for inference on mainstream LLM models. It offers C++ and Python APIs for easy integration, along with example codes and benchmark scripts. Users can prepare models in a different format, convert them, and use the APIs for tasks like encoding input prompts, generating token ids, and serving inference requests. The tool supports various data types and models, and can run in single or multi-rank modes using MPI. A web demo based on Gradio is available for popular LLM models like ChatGLM and Llama2. Benchmark scripts help evaluate model inference performance quickly, and MLServer enables serving with REST and gRPC interfaces.

github

: 247

LEADS

LEADS is a lightweight embedded assisted driving system designed to simplify the development of instrumentation, control, and analysis systems for racing cars. It is written in Python and C/C++ with impressive performance. The system is customizable and provides abstract layers for component rearrangement. It supports hardware components like Raspberry Pi and Arduino, and can adapt to various hardware types. LEADS offers a modular structure with a focus on flexibility and lightweight design. It includes robust safety features, modern GUI design with dark mode support, high performance on different platforms, and powerful ESC systems for traction control and braking. The system also supports real-time data sharing, live video streaming, and AI-enhanced data analysis for driver training. LEADS VeC Remote Analyst enables transparency between the driver and pit crew, allowing real-time data sharing and analysis. The system is designed to be user-friendly, adaptable, and efficient for racing car development.

github

: 241

steel-browser

Steel is an open-source browser API designed for AI agents and applications, simplifying the process of building live web agents and browser automation tools. It serves as a core building block for a production-ready, containerized browser sandbox with features like stealth capabilities, text-to-markdown session management, UI for session viewing/debugging, and full browser control through popular automation frameworks. Steel allows users to control, run, and manage a production-ready browser environment via a REST API, offering features such as full browser control, session management, proxy support, extension support, debugging tools, anti-detection mechanisms, resource management, and various browser tools. It aims to streamline complex browsing tasks programmatically, enabling users to focus on their AI applications while Steel handles the underlying complexity.

github

: 4.1k

comfyui

ComfyUI is a highly-configurable, cloud-first AI-Dock container that allows users to run ComfyUI without bundled models or third-party configurations. Users can configure the container using provisioning scripts. The Docker image supports NVIDIA CUDA, AMD ROCm, and CPU platforms, with version tags for different configurations. Additional environment variables and Python environments are provided for customization. ComfyUI service runs on port 8188 and can be managed using supervisorctl. The tool also includes an API wrapper service and pre-configured templates for Vast.ai. The author may receive compensation for services linked in the documentation.

github

: 434

showdown

Showdown is a Pokémon battle-bot that can play battles on Pokemon Showdown. It can play single battles in generations 3 through 8. The project offers different battle bot implementations such as Safest, Nash-Equilibrium, Team Datasets, and Most Damage. Users can configure the bot using environment variables and run it either without Docker by cloning the repository and installing requirements or with Docker by building the Docker image and running it with an environment variable file. Additionally, users can write their own bot by creating a package in showdown/battle_bots with a module named main.py and implementing a find_best_move function.

github

: 238

For similar tasks

runpod-worker-comfy

github

: 412

wenxin-starter

WenXin-Starter is a spring-boot-starter for Baidu's "Wenxin Qianfan WENXINWORKSHOP" large model, which can help you quickly access Baidu's AI capabilities. It fully integrates the official API documentation of Wenxin Qianfan. Supports text-to-image generation, built-in dialogue memory, and supports streaming return of dialogue. Supports QPS control of a single model and supports queuing mechanism. Plugins will be added soon.

github

: 207

modelfusion

ModelFusion is an abstraction layer for integrating AI models into JavaScript and TypeScript applications, unifying the API for common operations such as text streaming, object generation, and tool usage. It provides features to support production environments, including observability hooks, logging, and automatic retries. You can use ModelFusion to build AI applications, chatbots, and agents. ModelFusion is a non-commercial open source project that is community-driven. You can use it with any supported provider. ModelFusion supports a wide range of models including text generation, image generation, vision, text-to-speech, speech-to-text, and embedding models. ModelFusion infers TypeScript types wherever possible and validates model responses. ModelFusion provides an observer framework and logging support. ModelFusion ensures seamless operation through automatic retries, throttling, and error handling mechanisms. ModelFusion is fully tree-shakeable, can be used in serverless environments, and only uses a minimal set of dependencies.

github

: 918

freeGPT

freeGPT provides free access to text and image generation models. It supports various models, including gpt3, gpt4, alpaca_7b, falcon_40b, prodia, and pollinations. The tool offers both asynchronous and non-asynchronous interfaces for text completion and image generation. It also features an interactive Discord bot that provides access to all the models in the repository. The tool is easy to use and can be integrated into various applications.

github

: 361

generative-ai-go

The Google AI Go SDK enables developers to use Google's state-of-the-art generative AI models (like Gemini) to build AI-powered features and applications. It supports use cases like generating text from text-only input, generating text from text-and-images input (multimodal), building multi-turn conversations (chat), and embedding.

github

: 557

ai-flow

AI Flow is an open-source, user-friendly UI application that empowers you to seamlessly connect multiple AI models together, specifically leveraging the capabilities of multiples AI APIs such as OpenAI, StabilityAI and Replicate. In a nutshell, AI Flow provides a visual platform for crafting and managing AI-driven workflows, thereby facilitating diverse and dynamic AI interactions.

github

: 188

liboai

liboai is a simple C++17 library for the OpenAI API, providing developers with access to OpenAI endpoints through a collection of methods and classes. It serves as a spiritual port of OpenAI's Python library, 'openai', with similar structure and features. The library supports various functionalities such as ChatGPT, Audio, Azure, Functions, Image DALL·E, Models, Completions, Edit, Embeddings, Files, Fine-tunes, Moderation, and Asynchronous Support. Users can easily integrate the library into their C++ projects to interact with OpenAI services.

github

: 321

OpenAI-DotNet

OpenAI-DotNet is a simple C# .NET client library for OpenAI to use through their RESTful API. It is independently developed and not an official library affiliated with OpenAI. Users need an OpenAI API account to utilize this library. The library targets .NET 6.0 and above, working across various platforms like console apps, winforms, wpf, asp.net, etc., and on Windows, Linux, and Mac. It provides functionalities for authentication, interacting with models, assistants, threads, chat, audio, images, files, fine-tuning, embeddings, and moderations.

github

: 732

For similar jobs

runpod-worker-comfy

github

: 412

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

ai-on-gke

This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE). Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. A robust AI/ML platform considers the following layers: Infrastructure orchestration that support GPUs and TPUs for training and serving workloads at scale Flexible integration with distributed computing and data processing frameworks Support for multiple teams on the same infrastructure to maximize utilization of resources

github

: 280

tidb

TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.

github

: 37.1k

nvidia_gpu_exporter

Nvidia GPU exporter for prometheus, using `nvidia-smi` binary to gather metrics.

github

: 1.1k

tracecat

Tracecat is an open-source automation platform for security teams. It's designed to be simple but powerful, with a focus on AI features and a practitioner-obsessed UI/UX. Tracecat can be used to automate a variety of tasks, including phishing email investigation, evidence collection, and remediation plan generation.

github

: 2.6k

openinference

OpenInference is a set of conventions and plugins that complement OpenTelemetry to enable tracing of AI applications. It provides a way to capture and analyze the performance and behavior of AI models, including their interactions with other components of the application. OpenInference is designed to be language-agnostic and can be used with any OpenTelemetry-compatible backend. It includes a set of instrumentations for popular machine learning SDKs and frameworks, making it easy to add tracing to your AI applications.

github

: 362

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

github

: 953