sd-webui-agent-scheduler

An open source Scheduling Agent for Generative AI

Stars: 646

AgentScheduler is an Automatic1111/Vladmandic Stable Diffusion Web UI extension designed to enhance image generation workflows. It lets users enqueue prompts, settings, and controlnets; prioritize, pause, resume, and delete queued tasks; view generation results; and more. The extension also offers hidden features such as queuing a task with all checkpoints, editing queued tasks, and custom checkpoint selection. Its functionality is available through HTTP APIs and API callbacks. Troubleshooting steps are provided for common errors. The extension is compatible with the latest versions of A1111 and Vladmandic. It is licensed under the Apache License 2.0.

README:

Agent Scheduler

Introducing AgentScheduler, an Automatic1111/Vladmandic Stable Diffusion Web UI extension to power up your image generation workflow!

Compatibility

This version of AgentScheduler is compatible with the latest versions of:

A1111: commit baf6946
Vladmandic: commit 9726b4d

Older versions may not work properly.

Installation

Using Vlad's WebUI Fork

The extension is already included among the built-in extensions of Vlad's fork.

Using the built-in extension list

Open the Extensions tab
Open the "Install From URL" sub-tab
Paste the repo URL: https://github.com/ArtVentureX/sd-webui-agent-scheduler.git
Click "Install"

Manual clone

git clone "https://github.com/ArtVentureX/sd-webui-agent-scheduler.git" extensions/agent-scheduler

(The second argument specifies the name of the folder; you can choose whatever name you like.)

Basic Features

1️⃣ Input your usual Prompts & Settings. Enqueue to send your current prompts, settings, controlnets to AgentScheduler.

2️⃣ AgentScheduler Extension Tab.

3️⃣ See all queued tasks, the image currently being generated, and each task's associated information. Drag and drop the handle at the beginning of each row to rearrange the generation order.

4️⃣ Pause to stop queue auto generation. Resume to start.

5️⃣ Press ▶️ to prioritize the selected task, or to start a single task when the queue is paused. Delete tasks that you no longer want.

6️⃣ Show queue history.

7️⃣ Filter task status or search by text.

8️⃣ Bookmark tasks for easier filtering.

9️⃣ Double click the task id to rename the task and quickly update basic parameters. Click ↩️ to requeue an old task.

🔟 Click on each task to view the generation results.

https://github.com/ArtVentureX/sd-webui-agent-scheduler/assets/133728487/50c74922-b85f-493c-9be8-b8e78f0cd061

Hidden Features:

Queue all checkpoints at the same time

Right click the Enqueue button and select Queue with all checkpoints to quickly queue the current setting with all available checkpoints.

Queue with a subset of checkpoints

With the custom checkpoint select enabled (see the Extension Settings section below), you can select a folder (or subfolder) to queue a task with every checkpoint inside it. E.g., selecting anime will queue anime\AOM3A1B_oragemixs, anime\counterfeit\Counterfeit-V2.5_fp16 and anime\counterfeit\Counterfeit-V2.5_pruned.
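The folder selection above amounts to matching checkpoint names by folder prefix. The following is only an illustrative sketch of that rule, not the extension's actual code; the function name `checkpoints_in_folder` and the `realistic\some_other_model` entry are made up for the example.

```python
from typing import List


def checkpoints_in_folder(all_checkpoints: List[str], folder: str) -> List[str]:
    """Return the checkpoints stored in `folder` or any of its subfolders."""
    # Normalize separators so "anime\\x" and "anime/x" compare the same way.
    prefix = folder.replace("\\", "/").strip("/") + "/"
    return [
        name
        for name in all_checkpoints
        if name.replace("\\", "/").startswith(prefix)
    ]


available = [
    "anime\\AOM3A1B_oragemixs",
    "anime\\counterfeit\\Counterfeit-V2.5_fp16",
    "anime\\counterfeit\\Counterfeit-V2.5_pruned",
    "realistic\\some_other_model",  # hypothetical checkpoint outside the folder
]

# Selecting "anime" picks up the folder and its "counterfeit" subfolder.
selected = checkpoints_in_folder(available, "anime")
```

Selecting the subfolder `anime\counterfeit` instead would return only the two Counterfeit checkpoints.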

Edit queued task

Double click a queued task to edit. You can name a task by changing task_id or update some basic parameters: prompt, negative prompt, sampler, checkpoint, steps, cfg scale.

Extension Settings

Go to Settings > Agent Scheduler to access extension settings.

Disable Queue Auto-Processing: Check this option to disable queue auto-processing on start-up. You can also temporarily pause or resume the queue from the Extension tab.

Queue Button Placement: Change the placement of the queue button on the UI.

Hide the Checkpoint Dropdown: The Extension provides a custom checkpoint dropdown.

By default, queued tasks use the currently loaded checkpoint. However, changing the system checkpoint requires some time to load the checkpoint into memory, and you also cannot change the checkpoint during image generation. You can use this dropdown to quickly queue a task with a custom checkpoint.

Auto Delete Queue History: Select a timeframe to keep your queue history. Tasks that are older than the configured value will be automatically deleted. Please note that bookmarked tasks will not be deleted.
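The retention rule described above (drop tasks older than the configured timeframe, but always keep bookmarked ones) can be sketched as follows. This is a minimal illustration under assumptions: the `Task` record and `prune_history` function are hypothetical and do not reflect the extension's real data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Task:  # hypothetical record, not the extension's data model
    task_id: str
    created_at: datetime
    bookmarked: bool = False


def prune_history(tasks: List[Task], keep_for: timedelta, now: datetime) -> List[Task]:
    """Keep tasks newer than the cutoff, plus every bookmarked task."""
    cutoff = now - keep_for
    return [t for t in tasks if t.bookmarked or t.created_at >= cutoff]


now = datetime(2024, 1, 10)
history = [
    Task("old", datetime(2024, 1, 1)),
    Task("old-bookmarked", datetime(2024, 1, 1), bookmarked=True),
    Task("recent", datetime(2024, 1, 9)),
]

# With a 7-day window, only the unbookmarked old task is dropped.
kept = prune_history(history, timedelta(days=7), now)
```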

API Access

All the functionality of this extension can be accessed through HTTP APIs. You can access the API documentation via http://127.0.0.1:7860/docs. Remember to include --api in your startup arguments.

Queue Task

The two APIs /agent-scheduler/v1/queue/txt2img and /agent-scheduler/v1/queue/img2img support all the parameters of the original WebUI APIs. Each API responds with the task id, which can be used to perform updates later.

{
  "task_id": "string"
}
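As a rough sketch of how a client might queue a task, the snippet below builds the POST request with Python's standard library. The endpoint paths and the `task_id` response field come from this README; the helper names `build_queue_request` and `queue_task`, and the assumption that the WebUI listens on http://127.0.0.1:7860, are illustrative only.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # assumed address, as used elsewhere in this README


def build_queue_request(payload: dict, kind: str = "txt2img") -> urllib.request.Request:
    """Build a POST request for /agent-scheduler/v1/queue/<kind>."""
    if kind not in ("txt2img", "img2img"):
        raise ValueError("kind must be 'txt2img' or 'img2img'")
    return urllib.request.Request(
        f"{BASE_URL}/agent-scheduler/v1/queue/{kind}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_task(payload: dict, kind: str = "txt2img") -> str:
    """Send the request and return the task id from the JSON response."""
    with urllib.request.urlopen(build_queue_request(payload, kind)) as resp:
        return json.load(resp)["task_id"]


# Building the request needs no server; sending it does.
request = build_queue_request({"prompt": "1girl", "steps": 20})
# task_id = queue_task({"prompt": "1girl", "steps": 20})  # requires a WebUI started with --api
```

Separating request construction from sending keeps the payload easy to inspect before anything is queued.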

Download Results

Use the API /agent-scheduler/v1/results/{id} to get the generated images. The API supports two response formats:

JSON with base64-encoded images

{
  "success": true,
  "data": [
    {
      "image": "data:image/png;base64,iVBORw0KGgoAAAAN...",
      "infotext": "1girl\nNegative prompt: EasyNegative, badhandv4..."
    },
    {
      "image": "data:image/png;base64,iVBORw0KGgoAAAAN...",
      "infotext": "1girl\nNegative prompt: EasyNegative, badhandv4..."
    }
  ]
}

zip file, requested with the query string zip=true
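A client consuming the JSON format has to strip the data-URL prefix and base64-decode each image. The sketch below shows one way to do that, along with building the results URL for either format. The helper names `decode_images` and `results_url` and the base address are assumptions for illustration; the sample response only mimics the shape shown above.

```python
import base64
from typing import List

BASE_URL = "http://127.0.0.1:7860"  # assumed address, as used elsewhere in this README


def results_url(task_id: str, as_zip: bool = False) -> str:
    """Build the results URL, adding zip=true when a zip file is wanted."""
    url = f"{BASE_URL}/agent-scheduler/v1/results/{task_id}"
    return url + "?zip=true" if as_zip else url


def decode_images(result: dict) -> List[bytes]:
    """Decode the base64 data URLs in a results response into raw image bytes."""
    if not result.get("success"):
        raise RuntimeError("results request was not successful")
    images = []
    for item in result["data"]:
        # Each image is a data URL: "data:image/png;base64,<payload>"
        _, _, encoded = item["image"].partition(",")
        images.append(base64.b64decode(encoded))
    return images


# Stand-in response carrying only the 8-byte PNG signature as "image" data.
png_signature = b"\x89PNG\r\n\x1a\n"
sample = {
    "success": True,
    "data": [
        {
            "image": "data:image/png;base64," + base64.b64encode(png_signature).decode(),
            "infotext": "1girl",
        }
    ],
}
images = decode_images(sample)
```

The decoded bytes can then be written to disk with a `.png` extension.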

API Callback

Queue a task with the callback_url parameter to register an API callback. E.g.:

{
  "prompt": "1girl",
  "negative_prompt": "easynegative",
  "callback_url": "http://somehost:port/task_completed"
}

The callback endpoint must support the POST method with a body in multipart/form-data encoding. Body format:

{
  "task_id": "abc123",
  "status": "done",
  "files": [list of image files],
}

Example endpoint handler using FastAPI:

from typing import Annotated, List, Optional

from fastapi import FastAPI, File, Form, UploadFile

app = FastAPI()

@app.post("/task_completed")
async def handle_task_completed(
    task_id: Annotated[str, Form()],
    status: Annotated[str, Form()],
    files: Optional[List[UploadFile]] = File(None),
):
    # files is None when the callback carries no images
    files = files or []
    print(f"Received {len(files)} files for task {task_id} with status {status}")
    for file in files:
        print(f"* {file.filename} {file.content_type} {file.size}")
        # ... do something with the file contents ...

# Example output:
# Received 1 files for task 3cf8b150-f260-4489-b6e8-d86ed8a564ca with status done
# * 00008-3322209480.png image/png 416400

Troubleshooting

Make sure that you are running the latest version of the extension and an updated version of the WebUI.

To update the extension, go to the Extensions tab and click Check for Updates, then click Apply and restart UI.
To update the WebUI itself, run the command git pull origin master in the same folder as webui.bat (or webui.sh).

Steps to try to find the cause of issues:

Check for errors in the WebUI output console.
Press F12 in the browser, go to the Console tab, reload the page, and look for any error messages there.

Common errors:

AttributeError: module 'modules.script_callbacks' has no attribute 'on_before_reload'

If you see this error message in the output console, try updating the WebUI to the latest version.

Update: The extension has been updated to print this warning message instead: YOUR SD WEBUI IS OUTDATED AND AGENT SCHEDULER WILL NOT WORKING PROPERLY. You can still use the extension, but it will not work correctly after a reload.

~~ReferenceError: submit_enqueue is not defined~~

~~If you click the Enqueue button and nothing happens, and you find the above error message in the browser F12 console, follow the steps in this comment.~~

Update: This issue is now fixed.

TypeError: issubclass() arg 1 must be a class

Please update the extension; there's a chance it's already fixed.

TypeError: Object of type X is not JSON serializable

Please update the extension; it should be fixed already. If not, please file an issue report with the list of installed extensions.

For other errors, feel free to file a new GitHub issue.

Contributing

We welcome contributions to the Agent Scheduler Extension project! Please feel free to submit issues, bug reports, and feature requests through the GitHub repository.

Please give us a ⭐ if you find this extension helpful!

License

This project is licensed under the Apache License 2.0.

Disclaimer

The author(s) of this project are not responsible for any damages or legal issues arising from the use of this software. Users are solely responsible for ensuring that they comply with any applicable laws and regulations when using this software and assume all risks associated with its use. The author(s) are not responsible for any copyright violations or legal issues arising from the use of input or output content.

CRAFTED BY THE PEOPLE BUILDING SIPHER//AGI, PROTOGAIA, ATHERLABS & SIPHER ODYSSEY

About ProtoGAIA

ProtoGAIA offers powerful collaboration features for Generative AI Image workflows. It is designed to help designers and creative professionals of all levels collaborate more efficiently, unleash their creativity, and have full transparency and tracking over the creation process.

Current protoGAIA Features

Like any open project that seeks to bring the power of Generative AI to the masses, ProtoGAIA offers the following key features:

✅ Seamless Access: available on desktop and mobile
✅ Powerful Macro Abilities that allow the chaining of tasks, which are then packaged as a Macro Command ready for AI Agent Automation
✅ Multiplayer & Collaborative UX: strong collaboration features, such as real-time commenting and feedback, version control, and image/file/project sharing
✅ Rooms Chat for lively discussion between users and running Generative AI workflows right in the chat
✅ Custom Models Management including Lora, Diffusion Models, Controlnet Models and more
✅ Powerful semantic search capabilities
✅ Powerful AI driven chat box that can trigger quick Generative AI tasks and workflows
✅ Building on the shoulders of giants, leveraging A1111/Vladmandic and other pioneers, to provide a collaboration process from idea to final results in one platform
✅ Automation tooling for certain repeated tasks
✅ Secure and transparent, leveraging hashing and metadata to track the origin and history of models, loras, and images to allow for traceability and ease of collaboration
✅ Personalized UIUX for both beginner and experienced users to quickly remix existing SD images by editing prompts and negative prompts, selecting new training models and output quality as desired
✅ Provenance Tracking for all models, loras, and images to allow for traceability and ease of collaboration
✅ Custom UIUX for both beginner and experienced users to quickly remix existing SD images by editing prompts and negative prompts, selecting new training models and output quality as desired
✅ Articles and Tutorials for learning Generative AI
✅ Voting System for best generative AI images, models, recipes, macros etc.
✅ Open sharing of generative AI images, models, recipes, macros etc. via the Global Explore tab

Target Audience

ProtoGAIA is designed for the following target audiences:

Creators
Small Design Teams or Freelancers
Design Agencies & Game Studios
AI Agents

🎉 Stay Tuned for Updates

We hope you find this extension to be useful. We will be adding new features and improvements over time as we enhance this extension to support our creative workflows.

To stay up-to-date with the latest news and updates, be sure to follow us on GitHub and Twitter. We welcome your feedback and suggestions, and are excited to hear how AgentScheduler can help you streamline your workflow and unleash your creativity!

Alternative AI tools for sd-webui-agent-scheduler

Similar Open Source Tools

deep-research

Deep Research is a lightning-fast tool that uses powerful AI models to generate comprehensive research reports in just a few minutes. It leverages advanced 'Thinking' and 'Task' models, combined with an internet connection, to provide fast and insightful analysis on various topics. The tool ensures privacy by processing and storing all data locally. It supports multi-platform deployment, offers support for various large language models, web search functionality, knowledge graph generation, research history preservation, local and server API support, PWA technology, multi-key payload support, multi-language support, and is built with modern technologies like Next.js and Shadcn UI. Deep Research is open-source under the MIT License.

Stars: 4.0k

UFO

UFO is a UI-focused dual-agent framework to fulfill user requests on Windows OS by seamlessly navigating and operating within individual or spanning multiple applications.

Stars: 6.6k

llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLMs, execute structured function calls, and get structured output (objects). It provides a simple yet robust interface and supports llama-cpp-python and OpenAI endpoints with GBNF grammar support (like the llama-cpp-python server) and the llama.cpp backend server. It works by generating a formal GGML-BNF grammar of the user-defined structures and functions, which llama.cpp then uses to generate text that is valid under that grammar. In contrast to most GBNF grammar generators, it also supports nested objects, dictionaries, enums, and lists of them.

Stars: 454

patchwork

PatchWork is an open-source framework designed for automating development tasks using large language models. It enables users to automate workflows such as PR reviews, bug fixing, security patching, and more through a self-hosted CLI agent and preferred LLMs. The framework consists of reusable atomic actions called Steps, customizable LLM prompts known as Prompt Templates, and LLM-assisted automations called Patchflows. Users can run Patchflows locally in their CLI/IDE or as part of CI/CD pipelines. PatchWork offers predefined patchflows like AutoFix, PRReview, GenerateREADME, DependencyUpgrade, and ResolveIssue, with the flexibility to create custom patchflows. Prompt templates are used to pass queries to LLMs and can be customized. Contributions to new patchflows, steps, and the core framework are encouraged, with chat assistants available to aid in the process. The roadmap includes expanding the patchflow library, introducing a debugger and validation module, supporting large-scale code embeddings, parallelization, fine-tuned models, and an open-source GUI. PatchWork is licensed under AGPL-3.0 terms, while custom patchflows and steps can be shared using the Apache-2.0 licensed patchwork template repository.

Stars: 1.3k

open-parse

Open Parse is a Python library for visually discerning document layouts and chunking them effectively. It is designed to fill the gap in open-source libraries for handling complex documents. Unlike text splitting, which converts a file to raw text and slices it up, Open Parse visually analyzes documents for superior LLM input. It also supports basic markdown for parsing headings, bold, and italics, and has high-precision table support, extracting tables into clean Markdown formats with accuracy that surpasses traditional tools. Open Parse is extensible, allowing users to easily implement their own post-processing steps. It is also intuitive, with great editor support and completion everywhere, making it easy to use and learn.

Stars: 2.4k

aiconfig

AIConfig is a framework that makes it easy to build generative AI applications for production. It manages generative AI prompts, models and model parameters as JSON-serializable configs that can be version controlled, evaluated, monitored and opened in a local editor for rapid prototyping. It allows you to store and iterate on generative AI behavior separately from your application code, offering a streamlined AI development workflow.

Stars: 833

voice-chat-ai

Voice Chat AI is a project that allows users to interact with different AI characters using speech. Users can choose from various characters with unique personalities and voices, and have conversations or role play with them. The project supports OpenAI, xAI, or Ollama language models for chat, and provides text-to-speech synthesis using XTTS, OpenAI TTS, or ElevenLabs. Users can seamlessly integrate visual context into conversations by having the AI analyze their screen. The project offers easy configuration through environment variables and can be run via WebUI or Terminal. It also includes a huge selection of built-in characters for engaging conversations.

Stars: 193

crewAI

CrewAI is a cutting-edge framework designed to orchestrate role-playing autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks. It enables AI agents to assume roles, share goals, and operate in a cohesive unit, much like a well-oiled crew. Whether you're building a smart assistant platform, an automated customer service ensemble, or a multi-agent research team, CrewAI provides the backbone for sophisticated multi-agent interactions. With features like role-based agent design, autonomous inter-agent delegation, flexible task management, and support for various LLMs, CrewAI offers a dynamic and adaptable solution for both development and production workflows.

Stars: 38.6k

deer-flow

DeerFlow is a community-driven Deep Research framework that combines language models with specialized tools for tasks like web search, crawling, and Python code execution. It supports FaaS deployment and one-click deployment based on Volcengine. The framework includes core capabilities like LLM integration, search and retrieval, RAG integration, MCP seamless integration, human collaboration, report post-editing, and content creation. The architecture is based on a modular multi-agent system with components like Coordinator, Planner, Research Team, and Text-to-Speech integration. DeerFlow also supports interactive mode, human-in-the-loop mechanism, and command-line arguments for customization.

Stars: 17.2k

ChatGPT-desktop

ChatGPT Desktop Application is a multi-platform tool that provides a powerful AI wrapper for generating text. It offers features like text-to-speech, exporting chat history in various formats, automatic application upgrades, system tray hover window, support for slash commands, customization of global shortcuts, and pop-up search. The application is built using Tauri and aims to enhance user experience by simplifying text generation tasks. It is available for Mac, Windows, and Linux, and is designed for personal learning and research purposes.

Stars: 84

Upsonic

Upsonic offers a cutting-edge enterprise-ready framework for orchestrating LLM calls, agents, and computer use to complete tasks cost-effectively. It provides reliable systems, scalability, and a task-oriented structure for real-world cases. Key features include production-ready scalability, task-centric design, MCP server support, tool-calling server, computer use integration, and easy addition of custom tools. The framework supports client-server architecture and allows seamless deployment on AWS, GCP, or locally using Docker.

Stars: 7.7k

chatgpt-vscode

ChatGPT-VSCode is a Visual Studio Code integration that allows users to prompt OpenAI's GPT-4, GPT-3.5, GPT-3, and Codex models within the editor. It offers features like using improved models via OpenAI API Key, Azure OpenAI Service deployments, generating commit messages, storing conversation history, explaining and suggesting fixes for compile-time errors, viewing code differences, and more. Users can customize prompts, quick fix problems, save conversations, and export conversation history. The extension is designed to enhance developer experience by providing AI-powered assistance directly within VS Code.

Stars: 1.2k

AI-Scientist

The AI Scientist is a comprehensive system for fully automatic scientific discovery, enabling Foundation Models to perform research independently. It aims to tackle the grand challenge of developing agents capable of conducting scientific research and discovering new knowledge. The tool generates papers on various topics using Large Language Models (LLMs) and provides a platform for exploring new research ideas. Users can create their own templates for specific areas of study and run experiments to generate papers. However, caution is advised as the codebase executes LLM-written code, which may pose risks such as the use of potentially dangerous packages and web access.

Stars: 10.2k

cosdata

Cosdata is a cutting-edge AI data platform designed to power the next generation search pipelines. It features immutability, version control, and excels in semantic search, structured knowledge graphs, hybrid search capabilities, real-time search at scale, and ML pipeline integration. The platform is customizable, scalable, efficient, enterprise-grade, easy to use, and can manage multi-modal data. It offers high performance, indexing, low latency, and high requests per second. Cosdata is designed to meet the demands of modern search applications, empowering businesses to harness the full potential of their data.

Stars: 110

Deep-Live-Cam

Deep-Live-Cam is a software tool designed to assist artists in tasks such as animating custom characters or using characters as models for clothing. The tool includes built-in checks to prevent unethical applications, such as working on inappropriate media. Users are expected to use the tool responsibly and adhere to local laws, especially when using real faces for deepfake content. The tool supports both CPU and GPU acceleration for faster processing and provides a user-friendly GUI for swapping faces in images or videos.

Stars: 45.2k

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

Stars: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

Stars: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

Stars: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

Stars: 668

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

Stars: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM:

* Set LLM usage limits for users on different pricing tiers
* Track LLM usage on a per user and per organization basis
* Block or redact requests containing PIIs
* Improve LLM reliability with failovers, retries and caching
* Distribute API keys with rate limits and cost limits for internal development/production use cases
* Distribute API keys with rate limits and cost limits for students

Stars: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

Stars: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

Stars: 2.2k