# prompt-generator-comfyui

Custom AI prompt generator node for ComfyUI. With this node, you can use text generation models to generate prompts. Before using it, the text generation model has to be trained on a prompt dataset.
## Table Of Contents

- Setup
- Features
- Example Workflow
- Pretrained Prompt Models
- Variables
- Troubleshooting
- Contributing
- Example Outputs
## Setup

### Portable version

- [x] Automatic installation is added for the portable version.

- Clone the repository into the `custom_nodes` folder with the `git clone https://github.com/alpertunga-bile/prompt-generator-comfyui.git` command.
- Go to the `ComfyUI_windows_portable` folder and run the `run_nvidia_gpu.bat` file.
- Open the `hires.fixWithPromptGenerator.json` or `basicWorkflowWithPromptGenerator.json` workflow.
- Put your generator under the `models/prompt_generators` folder. You can create your own prompt generator with this repository. The generator has to be a folder; do not put just the `pytorch_model.bin` file, for example.
- Click the `Refresh` button in ComfyUI.
### Manual installation

- Clone the repository into the `custom_nodes` folder with the `git clone https://github.com/alpertunga-bile/prompt-generator-comfyui.git` command.
- Run ComfyUI.
- Open the `hires.fixWithPromptGenerator.json` or `basicWorkflowWithPromptGenerator.json` workflow.
- Put your generator under the `models/prompt_generators` folder. You can create your own prompt generator with this repository. The generator has to be a folder; do not put just the `pytorch_model.bin` file, for example.
- Click the `Refresh` button in ComfyUI.
### ComfyUI Manager

- Download the node with ComfyUI Manager.
- Restart ComfyUI.
- Open the `hires.fixWithPromptGenerator.json` or `basicWorkflowWithPromptGenerator.json` workflow.
- Put your generator under the `models/prompt_generators` folder. You can create your own prompt generator with this repository. The generator has to be a folder; do not put just the `pytorch_model.bin` file, for example.
- Click the `Refresh` button in ComfyUI.
## Features

- Multiple output generation is added. You can choose from 5 outputs with the index value. You can check the generated prompts in the log file and terminal; the prompts are logged and printed in order.
- Randomness is added. See this section.
- Quantization is added with the Quanto and Bitsandbytes packages. See this section.
- Lora adapter model loading is added with the Peft package. (The feature is not fully tested in this repository because of my VRAM, but I am using the same implementation in Google Colab for training and inference and it works there.)
- Optimizations are done with the Optimum package.
- ONNX and transformers models are supported.
- Preprocessing of outputs is added. See this section.
- Recursive generation is supported. See this section.
- The generated text is printed to the terminal, and the node's state is logged under the `generated_prompts` folder with the date as the filename (a small illustration follows this list).
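As a minimal illustration of that logging location (the exact filename format used by the node is an assumption here):

```python
# Illustration of the logging feature above; the exact filename format
# used by the node is an assumption.
from datetime import datetime
from pathlib import Path

log_file = Path("generated_prompts") / f"{datetime.now():%Y-%m-%d}.txt"
print(log_file)  # e.g. generated_prompts/2024-05-01.txt
```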
## Example Workflow

- The Prompt Generator node may look different in the final version, but the workflow in ComfyUI is not going to change.
## Pretrained Prompt Models

- You can find the models at this link.
- To use a pretrained model, follow these steps:
  - Download the model and unzip it to the `models/prompt_generators` folder.
  - Click the `Refresh` button in ComfyUI.
  - Then select the generator with the node's `model_name` variable (if you can't see the generator, restart ComfyUI).
- The dataset currently has 2,150,395 rows of unique prompts.
- The process of data cleaning and gathering can be found here.
- The model versions are used to differentiate the models rather than to show which one is better.
- The v2 version is the latest trained model, and the v4 model is an experimental model.

- female_positive_generator_v2 | (Training In Process)
  - Base model
  - Uses the distilgpt2 model
  - ~500 MB
- female_positive_generator_v3 | (Training In Process)
  - Base model
  - Uses the bigscience/bloom-560m model
  - ~1.3 GB
- female_positive_generator_v4 | Experimental
  - Lora adapter model
  - Uses the senseable/WestLake-7B-v2 model as the base model
  - The base model is ~14 GB
## Variables

| Variable Name | Definition |
| --- | --- |
| model_name | Folder name that contains the model |
| accelerate | Enables optimizations. Some models are not supported by BetterTransformer (check your model). If your model is not supported, switch this option to disable or convert your model to ONNX |
| quantize | Quantizes the model. The available quantization types change based on your OS and torch version. The none value disables quantization. Check this section for more information |
| prompt | Input prompt for the generator |
| seed | Seed value for the model |
| lock | Locks the generation and lets you select from the last generated prompts with the index value |
| random_index | Random index value in [1, 5]. If this value is set to enable, the index value is not used |
| index | User-specified index value for selecting a prompt from the generated prompts. The random_index variable must be set to disable |
| cfg | CFG is enabled by setting guidance_scale > 1. A higher guidance scale encourages the model to generate samples that are more closely linked to the input prompt, usually at the expense of poorer quality |
| min_new_tokens | The minimum number of tokens to generate, ignoring the number of tokens in the prompt |
| max_new_tokens | The maximum number of tokens to generate, ignoring the number of tokens in the prompt |
| do_sample | Whether or not to use sampling; greedy decoding is used otherwise |
| early_stopping | Controls the stopping condition for beam-based methods, like beam search |
| num_beams | Number of beams for beam search; 1 means no beam search |
| num_beam_groups | Number of groups to divide num_beams into in order to ensure diversity among different groups of beams |
| diversity_penalty | This value is subtracted from a beam's score if it generates a token that is the same as any beam from another group at a particular time. Note that diversity_penalty is only effective if group beam search is enabled |
| temperature | How sensitive the algorithm is to selecting low-probability options |
| top_k | The number of highest-probability vocabulary tokens to keep for top-k filtering |
| top_p | If set to a float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation |
| repetition_penalty | The parameter for repetition penalty. 1.0 means no penalty |
| no_repeat_ngram_size | The size of an n-gram that can occur only once (0 = no restriction) |
| remove_invalid_values | Whether to remove possible nan and inf outputs of the model to prevent the generation method from crashing. Note that using remove_invalid_values can slow down generation |
| self_recursive | See this section |
| recursive_level | See this section |
| preprocess_mode | See this section |
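Most of these variables correspond to Hugging Face transformers generation parameters. A minimal sketch of how they might be passed to a text generation model (the model path and values are illustrative assumptions, not the node's exact internals):

```python
# Minimal sketch: how the node's variables could map to Hugging Face
# `generate()` keyword arguments. Paths and values are illustrative
# assumptions, not the node's exact internals.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "models/prompt_generators/female_positive_generator_v2"  # model_name
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer("1girl, masterpiece", return_tensors="pt")  # prompt
outputs = model.generate(
    **inputs,
    guidance_scale=1.0,           # cfg: CFG is active when > 1
    min_new_tokens=20,
    max_new_tokens=50,
    do_sample=True,               # multinomial sampling; False -> greedy decoding
    temperature=1.0,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.0,
    no_repeat_ngram_size=0,
    remove_invalid_values=False,
    num_return_sequences=5,       # the node generates 5 prompts to pick from
)
for i, seq in enumerate(outputs, start=1):
    print(i, tokenizer.decode(seq, skip_special_tokens=True))
```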
### Quantization

- Quantization is added with the Quanto and Bitsandbytes packages.
- The Quanto package requires `torch >= 2.2`, and the Bitsandbytes package works out of the box on Linux. So the node checks which package it can use:
  - If the requirements for these packages are not met, you cannot use the `quantize` variable and it only has the `none` value.
  - If the Quanto requirements are met, you can choose between the `none, int8, float8, int4` values.
  - If the Bitsandbytes requirements are met, you can choose between the `none, int8, int4` values.
- If your environment can use both the Quanto and Bitsandbytes packages, the node selects the Bitsandbytes package (see the sketch below).
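A rough sketch of that availability check, under the stated assumptions (torch >= 2.2 for Quanto, Linux for Bitsandbytes); the import paths and checks are assumptions, not the node's exact detection code:

```python
# Rough sketch of the backend selection described above; import paths
# and checks are assumptions, not the node's exact detection code.
import platform

import torch

def quantize_choices() -> list[str]:
    bnb_ok = False
    try:
        import bitsandbytes  # noqa: F401  (works out of the box on Linux)
        bnb_ok = platform.system() == "Linux"
    except Exception:
        pass

    quanto_ok = False
    try:
        import optimum.quanto  # noqa: F401  (assumed import path)
        major, minor = (int(x) for x in torch.__version__.split(".")[:2])
        quanto_ok = (major, minor) >= (2, 2)
    except Exception:
        pass

    if bnb_ok:        # Bitsandbytes is preferred when both are usable
        return ["none", "int8", "int4"]
    if quanto_ok:
        return ["none", "int8", "float8", "int4"]
    return ["none"]   # quantization disabled
```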
### Randomness

- For random generation, enable do_sample.
- You can find this text generation strategy at the link above; the strategy is called multinomial sampling.
- Setting the do_sample variable to disable gives deterministic generation.
- For more randomness, you can:
  - Enable the random_index variable
  - Increase recursive_level
  - Enable self_recursive

### Lock And Index

- Enabling the lock variable skips the generation and lets you choose from the last generated prompts.
- You can choose with the index value or use random_index.
- If random_index is enabled, the index value is ignored (see the sketch below).
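A small sketch of the lock/index behavior, assuming the last five generated prompts are cached (`select_prompt` is a hypothetical helper, not the node's actual function):

```python
# Hypothetical helper: choose one of the last five generated prompts,
# either by the user-specified index or at random when random_index is on.
import random

def select_prompt(last_prompts: list[str], index: int, random_index: bool) -> str:
    if random_index:
        index = random.randint(1, 5)  # the index variable is ignored in this case
    return last_prompts[index - 1]    # the node's index values are 1-based
```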
### Recursive Generation

- Let's say we give `a,` as the seed and the recursive level is 1. The same outputs are used in both cases below to explain the functionality more accurately (see the sketch after this list).
- With self recursive, say the generator's output is `b`. The next seed is then `b`, and the generator's output is `c`. The final output is `a, c`. This can be used for generating random outputs.
- Without self recursive, say the generator's output is `b`. The next seed is then `a, b`, and the generator's output is `c`. The final output is `a, b, c`. This can be used for more accurate prompts.
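As a sketch of the data flow, the two modes differ only in what is fed back as the next seed; `generate` below is a hypothetical stand-in for the model call, and the string joining is simplified:

```python
# Sketch of the two recursion modes; `generate` is a hypothetical
# stand-in for the model call, and the string joining is simplified.
def recursive_generate(generate, seed: str, level: int, self_recursive: bool) -> str:
    output = generate(seed)            # e.g. seed "a," -> output "b"
    for _ in range(level):             # recursive_level extra rounds
        if self_recursive:
            output = generate(output)  # next seed is only "b" -> "c"
        else:
            seed = f"{seed} {output}"  # next seed keeps history: "a, b"
            output = generate(seed)    # -> "c"
    return f"{seed} {output}"          # "a, c" with self_recursive, "a, b c" without
```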
### Preprocess Modes

- exact_keyword => `(masterpiece), ((masterpiece))` is not allowed. The pure keyword is checked, without parentheses and weights. The algorithm adds prompts from the beginning of the generated text, so add important prompts to the seed.
- exact_prompt => `(masterpiece), ((masterpiece))` is allowed, but `(masterpiece), (masterpiece)` is not. The exact match of the prompt is checked.
- none => Everything is allowed, even repeated prompts.

An example output for each mode is shown below, followed by a sketch of the deduplication logic.
```
# ---------------------------------------------------------------------- Original ---------------------------------------------------------------------- #
((masterpiece)), ((masterpiece:1.2)), (masterpiece), blahblah, blah, blah, ((blahblah)), (((((blah))))), ((same prompt)), same prompt, (masterpiece)
# ------------------------------------------------------------- Preprocess (Exact Keyword) ------------------------------------------------------------- #
((masterpiece)), blahblah, blah, ((same prompt))
# ------------------------------------------------------------- Preprocess (Exact Prompt) -------------------------------------------------------------- #
((masterpiece)), ((masterpiece:1.2)), (masterpiece), blahblah, blah, ((blahblah)), (((((blah))))), ((same prompt)), same prompt
```
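A simplified Python sketch reconstructing the deduplication rules above; the node's actual parsing may be more involved, but run on the Original line this reproduces both preprocessed outputs:

```python
# Simplified reconstruction of the three preprocess modes described above;
# the node's actual parsing may be more involved.
import re

def preprocess(text: str, mode: str) -> str:
    seen: set[str] = set()
    kept: list[str] = []
    for prompt in (p.strip() for p in text.split(",") if p.strip()):
        if mode == "exact_keyword":
            # compare the bare keyword: strip parentheses and ":<weight>"
            key = re.sub(r"[()]|:[\d.]+", "", prompt).strip()
        elif mode == "exact_prompt":
            key = prompt              # compare the prompt verbatim
        else:                         # "none": everything is allowed
            kept.append(prompt)
            continue
        if key not in seen:
            seen.add(key)
            kept.append(prompt)
    return ", ".join(kept)
```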
## Troubleshooting

- If the solutions below do not fix your issue, please create an issue with the `bug` label.
- The node is based on the transformers and optimum packages, so most problems may be caused by these packages. To overcome such problems, you can try updating them.
- For a manual installation:
  - Activate the virtual environment if there is one.
  - Run the `pip install --upgrade transformers optimum optimum[onnxruntime-gpu]` command.
- For the portable version:
  - Go to the `ComfyUI_windows_portable` folder.
  - Open the command prompt in this folder.
  - Run the `.\python_embeded\python.exe -s -m pip install --upgrade transformers optimum optimum[onnxruntime-gpu]` command.
- Check that the virtual environment is activated if there is one.
- Check that you are starting ComfyUI in the `ComfyUI_windows_portable` folder, because the node checks whether the `python_embeded` folder exists and uses it to install the required packages.
- Sometimes the variables are changed with updates, which may break the workflow. But don't worry, you just have to delete the node in the workflow and add it again.
## Contributing

- Contributions are welcome. If you have an idea and want to implement it yourself, please follow these steps:
  - Create a fork.
  - Open a pull request from the fork with a description explaining the new feature.
- If you have an idea but don't know how to implement it, please create an issue with the `enhancement` label.
- Contributions can be made in several ways; you can contribute to the code or to the README file.