comfyui_LLM_party
Dify in ComfyUI includes Omost,GPT-sovits, ChatTTS,GOT-OCR2.0, and FLUX prompt nodes,access to Feishu,discord,and adapts to all llms with similar openai/gemini interfaces, such as o1,ollama, qwen, GLM, deepseek, moonshot,doubao. Adapted to local llms, vlm, gguf such as llama-3.2, Linkage neo4j KG, graphRAG / RAG / html 2 img
Stars: 879
COMFYUI LLM PARTY is a node library designed for LLM workflow development in ComfyUI, an extremely minimalist UI interface primarily used for AI drawing and SD model-based workflows. The project aims to provide a complete set of nodes for constructing LLM workflows, enabling users to easily integrate them into existing SD workflows. It features various functionalities such as API integration, local large model integration, RAG support, code interpreters, online queries, conditional statements, looping links for large models, persona mask attachment, and tool invocations for weather lookup, time lookup, knowledge base, code execution, web search, and single-page search. Users can rapidly develop web applications using API + Streamlit and utilize LLM as a tool node. Additionally, the project includes an omnipotent interpreter node that allows the large model to perform any task, with recommendations to use the 'show_text' node for display output.
README:
Comfyui_llm_party aims to develop a complete set of nodes for LLM workflow construction based on comfyui as the front end. It allows users to quickly and conveniently build their own LLM workflows and easily integrate them into their existing image workflows.
https://github.com/user-attachments/assets/945493c0-92b3-4244-ba8f-0c4b2ad4eba6
ComfyUI LLM Party, from the most basic LLM multi-tool call, role setting to quickly build your own exclusive AI assistant, to the industry-specific word vector RAG and GraphRAG to localize the management of the industry knowledge base; from a single agent pipeline, to the construction of complex agent-agent radial interaction mode and ring interaction mode; from the access to their own social APP (QQ, Feishu, Discord) required by individual users, to the one-stop LLM + TTS + ComfyUI workflow required by streaming media workers; from the simple start of the first LLM application required by ordinary students, to the various parameter debugging interfaces commonly used by scientific researchers, model adaptation. All of this, you can find the answer in ComfyUI LLM Party.
- Drag the following workflows into your comfyui, then use comfyui-Manager to install the missing nodes.
- Use API to call LLM: start_with_LLM_api
- Manage local LLM with ollama: start_with_Ollama
- Use local LLM in distributed format: start_with_LLM_local
- Use local LLM in GGUF format: start_with_LLM_GGUF
- Use local VLM in distributed format: start_with_VLM_local (testing, currently only supports Llama-3.2-Vision-Instruct)
- Use local VLM in GGUF format: start_with_VLM_GGUF
- If you are using API, fill in your
base_url
(it can be a relay API, make sure it ends with/v1/
), for example:https://api.openai.com/v1/
andapi_key
in the API LLM loader node. - If you are using ollama, turn on the
is_ollama
option in the API LLM loader node, no need to fill inbase_url
andapi_key
. - If you are using a local model, fill in your model path in the local model loader node, for example:
E:\model\Llama-3.2-1B-Instruct
. You can also fill in the Huggingface model repo id in the local model loader node, for example:lllyasviel/omost-llama-3-8b-4bits
. - Due to the high usage threshold of this project, even if you choose the quick start, I hope you can patiently read through the project homepage.
- The automatic model name list node has been removed and replaced with a simple API LLM loader node, which automatically retrieves your model name list from the configuration in your config.ini file. You just need to select a name to load the model. Additionally, the simple LLM loader, simple LLM-GGUF loader, simple VLM loader, simple VLM-GGUF loader, and simple LLM lora loader nodes have been updated. They all automatically read the model paths from the model folder within the party folder, making it easier for everyone to load various local models.
- LLMs can now dynamically load lora like SD and FLUX. You can chain multiple loras to load more loras on the same LLM. Example workflow: start_with_LLM_LORA.
- Added the searxng tool, which can aggregate searches across the entire web. Perplexica also relies on this aggregation search tool, so you can set up a Perplexica at your party. You can deploy the searxng/searxng public image in Docker, then start it using
docker run -d -p 8080:8080 searxng/searxng
, and access it usinghttp://localhost:8080
. You can fill in this URLhttp://localhost:8080
in the party's searxng tool, and then you can use searxng as a tool for LLM. -
Major Update!!! Now you can encapsulate any ComfyUI workflow into an LLM tool node. You can have your LLM control multiple ComfyUI workflows simultaneously. When you want it to complete some tasks, it can choose the appropriate ComfyUI workflow based on your prompt, complete your task, and return the result to you. Example workflow: comfyui_workflows_tool. The specific steps are as follows:
- First, connect the text input interface of the workflow you want to encapsulate as a tool to the "user_prompt" output of the "Start Workflow" node. This is where the prompt passed in when the LLM calls the tool.
- Connect the positions where you want to output text and images to the corresponding input positions of the "End Workflow" node.
- Save this workflow as an API (you need to enable developer mode in the settings to see this button).
- Save this workflow to the workflow_api folder of this project.
- Restart ComfyUI and create a simple LLM workflow, such as: start_with_LLM_api.
- Add a "Workflow Tool" node to this LLM node and connect it to the tool input of the LLM node.
- In the "Workflow Tool" node, write the name of the workflow file you want to call in the first input box, for example: draw.json. You can write multiple workflow file names. In the second input box, write the function of each workflow so that the LLM understands how to use these workflows.
- Run it to see the LLM call your encapsulated workflow and return the result to you. If the return is an image, connect the "Preview Image" node to the image output of the LLM node to view the generated image. Note! This method calls a new ComfyUI on your 8190 port, please do not occupy this port. A new terminal will be opened on Windows and Mac systems, please do not close it. The Linux system uses the screen process to achieve this, when you do not need to use it, close this screen process, otherwise, it will always occupy your port.
-
For the instructions for using the node, please refer to: how to use nodes
-
If there are any issues with the plugin or you have other questions, feel free to join the QQ group: 931057213 | discord:discord.
-
Please refer to the workflow tutorial: Workflow Tutorial, thanks to HuangYuChuh for your contribution!
-
Advanced workflow gameplay account:openart
-
More workflows please refer to the workflow folder.
- Support all API calls in openai format(Combined with oneapi can call almost all LLM APIs, also supports all transit APIs), base_url selection reference config.ini.example, which has been tested so far:
- openai (Perfectly compatible with all OpenAI models, including the 4o and o1 series!)
- ollama (Recommended! If you are calling locally, it is highly recommended to use the ollama method to host your local model!)
- Azure OpenAI
- llama.cpp (Recommended! If you want to use the local gguf format model, you can use the llama.cpp project's API to access this project!)
- Tongyi Qianwen /qwen
- zhipu qingyan/glm
- deepseek
- kimi/moonshot
- doubao
- API calls that support Gemini format:
- Compatible with most local models in the transformer library (the model type on the local LLM model chain node has been changed to LLM, VLM-GGUF, and LLM-GGUF, corresponding to directly loading LLM models, loading VLM models, and loading GGUF format LLM models). If your VLM or GGUF format LLM model reports an error, please download the latest version of llama-cpp-python from llama-cpp-python. Currently tested models include:
- ClosedCharacter/Peach-9B-8k-Roleplay(Recommended! Role-playing model)
- lllyasviel/omost-llama-3-8b-4bits(Recommended! Rich prompt model)
- meta-llama/llama-2-7b-chat-hf
- Qwen/Qwen2-7B-Instruct
- xtuner/llava-llama-3-8b-v1_1-gguf
- lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
- meta-llama/Llama-3.2-11B-Vision-Instruct
- Model download
- Baidu cloud address, extraction code: qyhu
- You can configure the language in
config.ini
, currently only Chinese (zh_CN) and English (en_US), the default is your system language. - Install using one of the following methods:
- Search for comfyui_LLM_party in the comfyui manager and install it with one click.
- Restart comfyui.
- Navigate to the
custom_nodes
subfolder under the ComfyUI root folder. - Clone this repository with
git clone https://github.com/heshengtao/comfyui_LLM_party.git
.
- Click
CODE
in the upper right corner. - Click
download zip
. - Unzip the downloaded package into the
custom_nodes
subfolder under the ComfyUI root folder.
- Navigate to the
comfyui_LLM_party
project folder. - Enter
pip install -r requirements.txt
in the terminal to deploy the third-party libraries required by the project into the comfyui environment. Please ensure you are installing within the comfyui environment and pay attention to anypip
errors in the terminal. - If you are using the comfyui launcher, you need to enter
path_in_launcher_configuration\python_embeded\python.exe -m pip install -r requirements.txt
in the terminal to install. Thepython_embeded
folder is usually at the same level as yourComfyUI
folder. - If you have some environment configuration problems, you can try to use the dependencies in
requirements_fixed.txt
.
APIKEY can be configured using one of the following methods
- Open the
config.ini
file in the project folder of thecomfyui_LLM_party
. - Enter your openai_api_key, base_url in
config.ini
. - If you are using an ollama model, fill in
http://127.0.0.1:11434/v1/
inbase_url
,ollama
inopenai_api_key
, and your model name inmodel_name
, for example:llama3
. - If you want to use Google search or Bing search tools, enter your
google_api_key
,cse_id
orbing_api_key
inconfig.ini
. - If you want to use image input LLM, it is recommended to use image bed imgbb and enter your imgbb_api in
config.ini
. - Each model can be configured separately in the
config.ini
file, which can be filled in by referring to theconfig.ini.example
file. After you configure it, just entermodel_name
on the node.
- Open the comfyui interface.
- Create a Large Language Model (LLM) node and enter your openai_api_key and base_url directly in the node.
- If you use the ollama model, use LLM_api node, fill in
http://127.0.0.1:11434/v1/
inbase_url
node, fill inollama
inapi_key
, and fill in your model name inmodel_name
, for example:llama3
. - If you want to use image input LLM, it is recommended to use graph bed imgbb and enter your
imgbb_api_key
on the node.
- You can right-click in the comfyui interface, select
llm
from the context menu, and you will find the nodes for this project. how to use nodes - Supports API integration or local large model integration. Modular implementation for tool invocation.When entering the base_url, please use a URL that ends with
/v1/
.You can use ollama to manage your model. Then, enterhttp://127.0.0.1:11434/v1/
for the base_url,ollama
for the api_key, and your model name for the model_name, such as: llama3.
- API access sample workflow: start_with_LLM_api
- Local model access sample workflow: start_with_LLM_local
- ollama access sample workflow: ollama
- Local knowledge base integration with RAG support.sample workflow: Knowledge Base RAG Search
- Ability to invoke code interpreters.
- Enables online queries, including Google search support.sample workflow: movie query workflow
- Implement conditional statements within ComfyUI to categorize user queries and provide targeted responses.sample workflow: intelligent customer service
- Supports looping links for large models, allowing two large models to engage in debates.sample workflow: Tram Challenge Debate
- Attach any persona mask, customize prompt templates.
- Supports various tool invocations, including weather lookup, time lookup, knowledge base, code execution, web search, and single-page search.
- Use LLM as a tool node.sample workflow: LLM Matryoshka dolls
- Rapidly develop your own web applications using API + Streamlit.
- Added a dangerous omnipotent interpreter node that allows the large model to perform any task.
- It is recommended to use the
show_text
node under thefunction
submenu of the right-click menu as the display output for the LLM node. - Supported the visual features of GPT-4O!sample workflow:GPT-4o
- A new workflow intermediary has been added, which allows your workflow to call other workflows!sample workflow:Invoke another workflow
- Adapted to all models with an interface similar to OpenAI, such as: Tongyi Qianwen/QWEN, Zhigu Qingyan/GLM, DeepSeek, Kimi/Moonshot. Please fill in the base_url, api_key, and model_name of these models into the LLM node to call them.
- Added an LVM loader, now you can call the LVM model locally, support lava-llama-3-8b-v1_1-gguf model, other LVM models should theoretically run if they are GGUF format.The example workflow can be found here: start_with_LVM.json.
- I wrote a
fastapi.py
file, and if you run it directly, you’ll get an OpenAI interface onhttp://127.0.0.1:8817/v1/
. Any application that can call GPT can now invoke your comfyui workflow! I will create a tutorial to demonstrate the details on how to do this. - I’ve separated the LLM loader and the LLM chain, dividing the model loading and model configuration. This allows for sharing models across different LLM nodes!
- macOS and mps devices are now supported! Thanks to bigcat88 for their contribution!
- You can build your own interactive novel game, and go to different endings according to the user's choice! Example workflow reference: interactive_novel
- Adapted to OpenAI's whisper and tts functions, voice input and output can be realized. Example workflow reference: voice_input&voice_output
- Compatible with Omost!!! Please download omost-llama-3-8b-4bits to experience it now! Sample workflow reference: start_with_OMOST
- Added LLM tools to send messages to WeCom, DingTalk, and Feishu, as well as external functions to call.
- Added a new text iterator, which can output only part of the characters at a time. It is safe to split the text according to Carriage Return and chunk size, and will not be divided from the middle of the text. chunk_overlap refers to how many characters the divided text overlaps. In this way, you can enter super long text in batches, as long as you don't have a brain to click, or open the loop in comfyui to execute, it can be automatically executed. Remember to turn on the is_locked property, which can automatically lock the workflow at the end of the input and will not continue to execute. Example workflow: text iteration input
- Added the model name attribute to the local LLM loader, local llava loader. If it is empty, it will be loaded using various local paths in the node. If it is not empty, it will be loaded using the path parameters you fill in yourself in
config.ini
. If it is not empty and not inconfig.ini
, it will be downloaded from huggingface or loaded from the model save directory of huggingface. If you want to download from huggingface, please fill in the format of for example:THUDM/glm-4-9b-chat
.Attention! Models loaded in this way must be adapted to the transformer library. - Added JSON file parsing node and JSON value node, which allows you to get the value of a key from a file or text. Thanks to guobalove for your contribution!
- Improved the code of tool call. Now LLM without tool call function can also open is_tools_in_sys_prompt attribute (local LLM does not need to be opened by default, automatic adaptation). After opening, the tool information will be added to the system prompt word, so that LLM can call the tool.Related papers on implementation principles: Achieving Tool Calling Functionality in LLMs Using Only Prompt Engineering Without Fine-Tuning
- A new custom_tool folder is created to store the code of the custom tool. You can refer to the code in the custom_tool folder, put the code of the custom tool into the custom_tool folder, and you can call the custom tool in LLM.
- Added Knowledge Graph tool, so that LLM and Knowledge Graph can interact perfectly. LLM can modify Knowledge Graph according to your input, and can reason on Knowledge Graph to get the answers you need. Example workflow reference: graphRAG_neo4j
- Added personality AI function, 0 code to develop your own girlfriend AI or boyfriend AI, unlimited dialogue, permanent memory, stable personality. Example workflow reference: Mylover Personality AI
- You can use this LLM tool maker to automatically generate LLM tools, save the tool code you generated as a python file, and then copy the code to the custom_tool folder, and then you create a new node. Example workflow: LLM tool generator.
- It supports duckduckgo search, but it has significant limitations. It seems that only English keywords can be entered, and multiple concepts cannot appear in keywords. The advantage is that there are no APIkey restrictions.
- It supports the function of calling multiple knowledge bases separately, and it is possible to specify which knowledge base is used to answer questions in the prompt word. Example workflow: multiple knowledge bases are called separately.
- Support LLM input extra parameters, including advanced parameters such as json out. Example workflow: LLM input extra parameters.Separate prompt words with json_out.
- Added the function of connecting the agent to discord. (still testing)
- Added the function of connecting the agent to Feishu, thank you very much guobalove for your contribution! Refer to the workflow Feishu robot.
- Added universal API call node and a large number of auxiliary nodes for constructing the request body and grabbing the information in the response.
- Added empty model node, you can uninstall LLM from video memory at any location!
- The chatTTS node has been added, thank you very much for the contribution of guobalove!
model_path
parameter can be empty! It is recommended to useHF
mode to load the model, the model will be automatically downloaded from hugging face, no need to download manually; if usinglocal
loading, please put the model'sasset
andconfig
folders in the root directory. Baidu cloud address, extraction code: qyhu; if usingcustom
mode to load, please put the model'sasset
andconfig
folders undermodel_path
. - Updated a series of conversion nodes: markdown to HTML, svg to image, HTML to image, mermaid to image, markdown to Excel.
- Compatible with the llama3.2 vision model, supports multi-turn dialogue, visual functions. Model address: meta-llama/Llama-3.2-11B-Vision-Instruct. Example workflow: llama3.2_vision.
- Adapted GOT-OCR2, supports formatted output results, supports fine text recognition using position boxes and colors. Model address: GOT-OCR2. Example workflow converts a screenshot of a webpage into HTML code and then opens the browser to display this webpage: img2web.
- The local LLM loader nodes have been significantly adjusted, so you no longer need to choose the model type yourself. The llava loader node and GGUF loader node have been re-added. The model type on the local LLM model chain node has been changed to LLM, VLM-GGUF, and LLM-GGUF, corresponding to directly loading LLM models, loading VLM models, and loading GGUF format LLM models. VLM models and GGUF format LLM models are now supported again. Local calls can now be compatible with more models! Example workflows: LLM_local, llava, GGUF
- Added EasyOCR node for recognizing text and positions in images. It can generate corresponding masks and return a JSON string for LLM to view. There are standard and premium versions available for everyone to choose from!
- In the comfyui LLM party, the strawberry system of the chatgpt-o1 series model was reproduced, referring to the prompts of Llamaberry. Example workflow: Strawberry system compared to o1.
- A new GPT-sovits node has been added, allowing you to call the GPT-sovits model to convert text into speech based on your reference audio. You can also fill in the path of your fine-tuned model (if not filled, the base model will be used for inference) to get any desired voice. To use it, you need to download the GPT-sovits project and the corresponding base model locally, then start the API service with
runtime\python.exe api_v2.py
in the GPT-sovits project folder. Additionally, the chatTTS node has been moved to comfyui LLM mafia. The reason is that chatTTS has many dependencies, and its license on PyPi is CC BY-NC 4.0, which is a non-commercial license. Even though the chatTTS GitHub project is under the AGPL license, we moved the chatTTS node to comfyui LLM mafia to avoid unnecessary trouble. We hope everyone understands! - Now supports OpenAI’s latest model, the o1 series!
- Added a local file control tool that allows the LLM to control files in your specified folder, such as reading, writing, appending, deleting, renaming, moving, and copying files.Due to the potential danger of this node, it is included in comfyui LLM mafia.
- New SQL tools allow LLM to query SQL databases.
- Updated the multilingual version of the README. Workflow for translating the README document: translate_readme
- Updated 4 iterator nodes (text iterator, picture iterator, excel iterator, json iterator). The iterator modes are: sequential, random, and infinite. The order will be output in sequence until the index limit is exceeded, the process will be automatically aborted, and the index value will be reset to 0. Random will choose a random index output, and infinite will loop output.
- Added Gemini API loader node, now compatible with Gemini official API!Since Gemini generates an error with a return code of 500 if the returned parameter contains Chinese characters during the tool call, some tool nodes are unavailable.example workflow:start_with_gemini
- Added lore book node, you can insert your background settings when talking to LLM, example workflow: lorebook
- Added FLUX prompt word generator mask node, which can generate Hearthstone cards, Game King cards, posters, comics and other styles of prompt words, which can make the FLUX model straight out. Reference workflow: FLUX prompt word
- More model adaptations;
- More ways to build agents;
- More automation features;
- More knowledge base management features;
- More tools, more personas.
This open-source project and its contents (hereinafter referred to as "Project") are provided for reference purposes only and do not imply any form of warranty, either expressed or implied. The contributors of the Project shall not be held responsible for the completeness, accuracy, reliability, or suitability of the Project. Any reliance you place on the Project is strictly at your own risk. In no event shall the contributors of the Project be liable for any indirect, special, or consequential damages or any damages whatsoever resulting from the use of the Project.
Some of the nodes in this project have borrowed from the following projects. Thank you for your contributions to the open-source community!
If there is a problem with the plugin or you have any other questions, please join our community.
- discord:discord link
- QQ group:
931057213
- WeChat group:
Choo-Yong
(enter the group after adding the small assistant WeChat)
- If you want to continue to pay attention to the latest features of this project, please follow the Bilibili account: Party host BB machine
- The OpenArt account is continuously updated with the most useful party workflows:openart
If my work has brought value to your day, consider fueling it with a coffee! Your support not only energizes the project but also warms the heart of the creator. ☕💖 Every cup makes a difference!
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for comfyui_LLM_party
Similar Open Source Tools
comfyui_LLM_party
COMFYUI LLM PARTY is a node library designed for LLM workflow development in ComfyUI, an extremely minimalist UI interface primarily used for AI drawing and SD model-based workflows. The project aims to provide a complete set of nodes for constructing LLM workflows, enabling users to easily integrate them into existing SD workflows. It features various functionalities such as API integration, local large model integration, RAG support, code interpreters, online queries, conditional statements, looping links for large models, persona mask attachment, and tool invocations for weather lookup, time lookup, knowledge base, code execution, web search, and single-page search. Users can rapidly develop web applications using API + Streamlit and utilize LLM as a tool node. Additionally, the project includes an omnipotent interpreter node that allows the large model to perform any task, with recommendations to use the 'show_text' node for display output.
GlaDOS
This project aims to create a real-life version of GLaDOS, an aware, interactive, and embodied AI entity. It involves training a voice generator, developing a 'Personality Core,' implementing a memory system, providing vision capabilities, creating 3D-printable parts, and designing an animatronics system. The software architecture focuses on low-latency voice interactions, utilizing a circular buffer for data recording, text streaming for quick transcription, and a text-to-speech system. The project also emphasizes minimal dependencies for running on constrained hardware. The hardware system includes servo- and stepper-motors, 3D-printable parts for GLaDOS's body, animations for expression, and a vision system for tracking and interaction. Installation instructions cover setting up the TTS engine, required Python packages, compiling llama.cpp, installing an inference backend, and voice recognition setup. GLaDOS can be run using 'python glados.py' and tested using 'demo.ipynb'.
atomic_agents
Atomic Agents is a modular and extensible framework designed for creating powerful applications. It follows the principles of Atomic Design, emphasizing small and single-purpose components. Leveraging Pydantic for data validation and serialization, the framework offers a set of tools and agents that can be combined to build AI applications. It depends on the Instructor package and supports various APIs like OpenAI, Cohere, Anthropic, and Gemini. Atomic Agents is suitable for developers looking to create AI agents with a focus on modularity and flexibility.
ollama-autocoder
Ollama Autocoder is a simple to use autocompletion engine that integrates with Ollama AI. It provides options for streaming functionality and requires specific settings for optimal performance. Users can easily generate text completions by pressing a key or using a command pallete. The tool is designed to work with Ollama API and a specified model, offering real-time generation of text suggestions.
chronon
Chronon is a platform that simplifies and improves ML workflows by providing a central place to define features, ensuring point-in-time correctness for backfills, simplifying orchestration for batch and streaming pipelines, offering easy endpoints for feature fetching, and guaranteeing and measuring consistency. It offers benefits over other approaches by enabling the use of a broad set of data for training, handling large aggregations and other computationally intensive transformations, and abstracting away the infrastructure complexity of data plumbing.
trackmania_rl_public
This repository contains the reinforcement learning training code for Trackmania AI with Reinforcement Learning. It is a research work-in-progress project that aims to apply reinforcement learning principles to play Trackmania. The code is constantly evolving and may not be clean or easily usable. The training hyperparameters are intentionally changed in the public repository to encourage understanding of reinforcement learning principles. The project may not receive active support for setup or usage at the moment.
airbroke
Airbroke is an open-source error catcher tool designed for modern web applications. It provides a PostgreSQL-based backend with an Airbrake-compatible HTTP collector endpoint and a React-based frontend for error management. The tool focuses on simplicity, maintaining a small database footprint even under heavy data ingestion. Users can ask AI about issues, replay HTTP exceptions, and save/manage bookmarks for important occurrences. Airbroke supports multiple OAuth providers for secure user authentication and offers occurrence charts for better insights into error occurrences. The tool can be deployed in various ways, including building from source, using Docker images, deploying on Vercel, Render.com, Kubernetes with Helm, or Docker Compose. It requires Node.js, PostgreSQL, and specific system resources for deployment.
AppAgent
AppAgent is a novel LLM-based multimodal agent framework designed to operate smartphone applications. Our framework enables the agent to operate smartphone applications through a simplified action space, mimicking human-like interactions such as tapping and swiping. This novel approach bypasses the need for system back-end access, thereby broadening its applicability across diverse apps. Central to our agent's functionality is its innovative learning method. The agent learns to navigate and use new apps either through autonomous exploration or by observing human demonstrations. This process generates a knowledge base that the agent refers to for executing complex tasks across different applications.
foyle
Foyle is a project focused on building agents to assist software developers in deploying and operating software. It aims to improve agent performance by collecting human feedback on agent suggestions and human examples of reasoning traces. Foyle utilizes a literate environment using vscode notebooks to interact with infrastructure, capturing prompts, AI-provided answers, and user corrections. The goal is to continuously retrain AI to enhance performance. Additionally, Foyle emphasizes the importance of reasoning traces for training agents to work with internal systems, providing a self-documenting process for operations and troubleshooting.
amazon-transcribe-live-call-analytics
The Amazon Transcribe Live Call Analytics (LCA) with Agent Assist Sample Solution is designed to help contact centers assess and optimize caller experiences in real time. It leverages Amazon machine learning services like Amazon Transcribe, Amazon Comprehend, and Amazon SageMaker to transcribe and extract insights from contact center audio. The solution provides real-time supervisor and agent assist features, integrates with existing contact centers, and offers a scalable, cost-effective approach to improve customer interactions. The end-to-end architecture includes features like live call transcription, call summarization, AI-powered agent assistance, and real-time analytics. The solution is event-driven, ensuring low latency and seamless processing flow from ingested speech to live webpage updates.
llama3-tokenizer-js
JavaScript tokenizer for LLaMA 3 designed for client-side use in the browser and Node, with TypeScript support. It accurately calculates token count, has 0 dependencies, optimized running time, and somewhat optimized bundle size. Compatible with most LLaMA 3 models. Can encode and decode text, but training is not supported. Pollutes global namespace with `llama3Tokenizer` in the browser. Mostly compatible with LLaMA 3 models released by Facebook in April 2024. Can be adapted for incompatible models by passing custom vocab and merge data. Handles special tokens and fine tunes. Developed by belladore.ai with contributions from xenova, blaze2004, imoneoi, and ConProgramming.
lfai-landscape
LF AI & Data Landscape is a map to explore open source projects in the AI & Data domains, highlighting companies that are members of LF AI & Data. It showcases members of the Foundation and is modelled after the Cloud Native Computing Foundation landscape. The landscape includes current version, interactive version, new entries, logos, proper SVGs, corrections, external data, best practices badge, non-updated items, license, formats, installation, vulnerability reporting, and adjusting the landscape view.
FigStep
FigStep is a black-box jailbreaking algorithm against large vision-language models (VLMs). It feeds harmful instructions through the image channel and uses benign text prompts to induce VLMs to output contents that violate common AI safety policies. The tool highlights the vulnerability of VLMs to jailbreaking attacks, emphasizing the need for safety alignments between visual and textual modalities.
llama-on-lambda
This project provides a proof of concept for deploying a scalable, serverless LLM Generative AI inference engine on AWS Lambda. It leverages the llama.cpp project to enable the usage of more accessible CPU and RAM configurations instead of limited and expensive GPU capabilities. By deploying a container with the llama.cpp converted models onto AWS Lambda, this project offers the advantages of scale, minimizing cost, and maximizing compute availability. The project includes AWS CDK code to create and deploy a Lambda function leveraging your model of choice, with a FastAPI frontend accessible from a Lambda URL. It is important to note that you will need ggml quantized versions of your model and model sizes under 6GB, as your inference RAM requirements cannot exceed 9GB or your Lambda function will fail.
PromptAgent
PromptAgent is a repository for a novel automatic prompt optimization method that crafts expert-level prompts using language models. It provides a principled framework for prompt optimization by unifying prompt sampling and rewarding using MCTS algorithm. The tool supports different models like openai, palm, and huggingface models. Users can run PromptAgent to optimize prompts for specific tasks by strategically sampling model errors, generating error feedbacks, simulating future rewards, and searching for high-reward paths leading to expert prompts.
llm.c
LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of cPython. For example, training GPT-2 (CPU, fp32) is ~1,000 lines of clean code in a single file. It compiles and runs instantly, and exactly matches the PyTorch reference implementation. I chose GPT-2 as the first working example because it is the grand-daddy of LLMs, the first time the modern stack was put together.
For similar tasks
comfyui_LLM_party
COMFYUI LLM PARTY is a node library designed for LLM workflow development in ComfyUI, an extremely minimalist UI interface primarily used for AI drawing and SD model-based workflows. The project aims to provide a complete set of nodes for constructing LLM workflows, enabling users to easily integrate them into existing SD workflows. It features various functionalities such as API integration, local large model integration, RAG support, code interpreters, online queries, conditional statements, looping links for large models, persona mask attachment, and tool invocations for weather lookup, time lookup, knowledge base, code execution, web search, and single-page search. Users can rapidly develop web applications using API + Streamlit and utilize LLM as a tool node. Additionally, the project includes an omnipotent interpreter node that allows the large model to perform any task, with recommendations to use the 'show_text' node for display output.
hollama
Hollama is a minimal web-UI tool designed for interacting with Ollama servers. It features large prompt fields, streams completions, ability to copy completions as raw text, Markdown parsing with syntax highlighting, and saves sessions/context in the browser's localStorage. Users can access the latest version of Hollama at https://hollama.fernando.is without sign up, and data is stored locally on the browser. The tool can also be run as a Docker image by executing a specific command. Developers can connect to an Ollama server by updating the ORIGIN settings. Hollama facilitates easy development by providing instructions to set up the environment, install dependencies, and start a development server. Building a production version of the app is straightforward with a single command, and deployment may require installing an adapter for the target environment.
holmesgpt
HolmesGPT is an open-source DevOps assistant powered by OpenAI or any tool-calling LLM of your choice. It helps in troubleshooting Kubernetes, incident response, ticket management, automated investigation, and runbook automation in plain English. The tool connects to existing observability data, is compliance-friendly, provides transparent results, supports extensible data sources, runbook automation, and integrates with existing workflows. Users can install HolmesGPT using Brew, prebuilt Docker container, Python Poetry, or Docker. The tool requires an API key for functioning and supports OpenAI, Azure AI, and self-hosted LLMs.
For similar jobs
promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.
deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.
MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".
leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.
llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.
carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.
AI-YinMei
AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.