data:image/s3,"s3://crabby-images/74c83/74c83df2ebf176f02fdd6a78b77f5efae33d2d47" alt="comfyui_LLM_party"
comfyui_LLM_party
LLM Agent Framework in ComfyUI includes MCP sever, Omost,GPT-sovits, ChatTTS,GOT-OCR2.0, and FLUX prompt nodes,access to Feishu,discord,and adapts to all llms with similar openai / aisuite interfaces, such as o1,ollama, gemini, grok, qwen, GLM, deepseek, kimi,doubao. Adapted to local llms, vlm, gguf such as llama-3.3, Linkage graphRAG / RAG
Stars: 1248
data:image/s3,"s3://crabby-images/09af9/09af983f4e17be114722910d7e3725a9e1dfede1" alt="screenshot"
COMFYUI LLM PARTY is a node library designed for LLM workflow development in ComfyUI, an extremely minimalist UI interface primarily used for AI drawing and SD model-based workflows. The project aims to provide a complete set of nodes for constructing LLM workflows, enabling users to easily integrate them into existing SD workflows. It features various functionalities such as API integration, local large model integration, RAG support, code interpreters, online queries, conditional statements, looping links for large models, persona mask attachment, and tool invocations for weather lookup, time lookup, knowledge base, code execution, web search, and single-page search. Users can rapidly develop web applications using API + Streamlit and utilize LLM as a tool node. Additionally, the project includes an omnipotent interpreter node that allows the large model to perform any task, with recommendations to use the 'show_text' node for display output.
README:
Comfyui_llm_party aims to develop a complete set of nodes for LLM workflow construction based on comfyui as the front end. It allows users to quickly and conveniently build their own LLM workflows and easily integrate them into their existing image workflows.
https://github.com/user-attachments/assets/945493c0-92b3-4244-ba8f-0c4b2ad4eba6
ComfyUI LLM Party, from the most basic LLM multi-tool call, role setting to quickly build your own exclusive AI assistant, to the industry-specific word vector RAG and GraphRAG to localize the management of the industry knowledge base; from a single agent pipeline, to the construction of complex agent-agent radial interaction mode and ring interaction mode; from the access to their own social APP (QQ, Feishu, Discord) required by individual users, to the one-stop LLM + TTS + ComfyUI workflow required by streaming media workers; from the simple start of the first LLM application required by ordinary students, to the various parameter debugging interfaces commonly used by scientific researchers, model adaptation. All of this, you can find the answer in ComfyUI LLM Party.
- If you have never used ComfyUI and encounter some dependency issues while installing the LLM party in ComfyUI, please click here to download the Windows portable package that includes the LLM party. Please note that this portable package contains only the party and manager plugins, and is exclusively compatible with the Windows operating system.(If you need to install LLM party into an existing comfyui, this step can be skipped.)
- Drag the following workflows into your comfyui, then use comfyui-Manager to install the missing nodes.
- Use API to call LLM: start_with_LLM_api
- Using aisuite to call LLM: start_with_aisuite
- Manage local LLM with ollama: start_with_Ollama
- Use local LLM in distributed format: start_with_LLM_local
- Use local LLM in GGUF format: start_with_LLM_GGUF
- Use local VLM in distributed format: start_with_VLM_local (testing, currently only supports Llama-3.2-Vision-Instruct)
- Use local VLM in GGUF format: start_with_VLM_GGUF
- If you are using API, fill in your
base_url
(it can be a relay API, make sure it ends with/v1/
), for example:https://api.openai.com/v1/
andapi_key
in the API LLM loader node. - If you are using ollama, turn on the
is_ollama
option in the API LLM loader node, no need to fill inbase_url
andapi_key
. - If you are using a local model, fill in your model path in the local model loader node, for example:
E:\model\Llama-3.2-1B-Instruct
. You can also fill in the Huggingface model repo id in the local model loader node, for example:lllyasviel/omost-llama-3-8b-4bits
. - Due to the high usage threshold of this project, even if you choose the quick start, I hope you can patiently read through the project homepage.
- A brand new image hosting node has been added, currently supporting the image hosting services at https://sm.ms (with the regional domain for China being https://smms.app) and https://imgbb.com. More image hosting services will be supported in the future. Sample workflow: Image Hosting
-
The imgbb image hosting service, which is compatible by default with the party, has been updated to the domain imgbb. The previous image hosting service was replaced due to its unfriendliness towards users in mainland China.I sincerely apologize, as it seems that the API service for the image hosting at https://imgbb.io has been discontinued. Therefore, the code has reverted to the original https://imgbb.com. Thank you for your understanding. In the future, I will update a node that supports more image hosting services. - The MCP tool has been updated. You can modify the configuration in the 'mcp_config.json' file located in the party project folder to connect to your desired MCP server. You can find various MCP server configuration parameters that you may want to add here: modelcontextprotocol/servers. The default configuration for this project is the Everything server, which serves as a testing MCP server to verify its functionality. Reference workflow: start_with_MCP. Developer note: The MCP tool node can connect to the MCP server you have configured and convert the tools from the server into tools that can be directly used by LLMs. By configuring different local or cloud servers, you can experience all LLM tools available in the world.
-
For the instructions for using the node, please refer to: how to use nodes
-
If there are any issues with the plugin or you have other questions, feel free to join the QQ group: 931057213 | discord:discord.
-
More workflows please refer to the workflow folder.
data:image/s3,"s3://crabby-images/a88bb/a88bb552060416258a4a1d569e278946764f26e6" alt="octocat"
data:image/s3,"s3://crabby-images/28f79/28f79fddbc3dae25179b58adba0b6dc8f3f5cc08" alt="octocat"
- Support all API calls in openai format(Combined with oneapi can call almost all LLM APIs, also supports all transit APIs), base_url selection reference config.ini.example, which has been tested so far:
- openai (Perfectly compatible with all OpenAI models, including the 4o and o1 series!)
- ollama (Recommended! If you are calling locally, it is highly recommended to use the ollama method to host your local model!)
- Azure OpenAI
- llama.cpp (Recommended! If you want to use the local gguf format model, you can use the llama.cpp project's API to access this project!)
- Grok
- Tongyi Qianwen /qwen
- zhipu qingyan/glm
- deepseek
- kimi/moonshot
- doubao
- spark
- Gemini(The original Gemini API LLM loader node has been deprecated in the new version. Please use the LLM API loader node, with the base_url selected as: https://generativelanguage.googleapis.com/v1beta/)
- Support for all API calls compatible with aisuite:
- Compatible with most local models in the transformer library (the model type on the local LLM model chain node has been changed to LLM, VLM-GGUF, and LLM-GGUF, corresponding to directly loading LLM models, loading VLM models, and loading GGUF format LLM models). If your VLM or GGUF format LLM model reports an error, please download the latest version of llama-cpp-python from llama-cpp-python. Currently tested models include:
- ClosedCharacter/Peach-9B-8k-Roleplay(Recommended! Role-playing model)
- lllyasviel/omost-llama-3-8b-4bits(Recommended! Rich prompt model)
- meta-llama/llama-2-7b-chat-hf
- Qwen/Qwen2-7B-Instruct
- openbmb/MiniCPM-V-2_6-gguf
- lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
- meta-llama/Llama-3.2-11B-Vision-Instruct
- Model download
- Quark cloud address
- Baidu cloud address, extraction code: qyhu
- You can configure the language in
config.ini
, currently only Chinese (zh_CN) and English (en_US), the default is your system language. - Install using one of the following methods:
- Search for comfyui_LLM_party in the comfyui manager and install it with one click.
- Restart comfyui.
- Navigate to the
custom_nodes
subfolder under the ComfyUI root folder. - Clone this repository with
git clone https://github.com/heshengtao/comfyui_LLM_party.git
.
- Click
CODE
in the upper right corner. - Click
download zip
. - Unzip the downloaded package into the
custom_nodes
subfolder under the ComfyUI root folder.
- Navigate to the
comfyui_LLM_party
project folder. - Enter
pip install -r requirements.txt
in the terminal to deploy the third-party libraries required by the project into the comfyui environment. Please ensure you are installing within the comfyui environment and pay attention to anypip
errors in the terminal. - If you are using the comfyui launcher, you need to enter
path_in_launcher_configuration\python_embeded\python.exe -m pip install -r requirements.txt
in the terminal to install. Thepython_embeded
folder is usually at the same level as yourComfyUI
folder. - If you have some environment configuration problems, you can try to use the dependencies in
requirements_fixed.txt
.
APIKEY can be configured using one of the following methods
- Open the
config.ini
file in the project folder of thecomfyui_LLM_party
. - Enter your openai_api_key, base_url in
config.ini
. - If you are using an ollama model, fill in
http://127.0.0.1:11434/v1/
inbase_url
,ollama
inopenai_api_key
, and your model name inmodel_name
, for example:llama3
. - If you want to use Google search or Bing search tools, enter your
google_api_key
,cse_id
orbing_api_key
inconfig.ini
. - If you want to use image input LLM, it is recommended to use image bed imgbb and enter your imgbb_api in
config.ini
. - Each model can be configured separately in the
config.ini
file, which can be filled in by referring to theconfig.ini.example
file. After you configure it, just entermodel_name
on the node.
- Open the comfyui interface.
- Create a Large Language Model (LLM) node and enter your openai_api_key and base_url directly in the node.
- If you use the ollama model, use LLM_api node, fill in
http://127.0.0.1:11434/v1/
inbase_url
node, fill inollama
inapi_key
, and fill in your model name inmodel_name
, for example:llama3
. - If you want to use image input LLM, it is recommended to use graph bed imgbb and enter your
imgbb_api_key
on the node.
- More model adaptations;
- More ways to build agents;
- More automation features;
- More knowledge base management features;
- More tools, more personas.
This open-source project and its contents (hereinafter referred to as "Project") are provided for reference purposes only and do not imply any form of warranty, either expressed or implied. The contributors of the Project shall not be held responsible for the completeness, accuracy, reliability, or suitability of the Project. Any reliance you place on the Project is strictly at your own risk. In no event shall the contributors of the Project be liable for any indirect, special, or consequential damages or any damages whatsoever resulting from the use of the Project.
Some of the nodes in this project have borrowed from the following projects. Thank you for your contributions to the open-source community!
If there is a problem with the plugin or you have any other questions, please join our community.
- discord:discord link
- QQ group:
931057213
- WeChat group:
we_glm
(enter the group after adding the small assistant WeChat)
- If you want to continue to pay attention to the latest features of this project, please follow the Bilibili account: Party host BB machine
- youtube@comfyui-LLM-party
If my work has brought value to your day, consider fueling it with a coffee! Your support not only energizes the project but also warms the heart of the creator. ☕💖 Every cup makes a difference!
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for comfyui_LLM_party
Similar Open Source Tools
data:image/s3,"s3://crabby-images/09af9/09af983f4e17be114722910d7e3725a9e1dfede1" alt="comfyui_LLM_party Screenshot"
comfyui_LLM_party
COMFYUI LLM PARTY is a node library designed for LLM workflow development in ComfyUI, an extremely minimalist UI interface primarily used for AI drawing and SD model-based workflows. The project aims to provide a complete set of nodes for constructing LLM workflows, enabling users to easily integrate them into existing SD workflows. It features various functionalities such as API integration, local large model integration, RAG support, code interpreters, online queries, conditional statements, looping links for large models, persona mask attachment, and tool invocations for weather lookup, time lookup, knowledge base, code execution, web search, and single-page search. Users can rapidly develop web applications using API + Streamlit and utilize LLM as a tool node. Additionally, the project includes an omnipotent interpreter node that allows the large model to perform any task, with recommendations to use the 'show_text' node for display output.
data:image/s3,"s3://crabby-images/8954f/8954f8ab9291f47e20b7815e976c342c570e0e32" alt="Pandrator Screenshot"
Pandrator
Pandrator is a GUI tool for generating audiobooks and dubbing using voice cloning and AI. It transforms text, PDF, EPUB, and SRT files into spoken audio in multiple languages. It leverages XTTS, Silero, and VoiceCraft models for text-to-speech conversion and voice cloning, with additional features like LLM-based text preprocessing and NISQA for audio quality evaluation. The tool aims to be user-friendly with a one-click installer and a graphical interface.
data:image/s3,"s3://crabby-images/ce140/ce140d9b2e3038283f604da301c44e045c06ddd4" alt="FigStep Screenshot"
FigStep
FigStep is a black-box jailbreaking algorithm against large vision-language models (VLMs). It feeds harmful instructions through the image channel and uses benign text prompts to induce VLMs to output contents that violate common AI safety policies. The tool highlights the vulnerability of VLMs to jailbreaking attacks, emphasizing the need for safety alignments between visual and textual modalities.
data:image/s3,"s3://crabby-images/51aa7/51aa71ba1cd6c05ebd95d8d3d60bf3b7836c8e71" alt="gemma Screenshot"
gemma
Gemma is a family of open-weights Large Language Model (LLM) by Google DeepMind, based on Gemini research and technology. This repository contains an inference implementation and examples, based on the Flax and JAX frameworks. Gemma can run on CPU, GPU, and TPU, with model checkpoints available for download. It provides tutorials, reference implementations, and Colab notebooks for tasks like sampling and fine-tuning. Users can contribute to Gemma through bug reports and pull requests. The code is licensed under the Apache License, Version 2.0.
data:image/s3,"s3://crabby-images/85536/85536269fa4e97ff6d78c70c6a0601c48e48f3ab" alt="llama3-tokenizer-js Screenshot"
llama3-tokenizer-js
JavaScript tokenizer for LLaMA 3 designed for client-side use in the browser and Node, with TypeScript support. It accurately calculates token count, has 0 dependencies, optimized running time, and somewhat optimized bundle size. Compatible with most LLaMA 3 models. Can encode and decode text, but training is not supported. Pollutes global namespace with `llama3Tokenizer` in the browser. Mostly compatible with LLaMA 3 models released by Facebook in April 2024. Can be adapted for incompatible models by passing custom vocab and merge data. Handles special tokens and fine tunes. Developed by belladore.ai with contributions from xenova, blaze2004, imoneoi, and ConProgramming.
data:image/s3,"s3://crabby-images/1fe71/1fe71b6e89e57eea8e65c1ee045817e4542be5b4" alt="OpenGlass Screenshot"
OpenGlass
OpenGlass is an open-source project that allows users to transform any regular glasses into smart glasses using affordable off-the-shelf components. With a cost of less than $25, users can enhance their glasses to record their daily activities, recognize people, identify objects, translate text, and more. The project provides detailed instructions on hardware setup and software installation, making it accessible for DIY enthusiasts and tech enthusiasts alike. By following the steps outlined in the repository, users can create their own smart glasses and explore various functionalities offered by the project.
data:image/s3,"s3://crabby-images/b5d33/b5d339fc6536aef7b4f387eafde926ecf4517102" alt="council Screenshot"
council
Council is an open-source platform designed for the rapid development and deployment of customized generative AI applications using teams of agents. It extends the LLM tool ecosystem by providing advanced control flow and scalable oversight for AI agents. Users can create sophisticated agents with predictable behavior by leveraging Council's powerful approach to control flow using Controllers, Filters, Evaluators, and Budgets. The framework allows for automated routing between agents, comparing, evaluating, and selecting the best results for a task. Council aims to facilitate packaging and deploying agents at scale on multiple platforms while enabling enterprise-grade monitoring and quality control.
data:image/s3,"s3://crabby-images/07aae/07aae9f6695cd8ec2ef21b69b3a66013b94ae689" alt="ersilia Screenshot"
ersilia
The Ersilia Model Hub is a unified platform of pre-trained AI/ML models dedicated to infectious and neglected disease research. It offers an open-source, low-code solution that provides seamless access to AI/ML models for drug discovery. Models housed in the hub come from two sources: published models from literature (with due third-party acknowledgment) and custom models developed by the Ersilia team or contributors.
data:image/s3,"s3://crabby-images/556c9/556c9d6f4a5051cd993025b97ac45b645524c3b8" alt="aici Screenshot"
aici
The Artificial Intelligence Controller Interface (AICI) lets you build Controllers that constrain and direct output of a Large Language Model (LLM) in real time. Controllers are flexible programs capable of implementing constrained decoding, dynamic editing of prompts and generated text, and coordinating execution across multiple, parallel generations. Controllers incorporate custom logic during the token-by-token decoding and maintain state during an LLM request. This allows diverse Controller strategies, from programmatic or query-based decoding to multi-agent conversations to execute efficiently in tight integration with the LLM itself.
data:image/s3,"s3://crabby-images/1db87/1db878301c67a303e1b1d8989670640fc37b01df" alt="cellm Screenshot"
cellm
Cellm is an Excel extension that allows users to leverage Large Language Models (LLMs) like ChatGPT within cell formulas. It enables users to extract AI responses to text ranges, making it useful for automating repetitive tasks that involve data processing and analysis. Cellm supports various models from Anthropic, Mistral, OpenAI, and Google, as well as locally hosted models via Llamafiles, Ollama, or vLLM. The tool is designed to simplify the integration of AI capabilities into Excel for tasks such as text classification, data cleaning, content summarization, entity extraction, and more.
data:image/s3,"s3://crabby-images/66d8f/66d8f65fcb9f5c4f6ca10bcb098a8158cdaeb4cb" alt="serverless-pdf-chat Screenshot"
serverless-pdf-chat
The serverless-pdf-chat repository contains a sample application that allows users to ask natural language questions of any PDF document they upload. It leverages serverless services like Amazon Bedrock, AWS Lambda, and Amazon DynamoDB to provide text generation and analysis capabilities. The application architecture involves uploading a PDF document to an S3 bucket, extracting metadata, converting text to vectors, and using a LangChain to search for information related to user prompts. The application is not intended for production use and serves as a demonstration and educational tool.
data:image/s3,"s3://crabby-images/e119a/e119adb14b7aa46f00245c84ee0ebd4e824a62d1" alt="kafka-ml Screenshot"
kafka-ml
Kafka-ML is a framework designed to manage the pipeline of Tensorflow/Keras and PyTorch machine learning models on Kubernetes. It enables the design, training, and inference of ML models with datasets fed through Apache Kafka, connecting them directly to data streams like those from IoT devices. The Web UI allows easy definition of ML models without external libraries, catering to both experts and non-experts in ML/AI.
data:image/s3,"s3://crabby-images/5cb83/5cb839b7535d0557372ea0ea17ba8f0f7ad394d1" alt="airflow Screenshot"
airflow
Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows. When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative. Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
data:image/s3,"s3://crabby-images/5fca1/5fca1b5093e7ce030b80e4999b4d3a9639d84081" alt="LLMs-World-Models-for-Planning Screenshot"
LLMs-World-Models-for-Planning
This repository provides a Python implementation of a method that leverages pre-trained large language models to construct and utilize world models for model-based task planning. It includes scripts to generate domain models using natural language descriptions, correct domain models based on feedback, and support plan generation for tasks in different domains. The code has been refactored for better readability and includes tools for validating PDDL syntax and handling corrective feedback.
data:image/s3,"s3://crabby-images/877b9/877b96081dda3228b59f794486ce9f0e7bc405a1" alt="engine-core Screenshot"
engine-core
Engine Core is a project that demonstrates a pattern for enabling Large Language Models (LLMs) to undertake tasks with a dynamic system prompt and a collection of tool functions known as chat strategies. These strategies allow for the dynamic alteration of chat history, system prompts, and available tools on every run. The project includes example strategies such as demoStrategy, backendStrategy, and shellStrategy. Additionally, LLM integrations like Anthropic or OpenAI have been extracted into adapters to enable running the same app code and strategies while switching foundation models.
data:image/s3,"s3://crabby-images/d817a/d817aa323f7992cac269e3f2ca1239939fc8eba1" alt="vigenair Screenshot"
vigenair
ViGenAiR is a tool that harnesses the power of Generative AI models on Google Cloud Platform to automatically transform long-form Video Ads into shorter variants, targeting different audiences. It generates video, image, and text assets for Demand Gen and YouTube video campaigns. Users can steer the model towards generating desired videos, conduct A/B testing, and benefit from various creative features. The tool offers benefits like diverse inventory, compelling video ads, creative excellence, user control, and performance insights. ViGenAiR works by analyzing video content, splitting it into coherent segments, and generating variants following Google's best practices for effective ads.
For similar tasks
data:image/s3,"s3://crabby-images/09af9/09af983f4e17be114722910d7e3725a9e1dfede1" alt="comfyui_LLM_party Screenshot"
comfyui_LLM_party
COMFYUI LLM PARTY is a node library designed for LLM workflow development in ComfyUI, an extremely minimalist UI interface primarily used for AI drawing and SD model-based workflows. The project aims to provide a complete set of nodes for constructing LLM workflows, enabling users to easily integrate them into existing SD workflows. It features various functionalities such as API integration, local large model integration, RAG support, code interpreters, online queries, conditional statements, looping links for large models, persona mask attachment, and tool invocations for weather lookup, time lookup, knowledge base, code execution, web search, and single-page search. Users can rapidly develop web applications using API + Streamlit and utilize LLM as a tool node. Additionally, the project includes an omnipotent interpreter node that allows the large model to perform any task, with recommendations to use the 'show_text' node for display output.
data:image/s3,"s3://crabby-images/f0700/f07004d5a506cbbe959fcf888b4a2f2f1670a3d9" alt="n8n Screenshot"
n8n
n8n is a workflow automation platform that combines the flexibility of code with the speed of no-code. It offers 400+ integrations, native AI capabilities, and a fair-code license, empowering users to create powerful automations while maintaining control over data and deployments. With features like code customization, AI agent workflows, self-hosting options, enterprise-ready functionalities, and an active community, n8n provides a comprehensive solution for technical teams seeking efficient workflow automation.
data:image/s3,"s3://crabby-images/d9a23/d9a23ed06b5ea90479ca2be8788ab4ebacf67573" alt="hollama Screenshot"
hollama
Hollama is a minimal web-UI tool designed for interacting with Ollama servers. It features large prompt fields, streams completions, ability to copy completions as raw text, Markdown parsing with syntax highlighting, and saves sessions/context in the browser's localStorage. Users can access the latest version of Hollama at https://hollama.fernando.is without sign up, and data is stored locally on the browser. The tool can also be run as a Docker image by executing a specific command. Developers can connect to an Ollama server by updating the ORIGIN settings. Hollama facilitates easy development by providing instructions to set up the environment, install dependencies, and start a development server. Building a production version of the app is straightforward with a single command, and deployment may require installing an adapter for the target environment.
data:image/s3,"s3://crabby-images/83dbc/83dbc31387020699cdc13b49c3892d39b204051e" alt="holmesgpt Screenshot"
holmesgpt
HolmesGPT is an open-source DevOps assistant powered by OpenAI or any tool-calling LLM of your choice. It helps in troubleshooting Kubernetes, incident response, ticket management, automated investigation, and runbook automation in plain English. The tool connects to existing observability data, is compliance-friendly, provides transparent results, supports extensible data sources, runbook automation, and integrates with existing workflows. Users can install HolmesGPT using Brew, prebuilt Docker container, Python Poetry, or Docker. The tool requires an API key for functioning and supports OpenAI, Azure AI, and self-hosted LLMs.
For similar jobs
data:image/s3,"s3://crabby-images/43708/437080ec744fd1aaa91d5cbae9630bcd2fe48ef0" alt="promptflow Screenshot"
promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.
data:image/s3,"s3://crabby-images/ab8b8/ab8b8cebd0341c74187b3d61aeb87e0f2fb2cdb3" alt="deepeval Screenshot"
deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.
data:image/s3,"s3://crabby-images/e1c9c/e1c9cb6476b28bd2e7747bd8bb648f589e7a8a58" alt="MegaDetector Screenshot"
MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".
data:image/s3,"s3://crabby-images/293f8/293f804c9c75f7eea066dbb9641a9e2a720352a9" alt="leapfrogai Screenshot"
leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.
data:image/s3,"s3://crabby-images/e9e57/e9e57c48e1f1a24513c9f0787d43e28ff7e2f1e0" alt="llava-docker Screenshot"
llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.
data:image/s3,"s3://crabby-images/42ce0/42ce00b37a94142cfef613e1bd0b671a2b2ac93b" alt="carrot Screenshot"
carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.
data:image/s3,"s3://crabby-images/05dd1/05dd14da234de136a653943437543f3f64d17b13" alt="TrustLLM Screenshot"
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.
data:image/s3,"s3://crabby-images/a2f2b/a2f2bf9f354435d8b89f863ff2d3666def187740" alt="AI-YinMei Screenshot"
AI-YinMei
AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.