
gpt_server
gpt_server是一个用于生产级部署LLMs、Embedding、Reranker、ASR、TTS、文生图、图片编辑和文生视频的开源框架。
Stars: 208

The GPT Server project leverages the basic capabilities of FastChat to provide the capabilities of an openai server. It perfectly adapts more models, optimizes models with poor compatibility in FastChat, and supports loading vllm, LMDeploy, and hf in various ways. It also supports all sentence_transformers compatible semantic vector models, including Chat templates with function roles, Function Calling (Tools) capability, and multi-modal large models. The project aims to reduce the difficulty of model adaptation and project usage, making it easier to deploy the latest models with minimal code changes.
README:
本项目依托fastchat的基础能力来提供openai server的能力.
- 支持Chat、Embedding、ReRanker、text-moderation(文本审核,分类)、ASR、TTS(支持声音克隆)、SD(Stable Diffusion,文生图、文生视频、图片编辑、) 模型的 openai规范 接口服务。
- 支持HF、vLLM、LMDeploy和SGLang 多种加速推理后端引擎。
- 多个模型共用openai server的同一个端口进行调用,自动进行模型调度。
如果 GPT Server 对您有帮助,欢迎留下一个 ⭐ Star!
功能 | 说明 | |
---|---|---|
🎨 | OpenAI服务接口 | 支持 OpenAI 服务接口规范,兼容所有支持 OpenAI的项目工程 |
🚀 | 多后端引擎推理 | 支持 vLLM 、SGLang 、LMDeploy 、HF 多种高性能推理引擎 |
🎯 | Embedding/Reranker | 支持所有兼容Sentence_Transformers 的语义向量或重排模型,支持了Infinity后端,Embedding推理速度大于onnx/tensorrt,支持动态组批 |
🎛️ | Text-moderation(文本审核,分类) | 支持OpenAI 服务接口规范的文本审核,分类 |
📱 | ASR(语音转文本) | 支持基于FunASR 的ASR模型 |
🔊 | TTS(文本转语音) | 支持基于SparkTTS 的TTS模型,支持基于vLLM 、SGLang 后端对齐加速,RTF<<1 ,支持流式音频流输出 |
🖌️ | SD(Stable Diffusion,文生图) | 支持基于diffusers 的 文生图 模型 |
🏔️ | SD(Stable Diffusion,图片编辑) | 支持基于diffusers 的 图片编辑 模型 |
🔄 | 支持LM/VL模型 | 支持多种大语言模型或多模态语言模型 |
🎭 | 推理服务性能测试 | 基于Evalscope 实现Throughput 、TTFT 、TPOT 等服务性能指标 |
- 支持guided_decoding,强制模型按照Schema的要求进行JSON格式输出。
- 支持了Tools(Function Calling)功能,并优化Tools解析方式,大大提高tools的调用成功率。兼容LangChain的 bind_tools、with_structured_output写法(目前支持Qwen系列、GLM系列)
- 支持了cohere库接口规范的 /v1/rerank 接口,在dify中可用。
- 全球唯一扩展了openai库,实现Reranker模型(rerank, /v1/rerank)。(代码样例见gpt_server/tests/test_openai_rerank.py)
- 全球唯一支持了openai库的文本审核模型接口(text-moderation, /v1/moderations)。(代码样例见gpt_server/tests/test_openai_moderation.py)
- 全球唯一支持了openai库的TTS模型接口(tts, /v1/audio/speech)(代码样例见gpt_server/tests/test_openai_tts_stream.py)
- 全球唯一支持了openai库的ASR模型接口(asr, /v1/audio/transcriptions),基于fanasr后端(代码样例见gpt_server/tests/test_openai_transcriptions.py)
- 全球唯一支持了openai库的SD,文生图模型接口(sd, /v1/images/generations),基于diffusers后端(代码样例见gpt_server/tests/test_image_gen.py)
- 全球唯一支持了openai库的SD,文生图模型接口(sd, /v1/images/edits),基于diffusers后端(代码样例见gpt_server/tests/test_image_edit.py)
通过这个样例文件,可以很快的掌握项目的配置方式。
配置文件的详细说明信息位于:config_example.yaml
2025
2025-9-7 支持了 文本编辑模型 (代码样例见gpt_server/tests/test_image_edit.py)
2025-8-8 初步支持了 embedding 的 vllm 加速
2025-6-17 支持了 jina-reranker-m0 全球首个支持多模态多语言的重排模型
2025-6-12 支持了 文生图模型 flux (代码样例见gpt_server/tests/test_image_gen.py)
2025-6-6 支持了 bge-vl 系列 (代码样例见gpt_server/tests/test_openai_embedding_vl.py)
2025-6-6 支持了 ritrieve_zh_v1
2025-4-29 支持了 Qwen3
2025-4-24 支持了 Spark-TTS后端的 TTS
2025-4-14 支持了 SGLang后端以及部分VL模型
2025-4-2 支持了 OpenAI的ASR接口 /v1/audio/transcriptions
2025-4-1 支持了 internvl2.5模型
2025-2-9 支持了 QVQ
2024
2024-12-22 支持了 tts, /v1/audio/speech TTS模型
2024-12-21 支持了 text-moderation, /v1/moderations 文本审核模型
2024-12-14 支持了 phi-4
2024-12-7 支持了 /v1/rerank 接口
2024-12-1 支持了 QWQ-32B-Preview
2024-10-15 支持了 Qwen2-VL
2024-9-19 支持了 minicpmv 模型
2024-8-17 支持了 vllm/hf 后端的 lora 部署
2024-8-14 支持了 InternVL2 系列多模态模型
2024-7-28 支持了 embedding/reranker 的动态组批加速(infinity后端, 比onnx/tensorrt更快)
2024-7-19 支持了多模态模型 glm-4v-gb 的LMDeploy PyTorch后端
2024-6-22 支持了 Qwen系列、ChatGLM系列 function call (tools) 能力
2024-6-12 支持了 qwen-2
2024-6-5 支持了 Yinka、zpoint_large_embedding_zh 嵌入模型
2024-6-5 支持了 glm4-9b系列(hf和vllm)
2024-4-27 支持了 LMDeploy 加速推理后端
2024-4-20 支持了 llama-3
2024-4-13 支持了 deepseek
2024-4-4 支持了 embedding模型 acge_text_embedding
2024-3-9 支持了 reranker 模型 ( bge-reranker,bce-reranker-base_v1)
2024-3-3 支持了 internlm-1.0 ,internlm-2.0
2024-3-2 支持了 qwen-1.5 0.5B, 1.8B, 4B, 7B, 14B, and 72B
2024-2-4 支持了 vllm 实现
2024-1-6 支持了 Yi-34B
2023
2023-12-31 支持了 qwen-7b, qwen-14b
2023-12-30 支持了 all-embedding(理论上支持所有的词嵌入模型)
2023-12-24 支持了 chatglm3-6b
- [X] 支持HF后端
- [X] 支持vLLM后端
- [X] 支持LMDeploy后端
- [X] 支持SGLang后端
- [X] 支持 文本转语音 TTS 模型
- [X] 支持 语音转文本 ASR 模型
- [X] 支持 文本审核 模型
- [X] 支持 function call 功能 (tools)(Qwen系列、ChatGLM系列已经支持,后面有需求再继续扩展)
- [X] 支持多模态模型(初步支持glm-4v,其它模型后续慢慢支持)
- [X] 支持Embedding模型动态组批(实现方式:infinity后端)
- [X] 支持Reranker模型动态组批(实现方式:infinity后端)
- [X] 可视化启动界面(不稳定,对开发人员来说比较鸡肋,后期将弃用!)
- [X] 并行的function call功能(tools)
- [X] 支持 文生图 模型
- [X] 支持 图片编辑 模型
- [ ] 支持 pip install 方式进行安装
# 安装 uv
pip install uv -U # 或查看教程 https://docs.astral.sh/uv/getting-started/installation/#standalone-installer
# uv venv --seed # (可选)创建 uv 虚拟环境,并设置seed
uv sync
source .venv/bin/activate # 激活 uv 环境
# 1. 创建conda 环境
conda create -n gpt_server python=3.10
# 2. 激活conda 环境
conda activate gpt_server
# 3. 安装仓库(一定要使用 install.sh 安装,否则无法解决依赖冲突)
bash install.sh
配置文件的详细说明信息位于:config_example.yaml
# 进入script目录
cd gpt_server/script
# 复制样例配置文件
cp config_example.yaml config.yaml
uv run gpt_server/serving/main.py
或者
sh gpt_server/script/start.sh
或者
python gpt_server/serving/main.py
docker pull 506610466/gpt_server:latest # 如果拉取失败可尝试下面的方式
# 如果国内无法拉取docker镜像,可以尝试下面的国内镜像拉取的方式(不保证国内镜像源一直可用)
docker pull docker.1ms.run/506610466/gpt_server:latest
- 构建镜像
docker build --rm -f "Dockerfile" -t gpt_server:latest "."
docker-compose -f "docker-compose.yml" up -d --build gpt_server
3.3 可视化UI方式启动服务(有Bug,已弃用,欢迎大佬优化代码)
cd gpt_server/serving
streamlit run server_ui.py
见 gpt_server/tests 目录 样例测试代码: https://github.com/shell-nlp/gpt_server/tree/main/tests
cd gpt_server/gpt_server/serving
streamlit run chat_ui.py
Chat UI界面:
推理速度: LMDeploy TurboMind > SGLang > vllm > LMDeploy PyTorch > HF
官方支持的模型本项目可以五分钟之内进行兼容,但由于本人时间关系,暂时本项目只支持了常用的一些模型,如果想要支持其它模型,请提Issue.
Models / BackEnd | model_type | HF | vllm | LMDeploy TurboMind | LMDeploy PyTorch | SGLang |
---|---|---|---|---|---|---|
chatglm4-9b | chatglm | √ | √ | √ | √ | √ |
chatglm3-6b | chatglm | √ | √ | × | √ | √ |
Qwen-1.0--3.0 | qwen | √ | √ | √ | √ | √ |
Yi-34B | yi | √ | √ | √ | √ | √ |
Internlm-1.0--2.0 | internlm | √ | √ | √ | √ | √ |
Deepseek | deepseek | √ | √ | √ | √ | √ |
Llama-3 | llama | √ | √ | √ | √ | √ |
Baichuan-2 | baichuan | √ | √ | √ | √ | √ |
QWQ-32B | qwen | √ | √ | √ | √ | √ |
Phi-4 | phi | √ | √ | × | × | √ |
VLM (视觉大模型榜单 https://rank.opencompass.org.cn/leaderboard-multimodal)
Models / BackEnd | model_type | HF | vllm | LMDeploy TurboMind | LMDeploy PyTorch | SGLang |
---|---|---|---|---|---|---|
glm-4v-9b | chatglm | × | × | × | √ | × |
InternVL2 | internvl | × | × | √ | √ | × |
InternVL2.5--3.5 | internvl | × | × | √ | √ | × |
MiniCPM-V-2.6 | minicpmv | × | √ | √ | × | × |
MiniCPM-V-4.5 | minicpmv | × | √ | × | × | × |
Qwen2-VL | qwen | × | √ | × | √ | √ |
Qwen2.5-VL | qwen | × | √ | × | √ | √ |
QVQ | qwen | × | √ | × | × | × |
原则上支持所有的Embedding/Rerank/Classify模型
推理速度: infinity > sentence_transformers
以下模型经过测试可放心使用:
Models / BackEnd | sentence_transformers | infinity | vllm |
---|---|---|---|
bge-m3 | √ | √ | √ |
bge-embedding | √ | √ | √ |
bce-embedding | √ | √ | √ |
puff | √ | √ | √ |
piccolo-base-zh-embedding | √ | √ | √ |
acge_text_embedding | √ | √ | √ |
Yinka | √ | √ | √ |
zpoint_large_embedding_zh | √ | √ | √ |
xiaobu-embedding | √ | √ | √ |
Conan-embedding-v1 | √ | √ | √ |
qwen3-embedding | √ | √ | √ |
ritrieve_zh_v1 | √ | √ | √ |
jina-embeddings-v3 | √ | √ | √ |
KoalaAI/Text-Moderation(文本审核/多分类,审核文本是否存在暴力、色情等) | × | √ | × |
protectai/deberta-v3-base-prompt-injection-v2(提示注入/2分类,审核文本为提示注入) | × | √ | × |
bge-vl | √ | × | × |
jina-reranker-m0 | √ | × | × |
bge-reranker | √ | √ | × |
bce-reranker | √ | √ | × |
目前 ritrieve_zh_v1 C-MTEB榜单排行第一(MTEB: https://huggingface.co/spaces/mteb/leaderboard)
ASR (支持FunASR非实时模型 https://github.com/modelscope/FunASR/blob/main/README_zh.md)
目前只测试了SenseVoiceSmall模型(性能最优的),其它模型的支持情况只是从官方文档中拷贝过来,不一定可以正常使用,欢迎测试/提issue。
Models / BackEnd | model_type |
---|---|
SenseVoiceSmall | funasr |
paraformer-zh | funasr |
paraformer-en | funasr |
conformer-en | funasr |
Whisper-large-v3 | funasr |
Whisper-large-v3-turbo | funasr |
Qwen-Audio | funasr |
Qwen-Audio-Chat | funasr |
Models / BackEnd | model_type |
---|---|
Spark-TTS | spark_tts |
Models / BackEnd | model_type |
---|---|
flux | flux |
Models / BackEnd | model_type |
---|---|
Qwen-Image-Edit | qwen_image_edit |
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for gpt_server
Similar Open Source Tools

gpt_server
The GPT Server project leverages the basic capabilities of FastChat to provide the capabilities of an openai server. It perfectly adapts more models, optimizes models with poor compatibility in FastChat, and supports loading vllm, LMDeploy, and hf in various ways. It also supports all sentence_transformers compatible semantic vector models, including Chat templates with function roles, Function Calling (Tools) capability, and multi-modal large models. The project aims to reduce the difficulty of model adaptation and project usage, making it easier to deploy the latest models with minimal code changes.

DISC-LawLLM
DISC-LawLLM is a legal domain large model that aims to provide professional, intelligent, and comprehensive **legal services** to users. It is developed and open-sourced by the Data Intelligence and Social Computing Lab (Fudan-DISC) at Fudan University.

pmhub
PmHub is a smart project management system based on SpringCloud, SpringCloud Alibaba, and LLM. It aims to help students quickly grasp the architecture design and development process of microservices/distributed projects. PmHub provides a platform for students to experience the transformation from monolithic to microservices architecture, understand the pros and cons of both architectures, and prepare for job interviews. It offers popular technologies like SpringCloud-Gateway, Nacos, Sentinel, and provides high-quality code, continuous integration, product design documents, and an enterprise workflow system. PmHub is suitable for beginners and advanced learners who want to master core knowledge of microservices/distributed projects.

MedicalGPT
MedicalGPT is a training medical GPT model with ChatGPT training pipeline, implement of Pretraining, Supervised Finetuning, RLHF(Reward Modeling and Reinforcement Learning) and DPO(Direct Preference Optimization).

AstrBot
AstrBot is an open-source one-stop Agentic chatbot platform and development framework. It supports large model conversations, multiple messaging platforms, Agent capabilities, plugin extensions, and WebUI for visual configuration and management of the chatbot.

llms-from-scratch-cn
This repository provides a detailed tutorial on how to build your own large language model (LLM) from scratch. It includes all the code necessary to create a GPT-like LLM, covering the encoding, pre-training, and fine-tuning processes. The tutorial is written in a clear and concise style, with plenty of examples and illustrations to help you understand the concepts involved. It is suitable for developers and researchers with some programming experience who are interested in learning more about LLMs and how to build them.

MiniCPM
MiniCPM is a series of open-source large models on the client side jointly developed by Face Intelligence and Tsinghua University Natural Language Processing Laboratory. The main language model MiniCPM-2B has only 2.4 billion (2.4B) non-word embedding parameters, with a total of 2.7B parameters. - After SFT, MiniCPM-2B performs similarly to Mistral-7B on public comprehensive evaluation sets (better in Chinese, mathematics, and code capabilities), and outperforms models such as Llama2-13B, MPT-30B, and Falcon-40B overall. - After DPO, MiniCPM-2B also surpasses many representative open-source large models such as Llama2-70B-Chat, Vicuna-33B, Mistral-7B-Instruct-v0.1, and Zephyr-7B-alpha on the current evaluation set MTBench, which is closest to the user experience. - Based on MiniCPM-2B, a multi-modal large model MiniCPM-V 2.0 on the client side is constructed, which achieves the best performance of models below 7B in multiple test benchmarks, and surpasses larger parameter scale models such as Qwen-VL-Chat 9.6B, CogVLM-Chat 17.4B, and Yi-VL 34B on the OpenCompass leaderboard. MiniCPM-V 2.0 also demonstrates leading OCR capabilities, approaching Gemini Pro in scene text recognition capabilities. - After Int4 quantization, MiniCPM can be deployed and inferred on mobile phones, with a streaming output speed slightly higher than human speech speed. MiniCPM-V also directly runs through the deployment of multi-modal large models on mobile phones. - A single 1080/2080 can efficiently fine-tune parameters, and a single 3090/4090 can fully fine-tune parameters. A single machine can continuously train MiniCPM, and the secondary development cost is relatively low.

AstrBot
AstrBot is a powerful and versatile tool that leverages the capabilities of large language models (LLMs) like GPT-3, GPT-3.5, and GPT-4 to enhance communication and automate tasks. It seamlessly integrates with popular messaging platforms such as QQ, QQ Channel, and Telegram, enabling users to harness the power of AI within their daily conversations and workflows.

MindChat
MindChat is a psychological large language model designed to help individuals relieve psychological stress and solve mental confusion, ultimately improving mental health. It aims to provide a relaxed and open conversation environment for users to build trust and understanding. MindChat offers privacy, warmth, safety, timely, and convenient conversation settings to help users overcome difficulties and challenges, achieve self-growth, and development. The tool is suitable for both work and personal life scenarios, providing comprehensive psychological support and therapeutic assistance to users while strictly protecting user privacy. It combines psychological knowledge with artificial intelligence technology to contribute to a healthier, more inclusive, and equal society.

ai-app
The 'ai-app' repository is a comprehensive collection of tools and resources related to artificial intelligence, focusing on topics such as server environment setup, PyCharm and Anaconda installation, large model deployment and training, Transformer principles, RAG technology, vector databases, AI image, voice, and music generation, and AI Agent frameworks. It also includes practical guides and tutorials on implementing various AI applications. The repository serves as a valuable resource for individuals interested in exploring different aspects of AI technology.

Chinese-LLaMA-Alpaca
This project open sources the **Chinese LLaMA model and the Alpaca large model fine-tuned with instructions**, to further promote the open research of large models in the Chinese NLP community. These models **extend the Chinese vocabulary based on the original LLaMA** and use Chinese data for secondary pre-training, further enhancing the basic Chinese semantic understanding ability. At the same time, the Chinese Alpaca model further uses Chinese instruction data for fine-tuning, significantly improving the model's understanding and execution of instructions.

go-cyber
Cyber is a superintelligence protocol that aims to create a decentralized and censorship-resistant internet. It uses a novel consensus mechanism called CometBFT and a knowledge graph to store and process information. Cyber is designed to be scalable, secure, and efficient, and it has the potential to revolutionize the way we interact with the internet.

Chinese-LLaMA-Alpaca-2
Chinese-LLaMA-Alpaca-2 is a large Chinese language model developed by Meta AI. It is based on the Llama-2 model and has been further trained on a large dataset of Chinese text. Chinese-LLaMA-Alpaca-2 can be used for a variety of natural language processing tasks, including text generation, question answering, and machine translation. Here are some of the key features of Chinese-LLaMA-Alpaca-2: * It is the largest Chinese language model ever trained, with 13 billion parameters. * It is trained on a massive dataset of Chinese text, including books, news articles, and social media posts. * It can be used for a variety of natural language processing tasks, including text generation, question answering, and machine translation. * It is open-source and available for anyone to use. Chinese-LLaMA-Alpaca-2 is a powerful tool that can be used to improve the performance of a wide range of natural language processing tasks. It is a valuable resource for researchers and developers working in the field of artificial intelligence.

Llama-Chinese
Llama中文社区是一个专注于Llama模型在中文方面的优化和上层建设的高级技术社区。 **已经基于大规模中文数据,从预训练开始对Llama2模型进行中文能力的持续迭代升级【Done】**。**正在对Llama3模型进行中文能力的持续迭代升级【Doing】** 我们热忱欢迎对大模型LLM充满热情的开发者和研究者加入我们的行列。

jiwu-mall-chat-tauri
Jiwu Chat Tauri APP is a desktop chat application based on Nuxt3 + Tauri + Element Plus framework. It provides a beautiful user interface with integrated chat and social functions. It also supports AI shopping chat and global dark mode. Users can engage in real-time chat, share updates, and interact with AI customer service through this application.

VoiceBench
VoiceBench is a repository containing code and data for benchmarking LLM-Based Voice Assistants. It includes a leaderboard with rankings of various voice assistant models based on different evaluation metrics. The repository provides setup instructions, datasets, evaluation procedures, and a curated list of awesome voice assistants. Users can submit new voice assistant results through the issue tracker for updates on the ranking list.
For similar tasks

ai-on-gke
This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE). Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. A robust AI/ML platform considers the following layers: Infrastructure orchestration that support GPUs and TPUs for training and serving workloads at scale Flexible integration with distributed computing and data processing frameworks Support for multiple teams on the same infrastructure to maximize utilization of resources

ray
Ray is a unified framework for scaling AI and Python applications. It consists of a core distributed runtime and a set of AI libraries for simplifying ML compute, including Data, Train, Tune, RLlib, and Serve. Ray runs on any machine, cluster, cloud provider, and Kubernetes, and features a growing ecosystem of community integrations. With Ray, you can seamlessly scale the same code from a laptop to a cluster, making it easy to meet the compute-intensive demands of modern ML workloads.

labelbox-python
Labelbox is a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services. Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.

djl
Deep Java Library (DJL) is an open-source, high-level, engine-agnostic Java framework for deep learning. It is designed to be easy to get started with and simple to use for Java developers. DJL provides a native Java development experience and allows users to integrate machine learning and deep learning models with their Java applications. The framework is deep learning engine agnostic, enabling users to switch engines at any point for optimal performance. DJL's ergonomic API interface guides users with best practices to accomplish deep learning tasks, such as running inference and training neural networks.

mlflow
MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud). MLflow's current components are:
* `MLflow Tracking

tt-metal
TT-NN is a python & C++ Neural Network OP library. It provides a low-level programming model, TT-Metalium, enabling kernel development for Tenstorrent hardware.

burn
Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.

awsome-distributed-training
This repository contains reference architectures and test cases for distributed model training with Amazon SageMaker Hyperpod, AWS ParallelCluster, AWS Batch, and Amazon EKS. The test cases cover different types and sizes of models as well as different frameworks and parallel optimizations (Pytorch DDP/FSDP, MegatronLM, NemoMegatron...).
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.