new-api

AI模型接口管理与分发系统，支持将多种大模型转为统一格式调用，支持OpenAI、Claude等格式，可供个人或者企业内部管理与分发渠道使用，本项目基于One API二次开发。🍥 The next-generation LLM gateway and AI asset management system supports multiple languages.

Stars: 6490

Visit

New API is an open-source project based on One API with additional features and improvements. It offers a new UI interface, supports Midjourney-Proxy(Plus) interface, online recharge functionality, model-based charging, channel weight randomization, data dashboard, token-controlled models, Telegram authorization login, Suno API support, Rerank model integration, and various third-party models. Users can customize models, retry channels, and configure caching settings. The deployment can be done using Docker with SQLite or MySQL databases. The project provides documentation for Midjourney and Suno interfaces, and it is suitable for AI enthusiasts and developers looking to enhance AI capabilities.

README:

中文 | English

New API

🍥新一代大模型网关与AI资产管理系统

📝 项目说明

[!NOTE]
本项目为开源项目，在One API的基础上进行二次开发

[!IMPORTANT]

本项目仅供个人学习使用，不保证稳定性，且不提供任何技术支持。

使用者必须在遵循 OpenAI 的使用条款以及法律法规的情况下使用，不得用于非法用途。

根据《生成式人工智能服务管理暂行办法》的要求，请勿对中国地区公众提供一切未经备案的生成式人工智能服务。

📚 文档

详细文档请访问我们的官方Wiki：https://docs.newapi.pro/

✨ 主要特性

New API提供了丰富的功能，详细特性请参考特性说明：

🎨 全新的UI界面
🌍 多语言支持
💰 支持在线充值功能（易支付）
🔍 支持用key查询使用额度（配合neko-api-key-tool）
🔄 兼容原版One API的数据库
💵 支持模型按次数收费
⚖️ 支持渠道加权随机
📈 数据看板（控制台）
🔒 令牌分组、模型限制
🤖 支持更多授权登陆方式（LinuxDO,Telegram、OIDC）
🔄 支持Rerank模型（Cohere和Jina），接口文档
⚡ 支持OpenAI Realtime API（包括Azure渠道），接口文档
⚡ 支持Claude Messages 格式，接口文档
支持使用路由/chat2link进入聊天界面
🧠 支持通过模型名称后缀设置 reasoning effort：
1. OpenAI o系列模型
  - 添加后缀 -high 设置为 high reasoning effort (例如: o3-mini-high)
  - 添加后缀 -medium 设置为 medium reasoning effort (例如: o3-mini-medium)
  - 添加后缀 -low 设置为 low reasoning effort (例如: o3-mini-low)
2. Claude 思考模型
  - 添加后缀 -thinking 启用思考模式 (例如: claude-3-7-sonnet-20250219-thinking)
🔄 思考转内容功能
🔄 针对用户的模型限流功能
💰 缓存计费支持，开启后可以在缓存命中时按照设定的比例计费：
1. 在 系统设置-运营设置 中设置 提示缓存倍率 选项
2. 在渠道中设置 提示缓存倍率，范围 0-1，例如设置为 0.5 表示缓存命中时按照 50% 计费
3. 支持的渠道：
  - [x] OpenAI
  - [x] Azure
  - [x] DeepSeek
  - [x] Claude

模型支持

此版本支持多种模型，详情请参考接口文档-中继接口：

第三方模型 gpts （gpt-4-gizmo-*）
第三方渠道Midjourney-Proxy(Plus)接口，接口文档
第三方渠道Suno API接口，接口文档
自定义渠道，支持填入完整调用地址
Rerank模型（Cohere和Jina），接口文档
Claude Messages 格式，接口文档
Dify，当前仅支持chatflow

环境变量配置

详细配置说明请参考安装指南-环境变量配置：

GENERATE_DEFAULT_TOKEN：是否为新注册用户生成初始令牌，默认为 false
STREAMING_TIMEOUT：流式回复超时时间，默认60秒
DIFY_DEBUG：Dify渠道是否输出工作流和节点信息，默认 true
FORCE_STREAM_OPTION：是否覆盖客户端stream_options参数，默认 true
GET_MEDIA_TOKEN：是否统计图片token，默认 true
GET_MEDIA_TOKEN_NOT_STREAM：非流情况下是否统计图片token，默认 true
UPDATE_TASK：是否更新异步任务（Midjourney、Suno），默认 true
COHERE_SAFETY_SETTING：Cohere模型安全设置，可选值为 NONE, CONTEXTUAL, STRICT，默认 NONE
GEMINI_VISION_MAX_IMAGE_NUM：Gemini模型最大图片数量，默认 16
MAX_FILE_DOWNLOAD_MB: 最大文件下载大小，单位MB，默认 20
CRYPTO_SECRET：加密密钥，用于加密数据库内容
AZURE_DEFAULT_API_VERSION：Azure渠道默认API版本，默认 2024-12-01-preview
NOTIFICATION_LIMIT_DURATION_MINUTE：通知限制持续时间，默认 10分钟
NOTIFY_LIMIT_COUNT：用户通知在指定持续时间内的最大数量，默认 2

部署

详细部署指南请参考安装指南-部署方式：

[!TIP] 最新版Docker镜像：calciumion/new-api:latest

多机部署注意事项

必须设置环境变量 SESSION_SECRET，否则会导致多机部署时登录状态不一致
如果公用Redis，必须设置 CRYPTO_SECRET，否则会导致多机部署时Redis内容无法获取

部署要求

本地数据库（默认）：SQLite（Docker部署必须挂载/data目录）
远程数据库：MySQL版本 >= 5.7.8，PgSQL版本 >= 9.6

部署方式

使用宝塔面板Docker功能部署

安装宝塔面板（9.2.0版本及以上），在应用商店中找到New-API安装即可。图文教程

使用Docker Compose部署（推荐）

# 下载项目
git clone https://github.com/Calcium-Ion/new-api.git
cd new-api
# 按需编辑docker-compose.yml
# 启动
docker-compose up -d

直接使用Docker镜像

# 使用SQLite
docker run --name new-api -d --restart always -p 3000:3000 -e TZ=Asia/Shanghai -v /home/ubuntu/data/new-api:/data calciumion/new-api:latest

# 使用MySQL
docker run --name new-api -d --restart always -p 3000:3000 -e SQL_DSN="root:123456@tcp(localhost:3306)/oneapi" -e TZ=Asia/Shanghai -v /home/ubuntu/data/new-api:/data calciumion/new-api:latest

渠道重试与缓存

渠道重试功能已经实现，可以在设置->运营设置->通用设置设置重试次数，建议开启缓存功能。

缓存设置方法

REDIS_CONN_STRING：设置Redis作为缓存
MEMORY_CACHE_ENABLED：启用内存缓存（设置了Redis则无需手动设置）

接口文档

详细接口文档请参考接口文档：

帮助支持

如有问题，请参考帮助支持：

🌟 Star History

For Tasks:

Click tags to check more tools for each tasks

customize models retry channels configure caching integrate rerank model support online recharge

For Jobs:

ai developer machine learning engineer data scientist software engineer research scientist

Alternative AI tools for new-api

Similar Open Source Tools

new-api

github

: 6.5k

HiveChat

HiveChat is an AI chat application designed for small and medium teams. It supports various models such as DeepSeek, Open AI, Claude, and Gemini. The tool allows easy configuration by one administrator for the entire team to use different AI models. It supports features like email or Feishu login, LaTeX and Markdown rendering, DeepSeek mind map display, image understanding, AI agents, cloud data storage, and integration with multiple large model service providers. Users can engage in conversations by logging in, while administrators can configure AI service providers, manage users, and control account registration. The technology stack includes Next.js, Tailwindcss, Auth.js, PostgreSQL, Drizzle ORM, and Ant Design.

github

: 864

KuiperLLama

KuiperLLama is a custom large model inference framework that guides users in building a LLama-supported inference framework with Cuda acceleration from scratch. The framework includes modules for architecture design, LLama2 model support, model quantization, Cuda basics, operator implementation, and fun tasks like text generation and storytelling. It also covers learning other commercial inference frameworks for comprehensive understanding. The project provides detailed tutorials and resources for developing and optimizing large models for efficient inference.

github

: 317

langchain4j-aideepin-web

The langchain4j-aideepin-web repository is the frontend project of langchain4j-aideepin, an open-source, offline deployable retrieval enhancement generation (RAG) project based on large language models such as ChatGPT and application frameworks such as Langchain4j. It includes features like registration & login, multi-sessions (multi-roles), image generation (text-to-image, image editing, image-to-image), suggestions, quota control, knowledge base (RAG) based on large models, model switching, and search engine switching.

github

: 59

AMchat

AMchat is a large language model that integrates advanced math concepts, exercises, and solutions. The model is based on the InternLM2-Math-7B model and is specifically designed to answer advanced math problems. It provides a comprehensive dataset that combines Math and advanced math exercises and solutions. Users can download the model from ModelScope or OpenXLab, deploy it locally or using Docker, and even retrain it using XTuner for fine-tuning. The tool also supports LMDeploy for quantization, OpenCompass for evaluation, and various other features for model deployment and evaluation. The project contributors have provided detailed documentation and guides for users to utilize the tool effectively.

github

: 153

llm-jp-eval

LLM-jp-eval is a tool designed to automatically evaluate Japanese large language models across multiple datasets. It provides functionalities such as converting existing Japanese evaluation data to text generation task evaluation datasets, executing evaluations of large language models across multiple datasets, and generating instruction data (jaster) in the format of evaluation data prompts. Users can manage the evaluation settings through a config file and use Hydra to load them. The tool supports saving evaluation results and logs using wandb. Users can add new evaluation datasets by following specific steps and guidelines provided in the tool's documentation. It is important to note that using jaster for instruction tuning can lead to artificially high evaluation scores, so caution is advised when interpreting the results.

github

: 125

wechat-bot

WeChat Bot is a simple and easy-to-use WeChat robot based on chatgpt and wechaty. It can help you automatically reply to WeChat messages or manage WeChat groups/friends. The tool requires configuration of AI services such as Xunfei, Kimi, or ChatGPT. Users can customize the tool to automatically reply to group or private chat messages based on predefined conditions. The tool supports running in Docker for easy deployment and provides a convenient way to interact with various AI services for WeChat automation.

github

: 5.9k

rag-chatbot

rag-chatbot is a tool that allows users to chat with multiple PDFs using Ollama and LlamaIndex. It provides an easy setup for running on local machines or Kaggle notebooks. Users can leverage models from Huggingface and Ollama, process multiple PDF inputs, and chat in multiple languages. The tool offers a simple UI with Gradio, supporting chat with history and QA modes. Setup instructions are provided for both Kaggle and local environments, including installation steps for Docker, Ollama, Ngrok, and the rag_chatbot package. Users can run the tool locally and access it via a web interface. Future enhancements include adding evaluation, better embedding models, knowledge graph support, improved document processing, MLX model integration, and Corrective RAG.

github

: 245

wealth-tracker

Wealth Tracker is a personal finance management tool designed to help users track their income, expenses, and investments in one place. With intuitive features and customizable categories, users can easily monitor their financial health and make informed decisions. The tool provides detailed reports and visualizations to analyze spending patterns and set financial goals. Whether you are budgeting, saving for a big purchase, or planning for retirement, Wealth Tracker offers a comprehensive solution to manage your money effectively.

github

: 376

one-api

One API 是一个开源项目，它通过标准的 OpenAI API 格式访问所有的大模型，开箱即用。它支持多种大模型，包括 OpenAI ChatGPT 系列模型、Anthropic Claude 系列模型、Google PaLM2/Gemini 系列模型、Mistral 系列模型、百度文心一言系列模型、阿里通义千问系列模型、讯飞星火认知大模型、智谱 ChatGLM 系列模型、360 智脑、腾讯混元大模型、Moonshot AI、百川大模型、MINIMAX、Groq、Ollama、零一万物、阶跃星辰。One API 还支持配置镜像以及众多第三方代理服务，支持通过负载均衡的方式访问多个渠道，支持 stream 模式，支持多机部署，支持令牌管理，支持兑换码管理，支持渠道管理，支持用户分组以及渠道分组，支持渠道设置模型列表，支持查看额度明细，支持用户邀请奖励，支持以美元为单位显示额度，支持发布公告，设置充值链接，设置新用户初始额度，支持模型映射，支持失败自动重试，支持绘图接口，支持 Cloudflare AI Gateway，支持丰富的自定义设置，支持通过系统访问令牌调用管理 API，进而**在无需二开的情况下扩展和自定义** One API 的功能，支持 Cloudflare Turnstile 用户校验，支持用户管理，支持多种用户登录注册方式，支持主题切换，配合 Message Pusher 可将报警信息推送到多种 App 上。

github

: 22.7k

airdrop-tools

Airdrop-tools is a repository containing tools for all Telegram bots. Users can join the Telegram group for support and access various bot apps like Moonbix, Blum, Major, Memefi, and more. The setup requires Node.js and Python, with instructions on creating data directories and installing extensions. Users can run different tools like Blum, Major, Moonbix, Yescoin, Matchain, Fintopio, Agent301, IAMDOG, Banana, Cats, Wonton, and Xkucoin by following specific commands. The repository also provides contact information and options for supporting the creator.

github

: 78

ai-terminal

github

: 66

chatgpt-webui

ChatGPT WebUI is a user-friendly web graphical interface for various LLMs like ChatGPT, providing simplified features such as core ChatGPT conversation and document retrieval dialogues. It has been optimized for better RAG retrieval accuracy and supports various search engines. Users can deploy local language models easily and interact with different LLMs like GPT-4, Azure OpenAI, and more. The tool offers powerful functionalities like GPT4 API configuration, system prompt setup for role-playing, and basic conversation features. It also provides a history of conversations, customization options, and a seamless user experience with themes, dark mode, and PWA installation support.

github

: 79

ChatPilot

ChatPilot is a chat agent tool that enables AgentChat conversations, supports Google search, URL conversation (RAG), and code interpreter functionality, replicates Kimi Chat (file, drag and drop; URL, send out), and supports OpenAI/Azure API. It is based on LangChain and implements ReAct and OpenAI Function Call for agent Q&A dialogue. The tool supports various automatic tools such as online search using Google Search API, URL parsing tool, Python code interpreter, and enhanced RAG file Q&A with query rewriting support. It also allows front-end and back-end service separation using Svelte and FastAPI, respectively. Additionally, it supports voice input/output, image generation, user management, permission control, and chat record import/export.

github

: 523

CareGPT

CareGPT is a medical large language model (LLM) that explores medical data, training, and deployment related research work. It integrates resources, open-source models, rich data, and efficient deployment methods. It supports various medical tasks, including patient diagnosis, medical dialogue, and medical knowledge integration. The model has been fine-tuned on diverse medical datasets to enhance its performance in the healthcare domain.

github

: 508

xlings

Xlings is a developer tool for programming learning, development, and course building. It provides features such as software installation, one-click environment setup, project dependency management, and cross-platform language package management. Additionally, it offers real-time compilation and running, AI code suggestions, tutorial project creation, automatic code checking for practice, and demo examples collection.

github

: 390

For similar tasks

new-api

github

: 6.5k

cria

Cria is a Python library designed for running Large Language Models with minimal configuration. It provides an easy and concise way to interact with LLMs, offering advanced features such as custom models, streams, message history management, and running multiple models in parallel. Cria simplifies the process of using LLMs by providing a straightforward API that requires only a few lines of code to get started. It also handles model installation automatically, making it efficient and user-friendly for various natural language processing tasks.

github

: 105

ChuanhuChatGPT

Chuanhu Chat is a user-friendly web graphical interface that provides various additional features for ChatGPT and other language models. It supports GPT-4, file-based question answering, local deployment of language models, online search, agent assistant, and fine-tuning. The tool offers a range of functionalities including auto-solving questions, online searching with network support, knowledge base for quick reading, local deployment of language models, GPT 3.5 fine-tuning, and custom model integration. It also features system prompts for effective role-playing, basic conversation capabilities with options to regenerate or delete dialogues, conversation history management with auto-saving and search functionalities, and a visually appealing user experience with themes, dark mode, LaTeX rendering, and PWA application support.

github

: 15.2k

herc.ai

Herc.ai is a powerful library for interacting with the Herc.ai API. It offers free access to users and supports all languages. Users can benefit from Herc.ai's features unlimitedly with a one-time subscription and API key. The tool provides functionalities for question answering and text-to-image generation, with support for various models and customization options. Herc.ai can be easily integrated into CLI, CommonJS, TypeScript, and supports beta models for advanced usage. Developed by FiveSoBes and Luppux Development.

github

: 62

LightRAG

LightRAG is a PyTorch library designed for building and optimizing Retriever-Agent-Generator (RAG) pipelines. It follows principles of simplicity, quality, and optimization, offering developers maximum customizability with minimal abstraction. The library includes components for model interaction, output parsing, and structured data generation. LightRAG facilitates tasks like providing explanations and examples for concepts through a question-answering pipeline.

github

: 562

llm-on-ray

LLM-on-Ray is a comprehensive solution for building, customizing, and deploying Large Language Models (LLMs). It simplifies complex processes into manageable steps by leveraging the power of Ray for distributed computing. The tool supports pretraining, finetuning, and serving LLMs across various hardware setups, incorporating industry and Intel optimizations for performance. It offers modular workflows with intuitive configurations, robust fault tolerance, and scalability. Additionally, it provides an Interactive Web UI for enhanced usability, including a chatbot application for testing and refining models.

github

: 87

BentoVLLM

BentoVLLM is an example project demonstrating how to serve and deploy open-source Large Language Models using vLLM, a high-throughput and memory-efficient inference engine. It provides a basis for advanced code customization, such as custom models, inference logic, or vLLM options. The project allows for simple LLM hosting with OpenAI compatible endpoints without the need to write any code. Users can interact with the server using Swagger UI or other methods, and the service can be deployed to BentoCloud for better management and scalability. Additionally, the repository includes integration examples for different LLM models and tools.

github

: 97

abliteration

Abliteration is a tool that allows users to create abliterated models using transformers quickly and easily. It is not a tool for uncensorship, but rather for making models that will not explicitly refuse users. Users can clone the repository, install dependencies, and make abliterations using the provided commands. The tool supports adjusting parameters for stubborn models and offers various options for customization. Abliteration can be used for creating modified models for specific tasks or topics.

github

: 64

For similar jobs

promptflow

**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

github

: 9.2k

deepeval

DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

github

: 5.8k

MegaDetector

MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".

github

: 106

leapfrogai

LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

github

: 255

llava-docker

This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

github

: 59

carrot

The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

github

: 17.1k

TrustLLM

TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.

github

: 535

AI-YinMei

AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.

github

: 529

new-api

README:

New API

📝 项目说明

📚 文档

✨ 主要特性

模型支持

环境变量配置

部署

多机部署注意事项

部署要求

部署方式

使用宝塔面板Docker功能部署

使用Docker Compose部署（推荐）

直接使用Docker镜像

渠道重试与缓存

缓存设置方法

接口文档

相关项目

帮助支持

🌟 Star History

For Tasks:

For Jobs:

Alternative AI tools for new-api

Similar Open Source Tools

new-api

HiveChat

KuiperLLama

langchain4j-aideepin-web

AMchat

llm-jp-eval

wechat-bot

rag-chatbot

wealth-tracker

one-api

airdrop-tools

ai-terminal

chatgpt-webui

ChatPilot

CareGPT

xlings

For similar tasks

new-api

cria

ChuanhuChatGPT

herc.ai

LightRAG

llm-on-ray

BentoVLLM

abliteration

For similar jobs

promptflow

deepeval

MegaDetector

leapfrogai

llava-docker

carrot

TrustLLM

AI-YinMei