new-api
A unified AI model hub for aggregation & distribution. It supports cross-converting various LLMs into OpenAI-compatible, Claude-compatible, or Gemini-compatible formats. A centralized gateway for personal and enterprise model management. ๐ฅ
Stars: 17482
New API is a next-generation large model gateway and AI asset management system that provides a wide range of features, including a new UI interface, multi-language support, online recharge function, key query for usage quota, compatibility with the original One API database, model charging by usage count, channel weighted randomization, data dashboard, token grouping and model restrictions, support for various authorization login methods, support for Rerank models, OpenAI Realtime API, Claude Messages format, reasoning effort setting, content reasoning, user-specific model rate limiting, request format conversion, cache billing support, and various model support such as gpts, Midjourney-Proxy, Suno API, custom channels, Rerank models, Claude Messages format, Dify, and more.
README:
๐ฅ Next-Generation LLM Gateway and AI Asset Management System
็ฎไฝไธญๆ | ็น้ซไธญๆ | English | Franรงais | ๆฅๆฌ่ช
Quick Start โข Key Features โข Deployment โข Documentation โข Help
[!IMPORTANT]
- This project is for personal learning purposes only, with no guarantee of stability or technical support
- Users must comply with OpenAI's Terms of Use and applicable laws and regulations, and must not use it for illegal purposes
- According to the ใInterim Measures for the Management of Generative Artificial Intelligence Servicesใ, please do not provide any unregistered generative AI services to the public in China.
No particular order
Thanks to JetBrains for providing free open-source development license for this project
# Clone the project
git clone https://github.com/QuantumNous/new-api.git
cd new-api
# Edit docker-compose.yml configuration
nano docker-compose.yml
# Start the service
docker-compose up -dUsing Docker Commands
# Pull the latest image
docker pull calciumion/new-api:latest
# Using SQLite (default)
docker run --name new-api -d --restart always \
-p 3000:3000 \
-e TZ=Asia/Shanghai \
-v ./data:/data \
calciumion/new-api:latest
# Using MySQL
docker run --name new-api -d --restart always \
-p 3000:3000 \
-e SQL_DSN="root:123456@tcp(localhost:3306)/oneapi" \
-e TZ=Asia/Shanghai \
-v ./data:/data \
calciumion/new-api:latest๐ก Tip:
-v ./data:/datawill save data in thedatafolder of the current directory, you can also change it to an absolute path like-v /your/custom/path:/data
๐ After deployment is complete, visit http://localhost:3000 to start using!
๐ For more deployment methods, please refer to Deployment Guide
๐ Official Documentation |
Quick Navigation:
| Category | Link |
|---|---|
| ๐ Deployment Guide | Installation Documentation |
| โ๏ธ Environment Configuration | Environment Variables |
| ๐ก API Documentation | API Documentation |
| โ FAQ | FAQ |
| ๐ฌ Community Interaction | Communication Channels |
For detailed features, please refer to Features Introduction
| Feature | Description |
|---|---|
| ๐จ New UI | Modern user interface design |
| ๐ Multi-language | Supports Simplified Chinese, Traditional Chinese, English, French, Japanese |
| ๐ Data Compatibility | Fully compatible with the original One API database |
| ๐ Data Dashboard | Visual console and statistical analysis |
| ๐ Permission Management | Token grouping, model restrictions, user management |
- โ Online recharge (EPay, Stripe)
- โ Pay-per-use model pricing
- โ Cache billing support (OpenAI, Azure, DeepSeek, Claude, Qwen and all supported models)
- โ Flexible billing policy configuration
- ๐ Discord authorization login
- ๐ค LinuxDO authorization login
- ๐ฑ Telegram authorization login
- ๐ OIDC unified authentication
- ๐ Key quota query usage (with neko-api-key-tool)
API Format Support:
- โก OpenAI Responses
- โก OpenAI Realtime API (including Azure)
- โก Claude Messages
- โก Google Gemini
- ๐ Rerank Models (Cohere, Jina)
Intelligent Routing:
- โ๏ธ Channel weighted random
- ๐ Automatic retry on failure
- ๐ฆ User-level model rate limiting
Format Conversion:
- ๐ OpenAI Compatible โ Claude Messages
- ๐ OpenAI Compatible โ Google Gemini
- ๐ Google Gemini โ OpenAI Compatible - Text only, function calling not supported yet
- ๐ง OpenAI Compatible โ OpenAI Responses - In development
- ๐ Thinking-to-content functionality
Reasoning Effort Support:
View detailed configuration
OpenAI series models:
-
o3-mini-high- High reasoning effort -
o3-mini-medium- Medium reasoning effort -
o3-mini-low- Low reasoning effort -
gpt-5-high- High reasoning effort -
gpt-5-medium- Medium reasoning effort -
gpt-5-low- Low reasoning effort
Claude thinking models:
-
claude-3-7-sonnet-20250219-thinking- Enable thinking mode
Google Gemini series models:
-
gemini-2.5-flash-thinking- Enable thinking mode -
gemini-2.5-flash-nothinking- Disable thinking mode -
gemini-2.5-pro-thinking- Enable thinking mode -
gemini-2.5-pro-thinking-128- Enable thinking mode with thinking budget of 128 tokens - You can also append
-low,-medium, or-highto any Gemini model name to request the corresponding reasoning effort (no extra thinking-budget suffix needed).
For details, please refer to API Documentation - Relay Interface
| Model Type | Description | Documentation |
|---|---|---|
| ๐ค OpenAI-Compatible | OpenAI compatible models | Documentation |
| ๐ค OpenAI Responses | OpenAI Responses format | Documentation |
| ๐จ Midjourney-Proxy | Midjourney-Proxy(Plus) | Documentation |
| ๐ต Suno-API | Suno API | Documentation |
| ๐ Rerank | Cohere, Jina | Documentation |
| ๐ฌ Claude | Messages format | Documentation |
| ๐ Gemini | Google Gemini format | Documentation |
| ๐ง Dify | ChatFlow mode | - |
| ๐ฏ Custom | Supports complete call address | - |
View complete interface list
[!TIP] Latest Docker image:
calciumion/new-api:latest
| Component | Requirement |
|---|---|
| Local database | SQLite (Docker must mount /data directory) |
| Remote database | MySQL โฅ 5.7.8 or PostgreSQL โฅ 9.6 |
| Container engine | Docker / Docker Compose |
Common environment variable configuration
| Variable Name | Description | Default Value |
|---|---|---|
SESSION_SECRET |
Session secret (required for multi-machine deployment) | - |
CRYPTO_SECRET |
Encryption secret (required for Redis) | - |
SQL_DSN |
Database connection string | - |
REDIS_CONN_STRING |
Redis connection string | - |
STREAMING_TIMEOUT |
Streaming timeout (seconds) | 300 |
STREAM_SCANNER_MAX_BUFFER_MB |
Max per-line buffer (MB) for the stream scanner; increase when upstream sends huge image/base64 payloads | 64 |
MAX_REQUEST_BODY_MB |
Max request body size (MB, counted after decompression; prevents huge requests/zip bombs from exhausting memory). Exceeding it returns 413
|
32 |
AZURE_DEFAULT_API_VERSION |
Azure API version | 2025-04-01-preview |
ERROR_LOG_ENABLED |
Error log switch | false |
PYROSCOPE_URL |
Pyroscope server address | - |
PYROSCOPE_APP_NAME |
Pyroscope application name | new-api |
PYROSCOPE_BASIC_AUTH_USER |
Pyroscope basic auth user | - |
PYROSCOPE_BASIC_AUTH_PASSWORD |
Pyroscope basic auth password | - |
PYROSCOPE_MUTEX_RATE |
Pyroscope mutex sampling rate | 5 |
PYROSCOPE_BLOCK_RATE |
Pyroscope block sampling rate | 5 |
HOSTNAME |
Hostname tag for Pyroscope | new-api |
๐ Complete configuration: Environment Variables Documentation
Method 1: Docker Compose (Recommended)
# Clone the project
git clone https://github.com/QuantumNous/new-api.git
cd new-api
# Edit configuration
nano docker-compose.yml
# Start service
docker-compose up -dMethod 2: Docker Commands
Using SQLite:
docker run --name new-api -d --restart always \
-p 3000:3000 \
-e TZ=Asia/Shanghai \
-v ./data:/data \
calciumion/new-api:latestUsing MySQL:
docker run --name new-api -d --restart always \
-p 3000:3000 \
-e SQL_DSN="root:123456@tcp(localhost:3306)/oneapi" \
-e TZ=Asia/Shanghai \
-v ./data:/data \
calciumion/new-api:latest๐ก Path explanation:
./data:/data- Relative path, data saved in the data folder of the current directory- You can also use absolute path, e.g.:
/your/custom/path:/data
Method 3: BaoTa Panel
- Install BaoTa Panel (โฅ 9.2.0 version)
- Search for New-API in the application store
- One-click installation
๐ Tutorial with images
[!WARNING]
- Must set
SESSION_SECRET- Otherwise login status inconsistent- Shared Redis must set
CRYPTO_SECRET- Otherwise data cannot be decrypted
Retry configuration: Settings โ Operation Settings โ General Settings โ Failure Retry Count
Cache configuration:
-
REDIS_CONN_STRING: Redis cache (recommended) -
MEMORY_CACHE_ENABLED: Memory cache
| Project | Description |
|---|---|
| One API | Original project base |
| Midjourney-Proxy | Midjourney interface support |
| Project | Description |
|---|---|
| neko-api-key-tool | Key quota query tool |
| new-api-horizon | New API high-performance optimized version |
| Resource | Link |
|---|---|
| ๐ FAQ | FAQ |
| ๐ฌ Community Interaction | Communication Channels |
| ๐ Issue Feedback | Issue Feedback |
| ๐ Complete Documentation | Official Documentation |
Welcome all forms of contribution!
- ๐ Report Bugs
- ๐ก Propose New Features
- ๐ Improve Documentation
- ๐ง Submit Code
This project is licensed under the GNU Affero General Public License v3.0 (AGPLv3).
This is an open-source project developed based on One API (MIT License).
If your organization's policies do not permit the use of AGPLv3-licensed software, or if you wish to avoid the open-source obligations of AGPLv3, please contact us at: [email protected]
If this project is helpful to you, welcome to give us a โญ๏ธ Star๏ผ
Official Documentation โข Issue Feedback โข Latest Release
Built with โค๏ธ by QuantumNous
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for new-api
Similar Open Source Tools
new-api
New API is a next-generation large model gateway and AI asset management system that provides a wide range of features, including a new UI interface, multi-language support, online recharge function, key query for usage quota, compatibility with the original One API database, model charging by usage count, channel weighted randomization, data dashboard, token grouping and model restrictions, support for various authorization login methods, support for Rerank models, OpenAI Realtime API, Claude Messages format, reasoning effort setting, content reasoning, user-specific model rate limiting, request format conversion, cache billing support, and various model support such as gpts, Midjourney-Proxy, Suno API, custom channels, Rerank models, Claude Messages format, Dify, and more.
ai-dev-kit
The AI Dev Kit is a comprehensive toolkit designed to enhance AI-driven development on Databricks. It provides trusted sources for AI coding assistants like Claude Code and Cursor to build faster and smarter on Databricks. The kit includes features such as Spark Declarative Pipelines, Databricks Jobs, AI/BI Dashboards, Unity Catalog, Genie Spaces, Knowledge Assistants, MLflow Experiments, Model Serving, Databricks Apps, and more. Users can choose from different adventures like installing the kit, using the visual builder app, teaching AI assistants Databricks patterns, executing Databricks actions, or building custom integrations with the core library. The kit also includes components like databricks-tools-core, databricks-mcp-server, databricks-skills, databricks-builder-app, and ai-dev-project.
sf-skills
sf-skills is a collection of reusable skills for Agentic Salesforce Development, enabling AI-powered code generation, validation, testing, debugging, and deployment. It includes skills for development, quality, foundation, integration, AI & automation, DevOps & tooling. The installation process is newbie-friendly and includes an installer script for various CLIs. The skills are compatible with platforms like Claude Code, OpenCode, Codex, Gemini, Amp, Droid, Cursor, and Agentforce Vibes. The repository is community-driven and aims to strengthen the Salesforce ecosystem.
axonhub
AxonHub is an all-in-one AI development platform that serves as an AI gateway allowing users to switch between model providers without changing any code. It provides features like vendor lock-in prevention, integration simplification, observability enhancement, and cost control. Users can access any model using any SDK with zero code changes. The platform offers full request tracing, enterprise RBAC, smart load balancing, and real-time cost tracking. AxonHub supports multiple databases, provides a unified API gateway, and offers flexible model management and API key creation for authentication. It also integrates with various AI coding tools and SDKs for seamless usage.
everything-claude-code
The 'Everything Claude Code' repository is a comprehensive collection of production-ready agents, skills, hooks, commands, rules, and MCP configurations developed over 10+ months. It includes guides for setup, foundations, and philosophy, as well as detailed explanations of various topics such as token optimization, memory persistence, continuous learning, verification loops, parallelization, and subagent orchestration. The repository also provides updates on bug fixes, multi-language rules, installation wizard, PM2 support, OpenCode plugin integration, unified commands and skills, and cross-platform support. It offers a quick start guide for installation, ecosystem tools like Skill Creator and Continuous Learning v2, requirements for CLI version compatibility, key concepts like agents, skills, hooks, and rules, running tests, contributing guidelines, OpenCode support, background information, important notes on context window management and customization, star history chart, and relevant links.
agentscope
AgentScope is a multi-agent platform designed to empower developers to build multi-agent applications with large-scale models. It features three high-level capabilities: Easy-to-Use, High Robustness, and Actor-Based Distribution. AgentScope provides a list of `ModelWrapper` to support both local model services and third-party model APIs, including OpenAI API, DashScope API, Gemini API, and ollama. It also enables developers to rapidly deploy local model services using libraries such as ollama (CPU inference), Flask + Transformers, Flask + ModelScope, FastChat, and vllm. AgentScope supports various services, including Web Search, Data Query, Retrieval, Code Execution, File Operation, and Text Processing. Example applications include Conversation, Game, and Distribution. AgentScope is released under Apache License 2.0 and welcomes contributions.
GraphGen
GraphGen is a framework for synthetic data generation guided by knowledge graphs. It enhances supervised fine-tuning for large language models (LLMs) by generating synthetic data based on a fine-grained knowledge graph. The tool identifies knowledge gaps in LLMs, prioritizes generating QA pairs targeting high-value knowledge, incorporates multi-hop neighborhood sampling, and employs style-controlled generation to diversify QA data. Users can use LLaMA-Factory and xtuner for fine-tuning LLMs after data generation.
PraisonAI
Praison AI is a low-code, centralised framework that simplifies the creation and orchestration of multi-agent systems for various LLM applications. It emphasizes ease of use, customization, and human-agent interaction. The tool leverages AutoGen and CrewAI frameworks to facilitate the development of AI-generated scripts and movie concepts. Users can easily create, run, test, and deploy agents for scriptwriting and movie concept development. Praison AI also provides options for full automatic mode and integration with OpenAI models for enhanced AI capabilities.
NornicDB
NornicDB is a high-performance graph database designed for AI agents and knowledge systems. It is Neo4j-compatible, GPU-accelerated, and features memory that evolves. The database automatically discovers and manages relationships in the data, allowing meaning to emerge from the knowledge graph. NornicDB is suitable for AI agent memory, knowledge graphs, RAG systems, session context, and research tools. It offers features like intelligent memory, auto-relationships, performance benchmarks, vector search, Heimdall AI assistant, APOC functions, and various Docker images for different platforms. The tool is built with Neo4j Bolt protocol, Cypher query engine, memory decay system, GPU acceleration, vector search, auto-relationship engine, and more.
rag-web-ui
RAG Web UI is an intelligent dialogue system based on RAG (Retrieval-Augmented Generation) technology. It helps enterprises and individuals build intelligent Q&A systems based on their own knowledge bases. By combining document retrieval and large language models, it delivers accurate and reliable knowledge-based question-answering services. The system is designed with features like intelligent document management, advanced dialogue engine, and a robust architecture. It supports multiple document formats, async document processing, multi-turn contextual dialogue, and reference citations in conversations. The architecture includes a backend stack with Python FastAPI, MySQL + ChromaDB, MinIO, Langchain, JWT + OAuth2 for authentication, and a frontend stack with Next.js, TypeScript, Tailwind CSS, Shadcn/UI, and Vercel AI SDK for AI integration. Performance optimization includes incremental document processing, streaming responses, vector database performance tuning, and distributed task processing. The project is licensed under the Apache-2.0 License and is intended for learning and sharing RAG knowledge only, not for commercial purposes.
monoscope
Monoscope is an open-source monitoring and observability platform that uses artificial intelligence to understand and monitor systems automatically. It allows users to ingest and explore logs, traces, and metrics in S3 buckets, query in natural language via LLMs, and create AI agents to detect anomalies. Key capabilities include universal data ingestion, AI-powered understanding, natural language interface, cost-effective storage, and zero configuration. Monoscope is designed to reduce alert fatigue, catch issues before they impact users, and provide visibility across complex systems.
shodh-memory
Shodh-Memory is a cognitive memory system designed for AI agents to persist memory across sessions, learn from experience, and run entirely offline. It features Hebbian learning, activation decay, and semantic consolidation, packed into a single ~17MB binary. Users can deploy it on cloud, edge devices, or air-gapped systems to enhance the memory capabilities of AI agents.
ReGraph
ReGraph is a decentralized AI compute marketplace that connects hardware providers with developers who need inference and training resources. It democratizes access to AI computing power by creating a global network of distributed compute nodes. It is cost-effective, decentralized, easy to integrate, supports multiple models, and offers pay-as-you-go pricing.
ASTRA.ai
Astra.ai is a multimodal agent powered by TEN, showcasing its capabilities in speech, vision, and reasoning through RAG from local documentation. It provides a platform for developing AI agents with features like RTC transportation, extension store, workflow builder, and local deployment. Users can build and test agents locally using Docker and Node.js, with prerequisites including Agora App ID, Azure's speech-to-text and text-to-speech API keys, and OpenAI API key. The platform offers advanced customization options through config files and API keys setup, enabling users to create and deploy their AI agents for various tasks.
gpt-load
GPT-Load is a high-performance, enterprise-grade AI API transparent proxy service designed for enterprises and developers needing to integrate multiple AI services. Built with Go, it features intelligent key management, load balancing, and comprehensive monitoring capabilities for high-concurrency production environments. The tool serves as a transparent proxy service, preserving native API formats of various AI service providers like OpenAI, Google Gemini, and Anthropic Claude. It supports dynamic configuration, distributed leader-follower deployment, and a Vue 3-based web management interface. GPT-Load is production-ready with features like dual authentication, graceful shutdown, and error recovery.
paiml-mcp-agent-toolkit
PAIML MCP Agent Toolkit (PMAT) is a zero-configuration AI context generation system with extreme quality enforcement and Toyota Way standards. It allows users to analyze any codebase instantly through CLI, MCP, or HTTP interfaces. The toolkit provides features such as technical debt analysis, advanced monitoring, metrics aggregation, performance profiling, bottleneck detection, alert system, multi-format export, storage flexibility, and more. It also offers AI-powered intelligence for smart recommendations, polyglot analysis, repository showcase, and integration points. PMAT enforces quality standards like complexity โค20, zero SATD comments, test coverage >80%, no lint warnings, and synchronized documentation with commits. The toolkit follows Toyota Way development principles for iterative improvement, direct AST traversal, automated quality gates, and zero SATD policy.
For similar tasks
new-api
New API is a next-generation large model gateway and AI asset management system that provides a wide range of features, including a new UI interface, multi-language support, online recharge function, key query for usage quota, compatibility with the original One API database, model charging by usage count, channel weighted randomization, data dashboard, token grouping and model restrictions, support for various authorization login methods, support for Rerank models, OpenAI Realtime API, Claude Messages format, reasoning effort setting, content reasoning, user-specific model rate limiting, request format conversion, cache billing support, and various model support such as gpts, Midjourney-Proxy, Suno API, custom channels, Rerank models, Claude Messages format, Dify, and more.
unitycatalog
Unity Catalog is an open and interoperable catalog for data and AI, supporting multi-format tables, unstructured data, and AI assets. It offers plugin support for extensibility and interoperates with Delta Sharing protocol. The catalog is fully open with OpenAPI spec and OSS implementation, providing unified governance for data and AI with asset-level access control enforced through REST APIs.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.
