ai-hub

AI Hub 是一个为了接入包括ChatGPT、Baichuan、Zhipu、混元、MiniMax、Moonshot等多种大型语言模型而设计的服务。它旨在积累和管理各种有效的模型调用提示（prompt），并对这些大型语言模型进行持续的测试和评估。

Stars: 54

Visit

AI Hub Project aims to continuously test and evaluate mainstream large language models, while accumulating and managing various effective model invocation prompts. It has integrated all mainstream large language models in China, including OpenAI GPT-4 Turbo, Baidu ERNIE-Bot-4, Tencent ChatPro, MiniMax abab5.5-chat, and more. The project plans to continuously track, integrate, and evaluate new models. Users can access the models through REST services or Java code integration. The project also provides a testing suite for translation, coding, and benchmark testing.

README:

AI Hub Project

简介

AI Hub旨在持续测试和评估主流大型语言模型，同时积累和管理各种有效的模型调用提示（prompt）。目前，AI Hub已接入国内所有主流的大型语言模型，包括文心一言、腾讯混元、智谱AI、MiniMax、百川智能等，并计划持续追踪、接入和评估新模型。

已支持模型列表：

OpenAI / gpt-4-turbo
OpenAI / gpt-3.5-turbo
Baidu / ERNIE-Bot-4（文心一言4）
Baidu / ERNIE-Bot-turbo（文心一言）
Zhipu / glm-4（智谱GLM-4）
Zhipu / chatGLM_turbo（智谱chatGLM）
Ali / qwen-plus（通义千问plus）
Ali / qwen-turbo（通义千问）
Tencent / ChatPro（腾讯混元）
Tencent / ChatStd（腾讯混元）
Tencent / hunyuan-lite（腾讯混元)
Baichuan / Baichuan2-Turbo（百川）
Minimax / abab5.5-chat（MiniMax）
Minimax / abab6-chat（MiniMax）
Xunfei / Spark3.1（讯飞星火）
Moonshot / moonshot-v1-8k (月之暗面)
Xunfei / Spark3.5 (讯飞星火3.5)
ByteDance / Skylark-chat (字节豆包)
Lingyi / yi-34b-chat-0205 (零一万物)
Lingyi / yi-34b-chat-200k (零一万物)
Lingyi / yi-vl-plus (零一万物)
Deepseek / DeepSeek-V2 (Deepseek)
Baidu / ERNIE-Lite-8K（文心一言）
Baidu / ERNIE-Speed-8K（文心一言）
Xunfei / Spark-Lite（讯飞星火）

在大模型列表部分，有更完整的大语言模型列表。请注意，其中的一些大语言模型尚未经过评估，我将陆续对这些模型进行评估。

使用前请在 Settings 页面设置模型的 credentials：

评估结果

大模型接入

如果你想自己接入列表中的大模型，可以通过以下方式。

Rest 服务

启动 ai-hub-server，访问

http://127.0.0.1:3000/api/v1/models/${provider}/${model}:chat

Post:

{
    "input": "${input}"
}

Java 代码接入

可以参考这里

@Service
public class AIModelInvokerFactory {

    private final ApplicationContext context;

    @Autowired
    public AIModelInvokerFactory(ApplicationContext context) {
        this.context = context;
    }

    public AIModelInvoker getProviderAdapter(String providerName) {
        AIProvider provider = AIProvider.fromName(providerName);

        switch (provider) {
            case OPENAI:
                return context.getBean(OpenAIInvoker.class);
            case BAICHUAN:
                return context.getBean(BaichuanInvoker.class);
            case ALI:
                return context.getBean(AliInvoker.class);
            case BAIDU:
                return context.getBean(BaiduInvoker.class);
            case ZHIPU:
                return context.getBean(ZhipuInvoker.class);
            case TENCENT:
                return context.getBean(TencentInvoker.class);
            case XUNFEI:
                return context.getBean(XunfeiInvoker.class);
            case MINIMAX:
                return context.getBean(MiniMaxInvoker.class);
            default:
                throw new IllegalArgumentException("Unknown provider: " + provider);
        }
    }

}

运行

Docker

推荐使用 docker-compose 启动服务

cd docker
docker-compose up -d

数据库

参考脚本

前端

cd ai-hub-fe
npm run start

服务端

需要 JDK 11 以上版本

cd ai-hub-server
mvn clean package
java -jar ai-hub-server-1.0.0-SNAPSHOT-exec.jar

测试集

翻译

编程

z-bench 测试集

大模型列表

低成本模型

Company	Model	Price(1M tokens)	Context Length
Baidu	ERNIE Speed	免费	8k
Baidu	ERNIE Lite	免费	8k
Tencent	hunyuan-lite	免费	256k
ByteDance	Doubao-lite	Input: 0.3 \| Output: 0.6	32k
Zhipu	GLM-3-Turbo	1	128k
Lingyi	yi-spark	1	16k
Ali	qwen-long	Input: 0.5 \| Output: 2	10m
ByteDance	Doubao-pro	Input: 0.8 \| Output: 2	32k
DeepSeek	deepseek-chat	Input: 1 \| Output: 2	32k
Lingyi	yi-medium	2.5	16k

中低成本模型

Company	Model	Price(1M tokens)	Context Length
Ali	qwen-turbo	Input: 2 \| Output: 6	8k
Tencent	hunyuan-standard	Input: 4.5 \| Output: 5	32k
MiniMax	abab5.5s	5	8k
OpenAI	GPT-3.5 Turbo	Input: $0.50 \| Output: $1.50	16k
ByteDance	Doubao-pro-128k	Input: 5 \| Output: 9	128k
Baichuan	Baichuan2-Turbo	8	32k
MiniMax	abab6.5s	10	245k
Ali	qwen-plus	Input: 4 \| Output: 12	32k
Baidu	ERNIE 3.0	12	8k
Baichuan	Baichuan3-Turbo	12	32k
Lingyi	yi-large-turbo	12	16k
Lingyi	yi-medium-200k	12	200k
Moonshot	moonshot-v1-8k	12	8k

中高成本模型

Company	Model	Price(1M tokens)	Context Length
Moonshot	moonshot-v1-32k	24	32k
Baichuan	Baichuan3-Turbo-128k	24	128k
MiniMax	abab6.5	30	8k
Tencent	hunyuan-standard-256k	Input: 15 \| Output: 60	256k
Moonshot	moonshot-v1-128k	60	128k

高成本模型

Company	Model	Price(1M tokens)	Context Length
OpenAI	GPT-4o	Input: $5 \| Output: $15	128k
Baidu	ERNIE-3.5-128k	Input: 48 \| Output: 96	128k
Tencent	hunyuan-pro	Input: 30 \| Output: 100	32k
Ali	qwen-max	Input: 40 \| Output: 120	8k
Zhipu	GLM-4	100	128k
Baichuan	Baichuan4	100	32k
Baidu	ERNIE 4.0	120	8k

For Tasks:

Click tags to check more tools for each tasks

evaluate models integrate models manage prompts test translations benchmark performance

For Jobs:

data scientist machine learning engineer ai researcher nlp engineer software developer

Alternative AI tools for ai-hub

Similar Open Source Tools

ai-hub

github

: 54

tt-metal

TT-NN is a python & C++ Neural Network OP library. It provides a low-level programming model, TT-Metalium, enabling kernel development for Tenstorrent hardware.

github

: 786

go-cyber

Cyber is a superintelligence protocol that aims to create a decentralized and censorship-resistant internet. It uses a novel consensus mechanism called CometBFT and a knowledge graph to store and process information. Cyber is designed to be scalable, secure, and efficient, and it has the potential to revolutionize the way we interact with the internet.

github

: 353

pmhub

PmHub is a smart project management system based on SpringCloud, SpringCloud Alibaba, and LLM. It aims to help students quickly grasp the architecture design and development process of microservices/distributed projects. PmHub provides a platform for students to experience the transformation from monolithic to microservices architecture, understand the pros and cons of both architectures, and prepare for job interviews. It offers popular technologies like SpringCloud-Gateway, Nacos, Sentinel, and provides high-quality code, continuous integration, product design documents, and an enterprise workflow system. PmHub is suitable for beginners and advanced learners who want to master core knowledge of microservices/distributed projects.

github

: 280

gpt_server

The GPT Server project leverages the basic capabilities of FastChat to provide the capabilities of an openai server. It perfectly adapts more models, optimizes models with poor compatibility in FastChat, and supports loading vllm, LMDeploy, and hf in various ways. It also supports all sentence_transformers compatible semantic vector models, including Chat templates with function roles, Function Calling (Tools) capability, and multi-modal large models. The project aims to reduce the difficulty of model adaptation and project usage, making it easier to deploy the latest models with minimal code changes.

github

: 163

LLamaTuner

LLamaTuner is a repository for the Efficient Finetuning of Quantized LLMs project, focusing on building and sharing instruction-following Chinese baichuan-7b/LLaMA/Pythia/GLM model tuning methods. The project enables training on a single Nvidia RTX-2080TI and RTX-3090 for multi-round chatbot training. It utilizes bitsandbytes for quantization and is integrated with Huggingface's PEFT and transformers libraries. The repository supports various models, training approaches, and datasets for supervised fine-tuning, LoRA, QLoRA, and more. It also provides tools for data preprocessing and offers models in the Hugging Face model hub for inference and finetuning. The project is licensed under Apache 2.0 and acknowledges contributions from various open-source contributors.

github

: 586

yudao-boot-mini

yudao-boot-mini is an open-source project focused on developing a rapid development platform for developers in China. It includes features like system functions, infrastructure, member center, data reports, workflow, mall system, WeChat official account, CRM, ERP, etc. The project is based on Spring Boot with Java backend and Vue for frontend. It offers various functionalities such as user management, role management, menu management, department management, workflow management, payment system, code generation, API documentation, database documentation, file service, WebSocket integration, message queue, Java monitoring, and more. The project is licensed under the MIT License, allowing both individuals and enterprises to use it freely without restrictions.

github

: 54

AstrBot

github

: 7.0k

Awesome-LLM-Eval

Awesome-LLM-Eval: a curated list of tools, benchmarks, demos, papers for Large Language Models (like ChatGPT, LLaMA, GLM, Baichuan, etc) Evaluation on Language capabilities, Knowledge, Reasoning, Fairness and Safety.

github

: 280

ruoyi-vue-pro

The ruoyi-vue-pro repository is an open-source project that provides a comprehensive development platform with various functionalities such as system features, infrastructure, member center, data reports, workflow, payment system, mall system, ERP system, CRM system, and AI big model. It is built using Java backend with Spring Boot framework and Vue frontend with different versions like Vue3 with element-plus, Vue3 with vben(ant-design-vue), and Vue2 with element-ui. The project aims to offer a fast development platform for developers and enterprises, supporting features like dynamic menu loading, button-level access control, SaaS multi-tenancy, code generator, real-time communication, integration with third-party services like WeChat, Alipay, and cloud services, and more.

github

: 28.9k

yudao-cloud

Yudao-cloud is an open-source project designed to provide a fast development platform for developers in China. It includes various system functions, infrastructure, member center, data reports, workflow, mall system, WeChat public account, CRM, ERP, etc. The project is based on Java backend with Spring Boot and Spring Cloud Alibaba microservices architecture. It supports multiple databases, message queues, authentication systems, dynamic menu loading, SaaS multi-tenant system, code generator, real-time communication, integration with third-party services like WeChat, Alipay, and more. The project is well-documented and follows the Alibaba Java development guidelines, ensuring clean code and architecture.

github

: 16.5k

KeepChatGPT

KeepChatGPT is a plugin designed to enhance the data security capabilities and efficiency of ChatGPT. It aims to make your chat experience incredibly smooth, eliminating dozens or even hundreds of unnecessary steps, and permanently getting rid of various errors and warnings. It offers innovative features such as automatic refresh, activity maintenance, data security, audit cancellation, conversation cloning, endless conversations, page purification, large screen display, full screen display, tracking interception, rapid changes, and detailed insights. The plugin ensures that your AI experience is secure, smooth, efficient, concise, and seamless.

github

: 13.7k

AstrBot

AstrBot is a powerful and versatile tool that leverages the capabilities of large language models (LLMs) like GPT-3, GPT-3.5, and GPT-4 to enhance communication and automate tasks. It seamlessly integrates with popular messaging platforms such as QQ, QQ Channel, and Telegram, enabling users to harness the power of AI within their daily conversations and workflows.

github

: 6.6k

ailia-models

The collection of pre-trained, state-of-the-art AI models. ailia SDK is a self-contained, cross-platform, high-speed inference SDK for AI. The ailia SDK provides a consistent C++ API across Windows, Mac, Linux, iOS, Android, Jetson, and Raspberry Pi platforms. It also supports Unity (C#), Python, Rust, Flutter(Dart) and JNI for efficient AI implementation. The ailia SDK makes extensive use of the GPU through Vulkan and Metal to enable accelerated computing. # Supported models 323 models as of April 8th, 2024

github

: 2.2k

adata

AData is a free and open-source A-share database that focuses on transaction-related data. It provides comprehensive data on stocks, including basic information, market data, and sentiment analysis. AData is designed to be easy to use and integrate with other applications, making it a valuable tool for quantitative trading and AI training.

github

: 1.9k

aidea-server

AIdea Server is an open-source Golang-based server that integrates mainstream large language models and drawing models. It supports various functionalities including OpenAI's GPT-3.5 and GPT-4, Anthropic's Claude instant and Claude 2.1, Google's Gemini Pro, as well as Chinese models like Tongyi Qianwen, Wenxin Yiyuan, and more. It also supports open-source large models like Yi 34B, Llama2, and AquilaChat 7B. Additionally, it provides features for text-to-image, super-resolution, coloring black and white images, generating art fonts and QR codes, among others.

github

: 1.6k

For similar tasks

arena-hard-auto

Arena-Hard-Auto-v0.1 is an automatic evaluation tool for instruction-tuned LLMs. It contains 500 challenging user queries. The tool prompts GPT-4-Turbo as a judge to compare models' responses against a baseline model (default: GPT-4-0314). Arena-Hard-Auto employs an automatic judge as a cheaper and faster approximator to human preference. It has the highest correlation and separability to Chatbot Arena among popular open-ended LLM benchmarks. Users can evaluate their models' performance on Chatbot Arena by using Arena-Hard-Auto.

github

: 394

max

The Modular Accelerated Xecution (MAX) platform is an integrated suite of AI libraries, tools, and technologies that unifies commonly fragmented AI deployment workflows. MAX accelerates time to market for the latest innovations by giving AI developers a single toolchain that unlocks full programmability, unparalleled performance, and seamless hardware portability.

github

: 341

ai-hub

github

: 54

long-context-attention

Long-Context-Attention (YunChang) is a unified sequence parallel approach that combines the strengths of DeepSpeed-Ulysses-Attention and Ring-Attention to provide a versatile and high-performance solution for long context LLM model training and inference. It addresses the limitations of both methods by offering no limitation on the number of heads, compatibility with advanced parallel strategies, and enhanced performance benchmarks. The tool is verified in Megatron-LM and offers best practices for 4D parallelism, making it suitable for various attention mechanisms and parallel computing advancements.

github

: 266

marlin

Marlin is a highly optimized FP16xINT4 matmul kernel designed for large language model (LLM) inference, offering close to ideal speedups up to batchsizes of 16-32 tokens. It is suitable for larger-scale serving, speculative decoding, and advanced multi-inference schemes like CoT-Majority. Marlin achieves optimal performance by utilizing various techniques and optimizations to fully leverage GPU resources, ensuring efficient computation and memory management.

github

: 542

MMC

This repository, MMC, focuses on advancing multimodal chart understanding through large-scale instruction tuning. It introduces a dataset supporting various tasks and chart types, a benchmark for evaluating reasoning capabilities over charts, and an assistant achieving state-of-the-art performance on chart QA benchmarks. The repository provides data for chart-text alignment, benchmarking, and instruction tuning, along with existing datasets used in experiments. Additionally, it offers a Gradio demo for the MMCA model.

github

: 75

Tiktoken

Tiktoken is a high-performance implementation focused on token count operations. It provides various encodings like o200k_base, cl100k_base, r50k_base, p50k_base, and p50k_edit. Users can easily encode and decode text using the provided API. The repository also includes a benchmark console app for performance tracking. Contributions in the form of PRs are welcome.

github

: 68

ppl.llm.serving

ppl.llm.serving is a serving component for Large Language Models (LLMs) within the PPL.LLM system. It provides a server based on gRPC and supports inference for LLaMA. The repository includes instructions for prerequisites, quick start guide, model exporting, server setup, client usage, benchmarking, and offline inference. Users can refer to the LLaMA Guide for more details on using this serving component.

github

: 126

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 855

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.3k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 30.6k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675

ai-hub

README:

AI Hub Project

简介

评估结果

英文翻译

编程

指令输出

大模型接入

Rest 服务

Java 代码接入

运行

Docker

数据库

前端

服务端

测试集

翻译

编程

z-bench 测试集

大模型列表

低成本模型

中低成本模型

中高成本模型

高成本模型

For Tasks:

For Jobs:

Alternative AI tools for ai-hub

Similar Open Source Tools

ai-hub

tt-metal

go-cyber

pmhub

gpt_server

LLamaTuner

yudao-boot-mini

AstrBot

Awesome-LLM-Eval

ruoyi-vue-pro

yudao-cloud

KeepChatGPT

AstrBot

ailia-models

adata

aidea-server

For similar tasks

arena-hard-auto

max

ai-hub

long-context-attention

marlin

MMC

Tiktoken

ppl.llm.serving

For similar jobs

weave

LLMStack

VisionCraft

kaito

PyRIT

tabby

spear

Magick