ai-app
This project shares hands-on techniques and experience for AI applications, including large language models, speech synthesis, digital humans, and image generation.
Stars: 103
The 'ai-app' repository is a comprehensive collection of tools and resources related to artificial intelligence, focusing on topics such as server environment setup, PyCharm and Anaconda installation, large model deployment and training, Transformer principles, RAG technology, vector databases, AI image, voice, and music generation, and AI Agent frameworks. It also includes practical guides and tutorials on implementing various AI applications. The repository serves as a valuable resource for individuals interested in exploring different aspects of AI technology.
README:
- 🔥 Server environment setup and common tools
- 🐎 A summary of calling large model APIs
- How large models work
- 🚅 Large model training
- 🍄 Retrieval-Augmented Generation (RAG)
- 🚀 Agent
- ♻️ AI image generation
- 🏡 AI speech synthesis
- 🎵 AI music generation
- 📐 Suno
- 📞 I built a virtual girlfriend with AI
- 💬 AI application discussion group
- 👥 WeChat official account
Language: Python
Model | Vendor | API docs | Supported models (closed-source) | Install dependency |
---|---|---|---|---|
chatglm | Zhipu AI (智谱) | Link | GLM-4, GLM-4V, GLM-3-Turbo | pip install zhipuai |
Spark (星火大模型) | iFLYTEK (科大讯飞) | Link | V1.5, V2.0, V3.0, and V3.5 | pip install --upgrade spark_ai_python |
Qwen (通义千问) | Alibaba (阿里) | Link | qwen-turbo, qwen-plus, qwen-max, etc. | pip install dashscope |
ERNIE Bot (文心一言) | Baidu (百度) | Link | ERNIE-4.0, ERNIE-3.5, ERNIE-Lite, etc. | pip install qianfan |
kimi | Moonshot AI (月之暗面) | Link | moonshot-v1-8k, moonshot-v1-32k, moonshot-v1-128k | pip install openai |
chatgpt | OpenAI | Link | GPT-4, GPT-3.5 | pip install openai |
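As a quick illustration of the table above, here is a minimal sketch of calling one of these APIs. It uses Kimi's OpenAI-compatible interface via the `openai` package; the API key handling is a placeholder, and the base URL follows Moonshot's published endpoint at the time of writing, so treat the details as illustrative rather than authoritative.

```python
# Minimal sketch: calling Kimi (Moonshot AI) through its OpenAI-compatible endpoint.
# Requires `pip install openai`; MOONSHOT_API_KEY is a placeholder you must set yourself.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],   # placeholder environment variable
    base_url="https://api.moonshot.cn/v1",    # Moonshot's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",                   # one of the models listed in the table
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain RAG in one sentence."},
    ],
    temperature=0.3,
)
print(response.choices[0].message.content)
```

The other vendors in the table each ship their own SDK (zhipuai, dashscope, qianfan, and so on), but the request shape is broadly similar: a model name plus a list of chat messages.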
- Deploying and testing the hundred-billion-parameter open-source model Qwen1.5-110B-Chat
- Which large models are currently worth deploying locally?
- Why does vLLM consume so much GPU memory?
- Local deployment of large models is such a hassle; why not just call an API instead?
- vllm
- How 8-bit and 4-bit quantization work (a toy sketch appears below)
- How should a beginner get started with large language models (LLMs)?
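The quantization write-up above is link-only here, so as a rough illustration of the core idea (my own sketch, not code from this repository): symmetric 8-bit quantization maps float weights to int8 with a per-tensor scale, and dequantization multiplies the scale back in.

```python
# Toy illustration of symmetric 8-bit weight quantization (not the repository's code).
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 using a single per-tensor scale."""
    scale = np.abs(w).max() / 127.0                       # largest magnitude maps to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
print("max reconstruction error:", np.abs(w - w_hat).max())
```

Production schemes (per-channel scales, 4-bit formats such as NF4, activation handling) add refinements on top, but they build on this same quantize/dequantize round trip.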
- Understanding backpropagation through matrix operations
- The attention mechanism in Transformers, illustrated
- An overview of the Llama 3 architecture
- Differences between the original Llama 3 weights and the HuggingFace weights
- Implementing Llama 3's normalization method, RMSNorm (a minimal sketch follows these Llama 3 items)
- Implementing Llama 3's Grouped Query Attention (GQA)
- Implementing Llama 3's masked grouped query attention
- Implementing Llama 3's SwiGLU feed-forward network (also covered in the sketch below)
- End-to-end Llama 3 inference
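Since the Llama 3 component articles above are linked rather than reproduced, here is a compact sketch of two of those components, RMSNorm and the SwiGLU feed-forward block, written from their published formulas rather than taken from this repository; the tensor sizes are illustrative only.

```python
# Minimal RMSNorm and SwiGLU blocks in the style of Llama-3-like models.
# Written from the published formulas; the hidden sizes below are illustrative only.
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))   # learned per-channel gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root-mean-square of the features; no mean subtraction.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

class SwiGLU(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: silu(x W_gate) * (x W_up), projected back down to the model dimension.
        return self.w_down(torch.nn.functional.silu(self.w_gate(x)) * self.w_up(x))

x = torch.randn(2, 8, 64)              # (batch, seq, dim), toy sizes
y = SwiGLU(64, 172)(RMSNorm(64)(x))
print(y.shape)                         # torch.Size([2, 8, 64])
```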
- Pre-training, instruction fine-tuning, and RLHF explained in one article
- How beginners can quickly get started with fine-tuning large models
- Prompt-Tuning, a parameter-efficient fine-tuning method
- LoRA fine-tuning explained, from theory to practice
- How do you train an LLM from scratch?
Project | Tutorial | Code |
---|---|---|
How do you train an LLM from scratch? | https://www.zhihu.com/question/641255219/answer/3625159394 | Companion code |
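The from-scratch training tutorial above is linked rather than reproduced, so the following is only a schematic of the core step it presumably builds on: next-token prediction with cross-entropy over shifted targets. The model and data here are stand-ins, not the tutorial's code.

```python
# Schematic of a next-token-prediction training step (stand-in model and data).
# A real decoder would also apply a causal attention mask; omitted to keep the sketch short.
import torch
import torch.nn as nn

vocab_size, dim, seq_len = 1000, 64, 32

# Stand-in "language model": embedding -> small Transformer stack -> vocab logits.
model = nn.Sequential(
    nn.Embedding(vocab_size, dim),
    nn.TransformerEncoder(
        nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2
    ),
    nn.Linear(dim, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (8, seq_len + 1))    # fake pre-tokenized batch
inputs, targets = tokens[:, :-1], tokens[:, 1:]            # predict the next token

logits = model(inputs)                                     # (batch, seq, vocab)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(float(loss))
```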
Project | Tutorial | Code |
---|---|---|
Implementing LoRA fine-tuning from scratch, without a fine-tuning framework | | Companion code |
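To make the "LoRA from scratch" entry above concrete, here is a minimal sketch of the core idea: freeze the original weight and learn a low-rank update B·A that is added to its output. This is my own illustration, not the companion code from the tutorial.

```python
# Minimal LoRA-style linear layer: frozen base weight plus a trainable low-rank update.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # freeze the pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scaling * x A^T B^T
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(64, 64))
trainable = [n for n, p in layer.named_parameters() if p.requires_grad]
print(trainable)   # only ['lora_a', 'lora_b'] are trained
```

During fine-tuning only the two small matrices receive gradients, which is what keeps LoRA's memory footprint low; at inference time the update can be merged back into the base weight.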
- Fine-tune BaiChuan2-13B on a single machine with multiple GPUs and publish it as a service, with just three scripts
- Fine-tune BaiChuan2-13B across multiple machines and GPUs and publish it as a service, with just three scripts
- Deploy BaiChuan2-13B with vLLM, with just three scripts (a generic vLLM sketch follows this list)
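The three-script BaiChuan2 guides above are linked, not reproduced; the snippet below is only a generic offline-inference sketch with vLLM's Python API to show roughly what such a deployment involves. The model path, parallelism, and sampling settings are placeholders to adjust for your hardware.

```python
# Rough sketch of offline inference with vLLM (placeholder model path and settings,
# not the scripts from the tutorials above).
from vllm import LLM, SamplingParams

llm = LLM(
    model="baichuan-inc/Baichuan2-13B-Chat",  # placeholder HF model id or local path
    trust_remote_code=True,                   # BaiChuan ships custom modelling code
    tensor_parallel_size=2,                   # split across 2 GPUs; adjust to your hardware
    gpu_memory_utilization=0.90,              # fraction of VRAM vLLM pre-allocates (mostly KV cache)
)

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
outputs = llm.generate(["Introduce vector databases in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```

The large pre-allocation controlled by `gpu_memory_utilization` is also the short answer to the "why does vLLM consume so much GPU memory" question above: most of it is reserved KV cache, not model weights.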
Project | Fine-tuning method | Tutorial | Video tutorial | Dependencies |
---|---|---|---|---|
Fine-tuning a large model for sentiment prediction | LoRA | I used LLaMA-Factory to fine-tune a large model for product-review sentiment analysis and reached 91.70% accuracy | Video link | Companion code |
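As a hedged illustration of how an adapter produced by such a LoRA fine-tune can be used afterwards (LLaMA-Factory's own CLI handles the actual training), one common route is to load the adapter with the peft library. The base model id, adapter path, and prompt wording below are hypothetical placeholders, not values from the tutorial.

```python
# Hypothetical example: loading a LoRA adapter for sentiment prediction with peft.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2-7B-Instruct"          # placeholder base model
adapter_dir = "./output/sentiment-lora"      # placeholder path to the trained adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_dir)   # attach the LoRA weights

prompt = "判断下面这条商品评论的情感(正面/负面):质量很好,物流也快。"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```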
The overall RAG pipeline can be divided into three major stages and eight sub-steps (illustrated in the repository's diagram): knowledge base construction, knowledge retrieval, and knowledge-based question answering. The core is knowledge base construction.
For a detailed introduction, see:
Problems with vanilla RAG:
First, handling complex and varied data: many formats, and harder cases such as charts, PDFs, and deeply nested Excel sheets. Doing RAG naively on top of data you have not actually understood leads to failed retrieval, and therefore a failed RAG system.
Second, querying and ranking. If the knowledge base holds 100,000 entries, matching a question against all of them and surfacing the few most relevant is like looking for a needle in a haystack.
How does ragflow improve on these problems?
First, its deep document understanding module, deepdoc, extracts meaningful content from unstructured data in a wide range of complex formats.
Second, it introduces multi-path recall and reranking, which keeps retrieval accurate.
Project: ragflow
Vector database | Data processing | Semantic recall | Tutorial | Video |
---|---|---|---|---|
Elasticsearch | deepdoc module | Multi-path recall with fused reranking | I built a middle-school history tutoring assistant with ragflow | https://www.bilibili.com/video/BV1yw4m1y7yA/ |
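The table above mentions multi-path recall with fused reranking. As a rough sketch of the general idea (not ragflow's actual implementation), reciprocal rank fusion merges the ranked lists returned by different retrievers, for example keyword search and vector search:

```python
# Toy reciprocal-rank fusion (RRF) of two recall paths; the general idea behind
# multi-path recall plus fused reranking, not ragflow's actual implementation.
from collections import defaultdict

def rrf_fuse(ranked_lists, k: int = 60):
    """Merge several ranked lists of document ids into one fused ranking."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] += 1.0 / (k + rank + 1)   # earlier rank, larger contribution
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]    # e.g. from Elasticsearch keyword search
vector_hits = ["doc1", "doc5", "doc3"]     # e.g. from embedding similarity search
print(rrf_fuse([keyword_hits, vector_hits]))  # documents found by both paths float to the top
```

A dedicated reranker model can then rescore the fused shortlist before it is passed to the LLM.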
The code on GitHub is heavily abstracted, which makes it hard for beginners to learn from.
Mainstream RAG frameworks such as LangChain exist, but their source code is hard to read and painful to debug.
So I wrote a RAG framework whose code anyone can read and anyone can modify.
The overall project structure is shown in the diagram: A hands-on guide to the RAG framework architecture.
Code and tutorials:
Chapter | Tutorial | Code |
---|---|---|
01. Calling a large model API | Building a RAG knowledge QA application, step by step - 01. Calling a large model API | Companion code |
02. Introduction to RAG | Building a RAG knowledge QA application, step by step - 02. Introduction to RAG | / |
03. Environment setup | Building a RAG knowledge QA application, step by step - 03. Preparing the project dependencies | / |
04. Building the knowledge base | Building a RAG knowledge QA application, step by step - 04. Building the knowledge base | / |
05. QA over the knowledge base | Building a RAG knowledge QA application, step by step - 05. Question answering over the knowledge base | Companion code |
06. Improvement: using your own embedding model | Building a RAG knowledge QA application, step by step - 06. Using your own embedding model | Companion code |
07. Packaging the service as an image | In progress | In progress |
08. Improvement: building the knowledge index with Faiss | In progress | Companion code |
09. Improvement: using a vector database | In progress | Companion code |
10. Building the frontend | In progress | In progress |
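The chapters above walk through building a small RAG pipeline step by step. The snippet below is only a compressed sketch of the retrieval core they cover; the embedding model name and the toy corpus are placeholders, and the actual chapters use their own code.

```python
# Compressed sketch of the retrieval core of a simple RAG pipeline:
# embed the chunks, embed the query, take the top-k chunks by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

chunks = [
    "RAG retrieves relevant chunks from a knowledge base before answering.",
    "LoRA fine-tunes a model by learning low-rank weight updates.",
    "vLLM pre-allocates GPU memory for the KV cache.",
]

embedder = SentenceTransformer("BAAI/bge-small-zh-v1.5")   # placeholder embedding model
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, top_k: int = 2):
    q = embedder.encode([query], normalize_embeddings=True)[0]
    sims = chunk_vecs @ q                 # cosine similarity, since vectors are normalized
    best = np.argsort(-sims)[:top_k]
    return [(chunks[i], float(sims[i])) for i in best]

for text, score in retrieve("What does RAG do?"):
    print(f"{score:.3f}  {text}")
# The retrieved chunks are then stuffed into the prompt sent to the LLM; chapters 08 and 09
# swap this brute-force search for a Faiss index or a vector database.
```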
Project | Tutorial | Code |
---|---|---|
Hand-writing function calling for the Qwen2 model | https://zhuanlan.zhihu.com/p/730995043 | Companion code |
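To give a flavour of what "hand-writing function calling" involves (a generic sketch, not the Qwen2 tutorial's code): the model is prompted to emit a JSON tool call, which the application parses and dispatches to a local Python function.

```python
# Generic sketch of the application side of function calling: parse the JSON tool call
# the model emitted and dispatch it to a local function. The tool name and the model
# output string below are made up for illustration.
import json

def get_weather(city: str) -> str:
    """A toy local tool the model is allowed to call."""
    return f"{city}: 26°C, sunny"

TOOLS = {"get_weather": get_weather}

# Pretend the model replied with this tool-call JSON (real code would get it from the LLM).
model_output = '{"name": "get_weather", "arguments": {"city": "Beijing"}}'

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
print(result)   # the result is then fed back to the model as the tool's response
```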
If the images fail to load, you can add me on WeChat: Code-GUO
If you have any questions, please open an issue. If anything here violates rules or infringes on rights, contact me at [email protected] and I will remove the offending link immediately. Thank you!
This repository is for learning and sharing only; no part of it is intended for commercial use.
Similar Open Source Tools
DISC-LawLLM
DISC-LawLLM is a legal domain large model that aims to provide professional, intelligent, and comprehensive **legal services** to users. It is developed and open-sourced by the Data Intelligence and Social Computing Lab (Fudan-DISC) at Fudan University.
llms-from-scratch-cn
This repository provides a detailed tutorial on how to build your own large language model (LLM) from scratch. It includes all the code necessary to create a GPT-like LLM, covering the encoding, pre-training, and fine-tuning processes. The tutorial is written in a clear and concise style, with plenty of examples and illustrations to help you understand the concepts involved. It is suitable for developers and researchers with some programming experience who are interested in learning more about LLMs and how to build them.
Tiktoken
Tiktoken is a high-performance implementation focused on token count operations. It provides various encodings like o200k_base, cl100k_base, r50k_base, p50k_base, and p50k_edit. Users can easily encode and decode text using the provided API. The repository also includes a benchmark console app for performance tracking. Contributions in the form of PRs are welcome.
MedicalGPT
MedicalGPT trains a medical GPT model with a ChatGPT-style training pipeline, implementing pretraining, supervised fine-tuning, RLHF (reward modeling and reinforcement learning), and DPO (Direct Preference Optimization).
pmhub
PmHub is a smart project management system based on SpringCloud, SpringCloud Alibaba, and LLM. It aims to help students quickly grasp the architecture design and development process of microservices/distributed projects. PmHub provides a platform for students to experience the transformation from monolithic to microservices architecture, understand the pros and cons of both architectures, and prepare for job interviews. It offers popular technologies like SpringCloud-Gateway, Nacos, Sentinel, and provides high-quality code, continuous integration, product design documents, and an enterprise workflow system. PmHub is suitable for beginners and advanced learners who want to master core knowledge of microservices/distributed projects.
Chinese-LLaMA-Alpaca
This project open sources the **Chinese LLaMA model and the Alpaca large model fine-tuned with instructions**, to further promote the open research of large models in the Chinese NLP community. These models **extend the Chinese vocabulary based on the original LLaMA** and use Chinese data for secondary pre-training, further enhancing the basic Chinese semantic understanding ability. At the same time, the Chinese Alpaca model further uses Chinese instruction data for fine-tuning, significantly improving the model's understanding and execution of instructions.
Chinese-LLaMA-Alpaca-3
Chinese-LLaMA-Alpaca-3 is a project based on Meta's latest release of the new generation open-source large model Llama-3. It is the third phase of the Chinese-LLaMA-Alpaca open-source large model series projects (Phase 1, Phase 2). This project open-sources the Chinese Llama-3 base model and the Chinese Llama-3-Instruct instruction fine-tuned large model. These models incrementally pre-train with a large amount of Chinese data on the basis of the original Llama-3 and further fine-tune using selected instruction data, enhancing Chinese basic semantics and instruction understanding capabilities. Compared to the second-generation related models, significant performance improvements have been achieved.
Chinese-LLaMA-Alpaca-2
Chinese-LLaMA-Alpaca-2 is a large Chinese language model built on Meta's Llama-2 and further trained on a large corpus of Chinese text, including books, news articles, and social media posts. It comes in sizes up to 13 billion parameters and can be used for a variety of natural language processing tasks, including text generation, question answering, and machine translation. It is open source and available for anyone to use, making it a valuable resource for researchers and developers working on Chinese NLP.
MiniCPM
MiniCPM is a series of open-source client-side large models jointly developed by ModelBest (面壁智能) and the Tsinghua University Natural Language Processing Laboratory. The main language model, MiniCPM-2B, has only 2.4 billion (2.4B) non-embedding parameters, 2.7B in total. - After SFT, MiniCPM-2B performs comparably to Mistral-7B on public comprehensive benchmarks (and is stronger in Chinese, mathematics, and code), and overall outperforms models such as Llama2-13B, MPT-30B, and Falcon-40B. - After DPO, MiniCPM-2B also surpasses many representative open-source large models such as Llama2-70B-Chat, Vicuna-33B, Mistral-7B-Instruct-v0.1, and Zephyr-7B-alpha on MTBench, the evaluation set closest to real user experience. - Based on MiniCPM-2B, the client-side multimodal large model MiniCPM-V 2.0 achieves the best performance among models below 7B on multiple benchmarks and surpasses larger models such as Qwen-VL-Chat 9.6B, CogVLM-Chat 17.4B, and Yi-VL 34B on the OpenCompass leaderboard. MiniCPM-V 2.0 also demonstrates leading OCR capabilities, approaching Gemini Pro in scene text recognition. - After Int4 quantization, MiniCPM can be deployed and run on mobile phones, with streaming output slightly faster than human speech, and MiniCPM-V likewise runs as an on-device multimodal model. - A single 1080/2080 GPU can run parameter-efficient fine-tuning, and a single 3090/4090 can run full-parameter fine-tuning; a single machine can train MiniCPM continuously, keeping secondary development costs low.
Awesome-Jailbreak-on-LLMs
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, and exciting jailbreak methods on Large Language Models (LLMs). The repository contains papers, codes, datasets, evaluations, and analyses related to jailbreak attacks on LLMs. It serves as a comprehensive resource for researchers and practitioners interested in exploring various jailbreak techniques and defenses in the context of LLMs. Contributions such as additional jailbreak-related content, pull requests, and issue reports are welcome, and contributors are acknowledged. For any inquiries or issues, contact [email protected]. If you find this repository useful for your research or work, consider starring it to show appreciation.
adata
AData is a free and open-source A-share database that focuses on transaction-related data. It provides comprehensive data on stocks, including basic information, market data, and sentiment analysis. AData is designed to be easy to use and integrate with other applications, making it a valuable tool for quantitative trading and AI training.
AI-Competition-Collections
AI-Competition-Collections is a repository that collects and curates various experiences and tips from AI competitions. It includes posts on competition experiences in computer vision, NLP, speech, and other AI-related fields. The repository aims to provide valuable insights and techniques for individuals participating in AI competitions, covering topics such as image classification, object detection, OCR, adversarial attacks, and more.
MindChat
MindChat is a psychological large language model designed to help individuals relieve psychological stress and solve mental confusion, ultimately improving mental health. It aims to provide a relaxed and open conversation environment for users to build trust and understanding. MindChat offers privacy, warmth, safety, timely, and convenient conversation settings to help users overcome difficulties and challenges, achieve self-growth, and development. The tool is suitable for both work and personal life scenarios, providing comprehensive psychological support and therapeutic assistance to users while strictly protecting user privacy. It combines psychological knowledge with artificial intelligence technology to contribute to a healthier, more inclusive, and equal society.
PaddleScience
PaddleScience is a scientific computing suite developed based on the deep learning framework PaddlePaddle. It utilizes the learning ability of deep neural networks and the automatic (higher-order) differentiation mechanism of PaddlePaddle to solve problems in physics, chemistry, meteorology, and other fields. It supports three solving methods: physics mechanism-driven, data-driven, and mathematical fusion, and provides basic APIs and detailed documentation for users to use and further develop.
gpupixel
GPUPixel is a real-time, high-performance image and video filter library written in C++11 and based on OpenGL/ES. It incorporates a built-in beauty face filter that achieves commercial-grade beauty effects. The library is extremely easy to compile and integrate with a small size, supporting platforms including iOS, Android, Mac, Windows, and Linux. GPUPixel provides various filters like skin smoothing, whitening, face slimming, big eyes, lipstick, and blush. It supports input formats like YUV420P, RGBA, JPEG, PNG, and output formats like RGBA and YUV420P. The library's performance on devices like iPhone and Android is optimized, with low CPU usage and fast processing times. GPUPixel's lib size is compact, making it suitable for mobile and desktop applications.
For similar tasks
AutoGPT
AutoGPT is a revolutionary tool that empowers everyone to harness the power of AI. With AutoGPT, you can effortlessly build, test, and delegate tasks to AI agents, unlocking a world of possibilities. Our mission is to provide the tools you need to focus on what truly matters: innovation and creativity.
agent-os
The Agent OS is an experimental framework and runtime to build sophisticated, long running, and self-coding AI agents. We believe that the most important super-power of AI agents is to write and execute their own code to interact with the world. But for that to work, they need to run in a suitable environment—a place designed to be inhabited by agents. The Agent OS is designed from the ground up to function as a long-term computing substrate for these kinds of self-evolving agents.
chatdev
ChatDev IDE is a tool for building your AI agents. Whether it's NPCs in games or powerful agent tools, you can design what you want on this platform. It accelerates prompt engineering through **JavaScript Support**, which allows implementing complex prompting techniques.
module-ballerinax-ai.agent
This library provides functionality required to build ReAct Agent using Large Language Models (LLMs).
npi
NPi is an open-source platform providing Tool-use APIs to empower AI agents with the ability to take action in the virtual world. It is currently under active development, and the APIs are subject to change in future releases. NPi offers a command line tool for installation and setup, along with a GitHub app for easy access to repositories. The platform also includes a Python SDK and examples like Calendar Negotiator and Twitter Crawler. Join the NPi community on Discord to contribute to the development and explore the roadmap for future enhancements.
ai-agents
The 'ai-agents' repository is a collection of books and resources focused on developing AI agents, including topics such as GPT models, building AI agents from scratch, machine learning theory and practice, and basic methods and tools for data analysis. The repository provides detailed explanations and guidance for individuals interested in learning about and working with AI agents.
llms
The 'llms' repository is a comprehensive guide on Large Language Models (LLMs), covering topics such as language modeling, applications of LLMs, statistical language modeling, neural language models, conditional language models, evaluation methods, transformer-based language models, practical LLMs like GPT and BERT, prompt engineering, fine-tuning LLMs, retrieval augmented generation, AI agents, and LLMs for computer vision. The repository provides detailed explanations, examples, and tools for working with LLMs.
For similar jobs
promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.
deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.
MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aim to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our overview of the field, affectionately titled "Everything I know about machine learning and camera traps".
leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.
llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.
carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.
AI-YinMei
AI-YinMei is a development tool for AI virtual anchors (VTubers), NVIDIA GPU edition. It supports knowledge-base chat through fastgpt, with a complete LLM stack of [fastgpt] + [one-api] + [Xinference]; replying to Bilibili live-stream chat messages and greeting viewers as they enter the room; speech synthesis via Microsoft edge-tts, Bert-VITS2, and GPT-SoVITS; expression control through VTube Studio; sending stable-diffusion-webui image output to an OBS live room, with NSFW detection for generated images; image search via DuckDuckGo (requires a proxy) and Baidu image search (no proxy needed); an AI reply chat box and a playlist as HTML plug-ins; AI singing via Auto-Convert-Music; dancing, expression video playback, head-patting and gift-smashing reactions, automatic dancing when a song starts, and idle swaying during chat and singing; multi-scene switching, background-music switching, and automatic day/night scene changes; and an open singing-and-painting mode in which the AI decides the content on its own.