awesome-llm-and-aigc

🚀🚀🚀 A collection of some awesome public projects about Large Language Model(LLM), Visual Language Model(VLM), AI Generated Content(AIGC), the related Datasets and Applications.

Stars: 536

README:

Awesome-llm-and-aigc

🚀🚀🚀 This repository lists some awesome public projects about Large Language Model(LLM), Visual Language Model(VLM), AI Generated Content(AIGC), the related Datasets and Applications.

Contents

Summary

  • Frameworks

    • Official Version

      • Neural Network Architecture
      • Large Language Model (LLM)
        • GPT-1 : "Improving Language Understanding by Generative Pre-Training". (cs.ubc.ca, 2018).

        • GPT-2 : "Language Models are Unsupervised Multitask Learners". (OpenAI blog, 2019). Better language models and their implications.

        • GPT-3 : "Language Models are Few-Shot Learners". (arXiv 2020).

        • InstructGPT : "Training language models to follow instructions with human feedback". (arXiv 2022). "Aligning language models to follow instructions". (OpenAI blog, 2022).

        • ChatGPT: Optimizing Language Models for Dialogue.

        • GPT-4: GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses. "Sparks of Artificial General Intelligence: Early experiments with GPT-4". (arXiv 2023). "GPT-4 Architecture, Infrastructure, Training Dataset, Costs, Vision, MoE". (SemiAnalysis, 2023).

        • Llama 2 : Inference code for LLaMA models. "LLaMA: Open and Efficient Foundation Language Models". (arXiv 2023). "Llama 2: Open Foundation and Fine-Tuned Chat Models". (ai.meta.com, 2023-07-18). (2023-07-18, Llama 2 is here - get it on Hugging Face).

        • Llama 3 : The official Meta Llama 3 GitHub site.

        • Gemma : The official PyTorch implementation of Google's Gemma models. ai.google.dev/gemma

        • Grok-1 : This repository contains JAX example code for loading and running the Grok-1 open-weights model.

        • Claude : Claude is a next-generation AI assistant based on Anthropic’s research into training helpful, honest, and harmless AI systems.

        • Whisper : Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. "Robust Speech Recognition via Large-Scale Weak Supervision". (arXiv 2022).

        • OpenChat : OpenChat: Advancing Open-source Language Models with Imperfect Data. huggingface.co/openchat/openchat

        • GPT-Engineer : Specify what you want it to build, the AI asks for clarification, and then builds it. GPT Engineer is made to be easy to adapt, extend, and make your agent learn how you want your code to look. It generates an entire codebase based on a prompt.

        • StableLM : StableLM: Stability AI Language Models.

        • JARVIS : JARVIS, a system to connect LLMs with ML community. "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace". (arXiv 2023).

        • MiniGPT-4 : MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models. minigpt-4.github.io

        • minGPT : A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training.

        • nanoGPT : The simplest, fastest repository for training/finetuning medium-sized GPTs.

        • MicroGPT : A simple and effective autonomous agent compatible with GPT-3.5-Turbo and GPT-4. MicroGPT aims to be as compact and reliable as possible.

        • Dolly : Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform. Hello Dolly: Democratizing the magic of ChatGPT with open models

        • LMFlow : An extensible, convenient, and efficient toolbox for finetuning large machine learning models, designed to be user-friendly, speedy and reliable, and accessible to the entire community. Large Language Model for All. optimalscale.github.io/LMFlow/

        • Colossal-AI : Making big AI models cheaper, easier, and scalable. www.colossalai.org. "Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training". (arXiv 2021).

        • Lit-LLaMA : ⚡ Lit-LLaMA. Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

        • GPT-4-LLM : "Instruction Tuning with GPT-4". (arXiv 2023). instruction-tuning-with-gpt-4.github.io/

        • Stanford Alpaca : Stanford Alpaca: An Instruction-following LLaMA Model.

        • Liger-Kernel : Efficient Triton Kernels for LLM Training. arxiv.org/pdf/2410.10989

        • FlagGems : FlagGems is a high-performance general operator library implemented in OpenAI Triton. It aims to provide a suite of kernel functions to accelerate LLM training and inference.

        • feizc/Visual-LLaMA : Open LLaMA Eyes to See the World. This project aims to optimize the LLaMA model for visual information understanding like GPT-4 and to further explore the potential of large language models.

        • Lightning-AI/lightning-colossalai : Efficient Large-Scale Distributed Training with Colossal-AI and Lightning AI.

        • GPT4All : GPT4All: An ecosystem of open-source on-edge large language models. GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer-grade CPUs.

        • ChatALL : Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFlytek Spark, ERNIE and more, and discover the best answers. chatall.ai

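        • 1595901624/gpt-aggregated-edition : Aggregates multiple platforms, including the official ChatGPT, a free ChatGPT, ERNIE Bot (文心一言), Poe, and chatchat, and supports importing custom platforms.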

        • FreedomIntelligence/LLMZoo : ⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡ Tech Report

        • shm007g/LLaMA-Cult-and-More : News about 🦙 Cult and other AIGC models.

        • X-PLUG/mPLUG-Owl : mPLUG-Owl🦉: Modularization Empowers Large Language Models with Multimodality.

        • i-Code : The ambition of the i-Code project is to build integrative and composable multimodal Artificial Intelligence. The "i" stands for integrative multimodal learning. "CoDi: Any-to-Any Generation via Composable Diffusion". (arXiv 2023).

        • WorkGPT : WorkGPT is an agent framework in a similar fashion to AutoGPT or LangChain.

        • h2oGPT : h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document(s) question-answer capabilities. "h2oGPT: Democratizing Large Language Models". (arXiv 2023).

        • LongLLaMA : LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.

        • LLaMA-Adapter : Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters. LLaMA-Adapter: Efficient Fine-tuning of LLaMA 🚀

        • DemoGPT : Create 🦜️🔗 LangChain apps by just using prompts with the power of Llama 2. 🌟 Star to support our work! DemoGPT: Auto Gen-AI App Generator with the Power of Llama 2. ⚡ With just a prompt, you can create interactive Streamlit apps via 🦜️🔗 LangChain's transformative capabilities & Llama 2. ⚡ demogpt.io

        • Lamini : Lamini: The LLM engine for rapidly customizing models 🦙

        • xorbitsai/inference : Xorbits Inference (Xinference) is a powerful and versatile library designed to serve LLMs, speech recognition models, and multimodal models, even on your laptop. It supports a variety of models compatible with GGML, such as llama, chatglm, baichuan, whisper, vicuna, orca, and many others.

        • epfLLM/Megatron-LLM : distributed trainer for LLMs.

        • AmineDiro/cria : OpenAI compatible API for serving LLAMA-2 model.

        • Llama-2-Onnx : Llama 2 Powered By ONNX.

        • gpt-llm-trainer : The goal of this project is to explore an experimental new pipeline to train a high-performing task-specific model. We try to abstract away all the complexity, so it's as easy as possible to go from idea -> performant fully-trained model.

        • Qwen(通义千问) : The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

        • Qwen2.5 : Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

        • ChatGLM-6B : ChatGLM-6B: An Open Bilingual Dialogue Language Model. ChatGLM-6B is an open-source dialogue language model that supports both Chinese and English, based on the General Language Model (GLM) architecture, with 6.2 billion parameters. "GLM: General Language Model Pretraining with Autoregressive Blank Infilling". (ACL 2022). "GLM-130B: An Open Bilingual Pre-trained Model". (ICLR 2023).

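        • ChatGLM2-6B : ChatGLM2-6B: An Open Bilingual Chat LLM. ChatGLM2-6B is the second-generation version of the open-source Chinese-English bilingual chat model ChatGLM-6B. It retains the smooth conversation flow and low deployment threshold of the first-generation model while adding stronger performance, more efficient inference, and a more open license.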

        • ChatGLM3 : ChatGLM3 series: Open Bilingual Chat LLMs.

        • InternLM(书生·浦语) : Official release of InternLM2 7B and 20B base and chat models. 200K context support. internlm.intern-ai.org.cn/

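        • Baichuan-7B(百川-7B) : A large-scale 7B pretraining language model developed by BaiChuan-Inc. Baichuan-7B is an open-source, commercially usable large-scale pre-trained language model developed by Baichuan Intelligent Technology. Built on the Transformer architecture, it has 7 billion parameters, was trained on about 1.2 trillion tokens, supports Chinese and English, and has a context window of 4096 tokens. It reports the best results among models of the same size on standard Chinese and English benchmarks (C-Eval/MMLU). huggingface.co/baichuan-inc/baichuan-7B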

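        • Baichuan-13B(百川-13B) : A 13B large language model developed by Baichuan Intelligent Technology. Baichuan-13B is an open-source, commercially usable large language model with 13 billion parameters, developed as the successor to Baichuan-7B. It reports the best results among models of the same size on authoritative Chinese and English benchmarks. This release includes a pre-trained version (Baichuan-13B-Base) and an aligned version (Baichuan-13B-Chat). huggingface.co/baichuan-inc/Baichuan-13B-Chat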

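        • Baichuan2 : A series of large language models developed by Baichuan Intelligent Technology. Baichuan 2 is the new generation of open-source large language models from Baichuan Intelligent Technology, trained on a high-quality corpus of 2.6 trillion tokens. It reports the best results among models of the same size on multiple authoritative Chinese, English, and multilingual benchmarks, both general and domain-specific. This release includes Base and Chat versions at 7B and 13B, plus 4-bit quantized Chat versions. huggingface.co/baichuan-inc. "Baichuan 2: Open Large-scale Language Models". (arXiv 2023).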

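        • MOSS : An open-source tool-augmented conversational language model from Fudan University. MOSS is an open-source conversational language model that supports Chinese, English, and a variety of plugins. The moss-moon series models have 16 billion parameters and can run on a single A100/A800 or two 3090 GPUs at FP16 precision, or on a single 3090 GPU at INT4/8 precision. The MOSS base language model was pre-trained on about 700 billion Chinese, English, and code words, and then gained multi-turn dialogue and plugin-use abilities through conversational instruction fine-tuning, plugin-augmented learning, and human preference training. txsun1997.github.io/blogs/moss.html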

        • BayLing(百聆) : BayLing is an English/Chinese LLM equipped with advanced language alignment, showing superior capability in English/Chinese generation, instruction following and multi-turn interaction, and achieving 90% of ChatGPT's performance on multiple tests. nlp.ict.ac.cn/bayling. "BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models". (arXiv 2023).

        • FlagAI(悟道·天鹰(Aquila)) : FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale model. Our goal is to support training, fine-tuning, and deployment of large-scale models on various downstream tasks with multi-modality.

        • YuLan-Chat(玉兰) : YuLan-Chat models are chat-based large language models developed by researchers at GSAI, Renmin University of China (YuLan, which stands for Yulan Magnolia, is the campus flower of Renmin University of China). The newest version is developed by continually pre-training and instruction-tuning LLaMA-2 on high-quality English and Chinese data.

        • Yi-1.5 : Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability.

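        • 智海-录问 : 智海-录问 (wisdomInterrogatory) is a legal large language model jointly designed and developed by Zhejiang University, Alibaba DAMO Academy, and 华院计算. Its core idea is to pursue "shared legal education and improved judicial efficiency" by supporting the integration of intelligent legal systems into judicial practice, the construction of digitized case resources, and the enablement of virtual legal consultation services, forming a digital and intelligent foundation for the judiciary.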

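        • 活字 : 活字 is an open-source, commercially usable large-scale pre-trained language model developed by teachers and students at the Harbin Institute of Technology's natural language processing institute. It is a 7-billion-parameter model based on the Bloom architecture, supports Chinese and English, and has a context window of 2048 tokens. It achieves strong results among models of the same size on standard Chinese and English benchmarks and in subjective evaluations.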

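        • MiLM-6B : MiLM-6B is a large-scale pre-trained language model developed by Xiaomi, with 6.4 billion parameters. It reports the best results among models of the same size on C-Eval and CMMLU.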

        • Chinese LLaMA and Alpaca : Chinese LLaMA & Alpaca large language models with local CPU/GPU training and deployment. "Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca". (arXiv 2023).

        • Chinese-LLaMA-Alpaca-2 : Phase 2 of the Chinese LLaMA-2 & Alpaca-2 large model project (Chinese LLaMA-2 & Alpaca-2 LLMs).

        • FlagAlpha/Llama2-Chinese : The Llama Chinese community, billed as the best Chinese Llama large model; fully open source and commercially usable.

        • michael-wzhu/Chinese-LlaMA2 : Repo for adapting Meta LlaMA2 in Chinese! A Chinese adaptation of Meta's newly released LlaMA2 (fully open source and commercially usable).

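        • CPM-Bee : CPM-Bee is a fully open-source, commercially usable Chinese-English base model with ten billion parameters, and the second milestone trained through CPM-Live.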

        • PandaLM : PandaLM: Reproducible and Automated Language Model Assessment.

        • SpeechGPT : "SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities". (arXiv 2023).

        • GPT2-Chinese : Chinese version of GPT2 training code, using BERT tokenizer.

        • Chinese-Tiny-LLM : "Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model". (arXiv 2024).

        • 潘多拉 (Pandora) : Pandora, a ChatGPT that helps you breathe smoothly.

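        • 百度-文心大模型 (Baidu ERNIE) : Baidu's new-generation knowledge-enhanced large language model and a new member of the ERNIE (文心) large model family. It can converse with people, answer questions, assist with creative work, and help people obtain information, knowledge, and inspiration efficiently and conveniently.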

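        • 百度智能云-千帆大模型 (Baidu AI Cloud Qianfan) : The Baidu AI Cloud Qianfan large model platform is a one-stop enterprise-grade large model platform that provides an advanced end-to-end toolchain for producing and developing generative AI applications.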

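        • 华为云-盘古大模型 (Huawei Cloud Pangu) : The Pangu large models focus on industry applications, building industry-specific large models and capability sets for finance, government, manufacturing, mining, meteorology, railways, and other fields. They combine industry know-how with large model capabilities to reshape a wide range of industries and to serve as expert assistants for organizations, enterprises, and individuals. "Accurate medium-range global weather forecasting with 3D neural networks". (Nature 2023).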

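        • 商汤科技-日日新SenseNova : SenseNova (日日新) is the large model system announced by SenseTime. It includes the natural language processing model SenseChat (商量), the text-to-image model 秒画, and the digital human video generation platform SenseAvatar (如影).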

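        • 科大讯飞-星火认知大模型 (iFlytek Spark) : A new-generation cognitive intelligence large model with cross-domain knowledge and language understanding, able to understand and carry out tasks through natural dialogue.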

        • 字节跳动-豆包 (ByteDance Doubao) : Doubao.

        • CrazyBoyM/llama3-Chinese-chat : Chinese version of Llama3.

      • Visual Language Model (VLM)
        • LLaVA : 🌋 LLaVA: Large Language and Vision Assistant. Visual instruction tuning towards large language and vision models with GPT-4 level capabilities. llava.hliu.cc. "Visual Instruction Tuning". (arXiv 2023).

        • Qwen2-VL : Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. "Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution". (arXiv 2024).

        • NVILA : VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops). "NVILA: Efficient Frontier Visual Language Models". (arXiv 2024).

        • Visual ChatGPT : Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. "Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models". (arXiv 2023).

        • CLIP : CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image. "Learning Transferable Visual Models From Natural Language Supervision". (arXiv 2021).

        • GLIP : "Grounded Language-Image Pre-training". (CVPR 2022).

        • GLIPv2 : "GLIPv2: Unifying Localization and Vision-Language Understanding". (arXiv 2022).

        • InternImage : "InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions". (CVPR 2023).

        • DINO : "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection". (ICLR 2023).

        • GroundingDINO : "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection". (ECCV 2024).

        • DINOv2 : "DINOv2: Learning Robust Visual Features without Supervision". (arXiv 2023).

        • YOLO-World : "YOLO-World: Real-Time Open-Vocabulary Object Detection". (CVPR 2024). www.yoloworld.cc

        • Autodistill : Autodistill uses big, slower foundation models to train small, faster supervised models. Using autodistill, you can go from unlabeled images to inference on a custom model running at the edge with no human intervention in between. docs.autodistill.com

        • SAM : The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model. "Segment Anything". (arXiv 2023).

        • Grounded-SAM : Marrying Grounding DINO with Segment Anything & Stable Diffusion & Tag2Text & BLIP & Whisper & ChatBot - Automatically Detect, Segment and Generate Anything with Image, Text, and Audio Inputs. We plan to create a very interesting demo by combining Grounding DINO and Segment Anything, which aims to detect and segment anything with text inputs!

        • SEEM : We introduce SEEM that can Segment Everything Everywhere with Multi-modal prompts all at once. SEEM allows users to easily segment an image using prompts of different types including visual prompts (points, marks, boxes, scribbles and image segments) and language prompts (text and audio), etc. It can also work with any combinations of prompts or generalize to custom prompts! "Segment Everything Everywhere All at Once". (arXiv 2023).

        • SAM3D : "SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model". (arXiv 2023).

        • ImageBind : "ImageBind: One Embedding Space To Bind Them All". (CVPR 2023).

        • Track-Anything : Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI. "Track Anything: Segment Anything Meets Videos". (arXiv 2023).

        • qianqianwang68/omnimotion : "Tracking Everything Everywhere All at Once". (arXiv 2023).

        • M3I-Pretraining : "Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information". (arXiv 2022).

        • BEVFormer : BEVFormer: a Cutting-edge Baseline for Camera-based Detection. "BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers". (arXiv 2022).

        • Uni-Perceiver : "Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks". (CVPR 2022).

        • AnyLabeling : 🌟 AnyLabeling 🌟. Effortless data labeling with AI support from YOLO and Segment Anything!

        • X-AnyLabeling : 💫 X-AnyLabeling 💫. Effortless data labeling with AI support from Segment Anything and other awesome models!

        • Label Anything : OpenMMLab PlayGround: Semi-Automated Annotation with Label-Studio and SAM.

        • RevCol : "Reversible Column Networks". (arXiv 2023).

        • Macaw-LLM : Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration.

        • SAM-PT : SAM-PT: Extending SAM to zero-shot video segmentation with point-based tracking. "Segment Anything Meets Point Tracking". (arXiv 2023).

        • Video-LLaMA : "Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding". (arXiv 2023).

        • MobileSAM : "Faster Segment Anything: Towards Lightweight SAM for Mobile Applications". (arXiv 2023).

        • BuboGPT : "BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs". (arXiv 2023).

      • AI Generated Content (AIGC)
        • Sora : Sora is an AI model that can create realistic and imaginative scenes from text instructions.

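        • Open Sora Plan : This project aims to reproduce Sora (OpenAI's T2V model) with the help of the open-source community. It was jointly initiated by the Peking University-兔展 AIGC Joint Lab. Resources are currently limited: only the basic architecture has been built and full training is not yet possible. The project hopes to add modules and raise training resources step by step through the open-source community. The current version is still far from the goal and needs continuous improvement and rapid iteration. Pull requests are welcome! Project Page | Chinese homepage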

        • Mini Sora : The Mini Sora project aims to explore the implementation path and future development direction of Sora.

        • EMO : "EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions". (arXiv 2024).

        • Stable Diffusion : Stable Diffusion is a latent text-to-image diffusion model. Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway and builds upon our previous work "High-Resolution Image Synthesis with Latent Diffusion Models". (CVPR 2022).

        • Stable Diffusion Version 2 : This repository contains Stable Diffusion models trained from scratch and will be continuously updated with new checkpoints. "High-Resolution Image Synthesis with Latent Diffusion Models". (CVPR 2022).

        • StableStudio : StableStudio by Stability AI. 👋 Welcome to the community repository for StableStudio, the open-source version of DreamStudio.

        • AudioCraft : Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

        • InvokeAI : Invoke AI - Generative AI for Professional Creatives. Professional Creative Tools for Stable Diffusion, Custom-Trained Models, and more. invoke-ai.github.io/InvokeAI/

        • DragGAN : "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold". (SIGGRAPH 2023).

        • AudioGPT : AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head.

        • PandasAI : Pandas AI is a Python library that adds generative artificial intelligence capabilities to Pandas, the popular data analysis and manipulation tool. It is designed to be used in conjunction with Pandas, and is not a replacement for it.

        • mosaicml/diffusion : Stable Diffusion Training with MosaicML. This repo contains code used to train your own Stable Diffusion model on your own data.

        • VisorGPT : Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT. "VisorGPT: Learning Visual Prior via Generative Pre-Training". (arXiv 2023).

        • ControlNet : Let us control diffusion models! "Adding Conditional Control to Text-to-Image Diffusion Models". (arXiv 2023).

        • Fooocus : Fooocus is an image generating software. Fooocus is a rethinking of Stable Diffusion and Midjourney’s designs. WeChat official account "GitHubStore": "Fooocus: open-source AI image generation software combining the strengths of Stable Diffusion and Midjourney".

        • MindDiffuser : "MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion". (arXiv 2023).

        • World Labs : We are a spatial intelligence company building Large World Models to perceive, generate, and interact with the 3D world.

        • Genie 2 : Genie 2: A large-scale foundation world model.

        • Midjourney : Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.

        • DreamStudio : Effortless image generation for creators with big dreams.

        • Firefly : Adobe Firefly: Experiment, imagine, and make an infinite range of creations with Firefly, a family of creative generative AI models coming to Adobe products.

        • Jasper : Meet Jasper. On-brand AI content wherever you create.

        • Copy.ai : Whatever you want to ask, our chat has the answers.

        • Peppertype.ai : Leverage the AI-powered platform to ideate, create, distribute, and measure your content and prove your content marketing ROI.

        • ChatPPT : ChatPPT generates PPT slides in one click from a command-style prompt.

    • Performance Analysis and Visualization

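      • FlagPerf : FlagPerf is an open-source software platform for benchmarking AI chips. It is an integrated AI hardware evaluation engine built jointly by BAAI (智源研究院) and AI hardware vendors. It aims to establish a metric system oriented toward industry practice and to evaluate the real-world capability of AI hardware under different software stack combinations (model + framework + compiler).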

      • hahnyuan/LLM-Viewer : Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

      • harleyszhang/llm_counts : Theoretical performance analysis tools for LLMs, supporting analysis of parameters, FLOPs, memory, and latency.

    • LLM Inference Framework

    • Application Development Platform

      • LangChain : 🦜️🔗 LangChain. ⚡ Building applications with LLMs through composability ⚡ python.langchain.com

      • Dify : An Open-Source Assistants API and GPTs alternative. Dify.AI is an LLM application development platform. It integrates the concepts of Backend as a Service and LLMOps, covering the core tech stack required for building generative AI-native applications, including a built-in RAG engine. dify.ai

      • Lobe Chat : 🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge management / RAG), and Multi-Modals (Vision / TTS / Plugins / Artifacts). One-click FREE deployment of your private ChatGPT / Claude application. chat-preview.lobehub.com

      • AutoChain : AutoChain: Build lightweight, extensible, and testable LLM Agents. autochain.forethought.ai

      • Auto-GPT : Auto-GPT: An Autonomous GPT-4 Experiment. Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains together LLM "thoughts", to autonomously achieve whatever goal you set. As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI. agpt.co

      • LiteChain : Build robust LLM applications with true composability 🔗. rogeriochaves.github.io/litechain/

      • Open-Assistant : OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. open-assistant.io

    • Fine-Tuning Framework

      • LLaMA-Factory : Unify Efficient Fine-Tuning of 100+ LLMs. Fine-tuning a large language model can be easy as...
    • RAG Framework

      • LlamaIndex : LlamaIndex is a data framework for your LLM applications. docs.llamaindex.ai

      • Embedchain : The Open Source RAG framework. docs.embedchain.ai

      • QAnything : Question and Answer based on Anything. qanything.ai

      • R2R : A framework for rapid development and deployment of production-ready RAG systems. docs.sciphi.ai

      • langchain-ai/rag-from-scratch : Retrieval augmented generation (RAG) is a general methodology for connecting LLMs with external data sources. These notebooks accompany a video series that builds up an understanding of RAG from scratch, starting with the basics of indexing, retrieval, and generation.

    • Vector Database

      • Milvus : Milvus is an open-source vector database built to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible, and provides a consistent user experience regardless of the deployment environment. milvus.io

      • Qdrant : Qdrant - Vector Database for the next generation of AI applications. Also available in the cloud https://cloud.qdrant.io/. qdrant.tech

    • Memory Management

  • Awesome List

  • Paper Overview

  • Learning Resources

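    • 动手学深度学习(Dive into Deep Learning,D2L.ai) : Dive into Deep Learning: aimed at Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries. zh.d2l.ai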

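    • zjhellofss/KuiperLLama : KuiperLLama: build your own large model inference framework by hand, with support for LLama2/3 and Qwen2.5. Pitched as a good project for campus recruiting (autumn and spring hiring) and internships, it walks you through implementing, from scratch, an inference framework that supports LLama2/3 and Qwen2.5.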

    • wdndev/llm_interview_note : Notes on knowledge and interview questions for large language model (LLM) algorithm and application engineers.

    • wdndev/mllm_interview_note : Notes on multimodal (MLLM) knowledge for large language model (LLM) algorithm and application engineers.

    • wdndev/tiny-llm-zh : Implement a small-parameter Chinese large language model from scratch.

    • wdndev/tiny-rag : Implement a very, very small RAG system.

    • wdndev/llama3-from-scratch-zh : Implement llama3 from scratch, Chinese edition.

    • wdndev/llm101n-zh : Chinese edition of the LLM101n course.

    • harleyszhang/llm_note : LLM notes, including model inference, transformer model structure, and llm framework code analysis notes. Zhang

    • karpathy/build-nanogpt : Video+code lecture on building nanoGPT from scratch.

    • karpathy/LLM101n : LLM101n: Let's build a Storyteller. In this course we will build a Storyteller AI Large Language Model (LLM). Hand in hand, you'll be able to create, refine and illustrate little stories with the AI. We are going to build everything end-to-end from basics to a functioning web app similar to ChatGPT, from scratch in Python, C and CUDA, and with minimal computer science prerequisites. By the end you should have a relatively deep understanding of AI, LLMs, and deep learning more generally.

    • karpathy/nn-zero-to-hero : Neural Networks: Zero to Hero. A course on neural networks that starts all the way at the basics. The course is a series of YouTube videos where we code and train neural networks together. The Jupyter notebooks we build in the videos are then captured here inside the lectures directory. Every lecture also has a set of exercises included in the video description.

    • mlabonne/llm-course : Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. mlabonne.github.io/blog/

    • rasbt/LLMs-from-scratch : Implementing a ChatGPT-like LLM from scratch, step by step. https://www.manning.com/books/build-a-large-language-model-from-scratch

    • naklecha/llama3-from-scratch : llama3 implementation one matrix multiplication at a time.

    • DataTalksClub/llm-zoomcamp : LLM Zoomcamp - a free online course about building a Q&A system.

    • datawhalechina/llm-universe : Hands-on LLM application development. This project is an LLM application development tutorial for beginner developers. Read online: https://datawhalechina.github.io/llm-universe/

    • datawhalechina/hugging-llm : HuggingLLM, Hugging Future. 蝴蝶书 (ButterflyBook). Companion video tutorial: https://b23.tv/hdnXn1L

    • zyds/transformers-code : A hands-on, step-by-step Hugging Face Transformers course. The course videos are updated in sync on Bilibili and YouTube.

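    • DjangoPeng/openai-quickstart : A comprehensive guide to understanding and implementing large language models with hands-on examples using LangChain for GenAI applications. This project aims to provide a one-stop learning resource for everyone interested in large language models and their application in generative AI (AIGC) scenarios. By providing theoretical foundations, development basics, and practical examples, it offers comprehensive guidance on these cutting-edge topics.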

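    • InternLM/Tutorial : InternLM (书生·浦语) large model practical camp. To help large models take root in more industries and to let developers learn large model development and application more efficiently, Shanghai AI Laboratory launched this camp as a platform for learning and hands-on development. It covers the full pipeline of large model fine-tuning, deployment, and evaluation in two weeks.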

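    • DLLXW/baby-llama2-chinese : A repository for pre-training from scratch and then SFT-ing a small-parameter Chinese LLaMa2. A single 24 GB GPU is enough to obtain a chat-llama2 with basic Chinese question-answering ability.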

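    • charent/ChatLM-mini-Chinese : A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). It open-sources the code for the whole pipeline, including dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. It supports SFT fine-tuning for downstream tasks and provides a fine-tuning example for triple information extraction.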

    • charent/Phi2-mini-Chinese : Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

    • jiahe7ay/MINI_LLM : This is a repository used by individuals to experiment and reproduce the pre-training process of LLM.

    • SmartFlowAI/Hand-on-RAG : Hand on RAG. As the name suggests: a hand-built RAG.

    • liguodongiot/llm-action : This project shares the technical principles behind large models along with practical experience.

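    • km1994/LLMsNineStoryDemonTower : [LLMs Nine-Story Demon Tower] Shares hands-on practice and experience with LLMs in natural language processing (ChatGLM, Chinese-LLaMA-Alpaca, Vicuna, LLaMA, GPT4ALL, etc.), information retrieval (langchain), language synthesis, language recognition, and multimodal areas (Stable Diffusion, MiniGPT-4, VisualGLM-6B, Ziya-Visual, etc.).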

    • RahulSChand/llama2.c-for-dummies : Step by step explanation/tutorial of llama2.c

    • liteli1987gmail/python_langchain_cn : The langchain Chinese site, providing the Chinese version of langchain's Python documentation. python.langchain.com.cn

    • langchain-ai/rag-from-scratch : Retrieval augmented generation (RAG) is a general methodology for connecting LLMs with external data sources. These notebooks accompany a video series that builds up an understanding of RAG from scratch, starting with the basics of indexing, retrieval, and generation.

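    • phodal/aigc : "Building LLM Applications: Application Development and Architecture Design" (《构筑大语言模型应用:应用开发与架构设计》), an open-source e-book on real-world LLM applications. It introduces the fundamentals and applications of large language models and how to build your own model. Topics include writing, developing, and managing prompts, exploring what the best large language models can offer, and patterns and architecture design for LLM application development.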

    • cystanford/aigc_LLM_engineering : aigc_LLM_engineering.

  • Community

    • Hugging Face : The AI community building the future. The platform where the machine learning community collaborates on models, datasets, and applications.

    • ModelScope | 魔塔社区 : ModelScope is built upon the notion of “Model-as-a-Service” (MaaS). It seeks to bring together the most advanced machine learning models from the AI community and streamline the process of leveraging AI models in real-world applications. The ModelScope library lets developers perform inference, training, and evaluation through a rich API design, giving a unified experience with state-of-the-art models across different AI domains. www.modelscope.cn/

    • The official LangChain blog : LangChain.

Prompts (Magic Spells)

Open API

Applications

  • IDE (Integrated Development Environment)

    • Cursor : An editor made for programming with AI 🤖. Long term, our plan is to build Cursor into the world's most productive development environment. cursor.so

  • Chatbot

  • Role Play

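    • KMnO4-zx/xlab-huanhuan : Chat-甄嬛 (Chat-Zhen Huan) is a chat language model that imitates the speaking style of Zhen Huan. It is obtained by LoRA or full-parameter fine-tuning of InternLM2 on all of Zhen Huan's lines from the script of the TV drama 《甄嬛传》 (Empresses in the Palace).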

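    • JimmyMa99/Roleplay-with-XiYou : Roleplay-with-XiYou: Journey to the West role play. A multi-LLM role-play chat room fine-tuned from InternLM2 on data built from the original text of 《西游记》 (Journey to the West), its vernacular Chinese rendering, and ChatGPT-generated data. The project covers everything about role-play LLMs: data acquisition and processing, fine-tuning with XTuner and deploying to OpenXLab, deploying with LMDeploy, and connecting to a simple chat room through an OpenAI-style API, where LLMs playing different characters can be watched talking to and sparring with each other.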

  • Autonomous Driving Field

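    • DriveVLM : "DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models". (CoRL 2024). WeChat official account 「清华大学交叉信息研究院」 (Tsinghua IIIS): "DriveVLM: Tsinghua MARS Lab jointly launches the first autonomous-driving vision-language large model deployed on a vehicle".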

    • UniAD : "Planning-oriented Autonomous Driving". (CVPR 2023).

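    • TransGPT|致远 : TransGPT is China's first open-source large model for transportation, aimed at delivering practical value in the real transportation industry. It supports traffic condition prediction, intelligent advisory assistants, public transit services, transportation planning and design, traffic safety education, management assistance, traffic accident reporting and analysis, and assistance for autonomous driving systems. As a general-knowledge transportation model, TransGPT can provide background knowledge for road engineering, bridge engineering, tunnel engineering, highway transport, waterway transport, urban public transit, transportation economics, transportation safety, and related fields, and on that basis can be applied to specific transportation scenarios.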

    • LLMLight : "LLMLight: Large Language Models as Traffic Signal Control Agents". (arXiv 2024).

  • Robotics and Embodied AI

    • LeRobot : 🤗 LeRobot: State-of-the-art Machine Learning for Real-World Robotics in PyTorch.

    • BestAnHongjun/InternDog : InternDog: an offline embodied-AI guide dog based on the InternLM2 large model.

  • Code Assistant

    • GPT Pilot : The first real AI developer. GPT Pilot doesn't just generate code, it builds apps! GPT Pilot is the core technology for the Pythagora VS Code extension that aims to provide the first real AI developer companion. Not just an autocomplete or a helper for PR messages but rather a real AI developer that can write full features, debug them, talk to you about issues, ask for review, etc.

    • StarCoder : 💫 StarCoder is a language model (LM) trained on source code and natural language text. Its training data incorporates more than 80 different programming languages as well as text extracted from GitHub issues and commits and from notebooks.

    • CodeGeeX2 : CodeGeeX2: A More Powerful Multilingual Code Generation Model. codegeex.cn

    • Code Llama : Inference code for CodeLlama models.

  • Translator

  • Local Knowledge Base

    • privateGPT : Ask questions to your documents without an internet connection, using the power of LLMs. 100% private, no data leaves your execution environment at any point. You can ingest documents and ask questions without an internet connection! Built with LangChain, GPT4All, LlamaCpp, Chroma and SentenceTransformers.

    • Langchain-Chatchat : Langchain-Chatchat (formerly langchain-ChatGLM), a local-knowledge-based question-answering app built with langchain and LLMs such as ChatGLM.

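    • yanqiangmiffy/Chinese-LangChain : Chinese-LangChain: a Chinese langchain project that implements local knowledge-base retrieval and intelligent answer generation based on ChatGLM-6b + langchain. Also known as: 小必应 (Little Bing), Q.Talk, 强聊, QiangTalk.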

    • labring/FastGPT : FastGPT is a knowledge-based question answering system built on the LLM. It offers out-of-the-box data processing and model invocation capabilities. Moreover, it allows for workflow orchestration through Flow visualization, thereby enabling complex question and answer scenarios! fastgpt.run

  • Long-Term Memory

  • Question Answering System

    • THUDM/WebGLM : WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023). "WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences". (arXiv 2023).

    • afaqueumer/DocQA : Question Answering with Custom Files using LLM. DocQA 🤖 is a web application built using Streamlit 🔥 and the LangChain 🦜🔗 framework, allowing users to leverage the power of LLMs for Generative Question Answering. 🌟

    • rese1f/MovieChat : 🔥 chat with over 10K frames of video! MovieChat can handle videos with >10K frames on a 24GB graphics card. MovieChat has a 10000× advantage over other methods in terms of the average increase in GPU memory cost per frame (21.3KB/f to ~200MB/f).

  • Academic Field

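    • binary-husky/gpt_academic : Provides a graphical interface for ChatGPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm2. Compatible with ERNIE Bot (文心一言), moss, llama2, rwkv, claude2, Tongyi Qianwen (通义千问), InternLM (书生), iFlytek Spark (讯飞星火), and others.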

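    • kaixindelele/ChatPaper : Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for paper summarization, polishing, reviewing, and replying to reviews. 💥💥💥 The free web version of ChatPaper, serving researchers worldwide, is now live: https://chatpaper.org/ 💥💥💥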

    • GPTZero : The World's #1 AI Detector with over 1 Million Users. Detect ChatGPT, GPT3, GPT4, Bard, and other AI models.

    • BurhanUlTayyab/GPTZero : An open-source implementation of GPTZero. GPTZero is an AI model with some mathematical formulation to determine if a particular text fed to it is written by AI or a human being.

    • BurhanUlTayyab/DetectGPT : An open-source Pytorch implementation of DetectGPT. DetectGPT is an amazing method to determine whether a piece of text is written by large language models (like ChatGPT, GPT3, GPT2, BLOOM etc). However, we couldn't find any open-source implementation of it. Therefore this is the implementation of the paper. "DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature". (arXiv 2023).

    • WangRongsheng/ChatGenTitle : 🌟 ChatGenTitle: a paper-title generation model fine-tuned from LLaMA on metadata from a million arXiv papers.

    • nishiwen1214/ChatReviewer : ChatReviewer: use ChatGPT to review papers; ChatResponse: use ChatGPT to respond to reviewers. 💥💥💥 The first web version of ChatReviewer is out! Try it here: https://huggingface.co/spaces/ShiwenNi/ChatReviewer

    • Shiling42/web-simulator-by-GPT4 : Online Interactive Physical Simulation Generated by GPT-4. shilingliang.com/web-simulator-by-GPT4/

  • Medical Field

    • BenTsao (本草) [formerly HuaTuo (华驼)] : Repo for BenTsao [original name: HuaTuo (华驼)], Llama-7B tuned with Chinese medical knowledge. The project open-sources a LLaMA-7B model instruction-tuned on Chinese medical instructions. The authors built a Chinese medical instruction dataset from a medical knowledge graph and the GPT-3.5 API, then instruction-tuned LLaMA on it, improving its question-answering performance in the medical domain. "HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge". (arXiv 2023).

    • MedSAM : "Segment Anything in Medical Images". (arXiv 2023). WeChat official account 「江大白」: "MedSAM in the medical field: putting image segmentation into practice (with paper and source code)".

    • LLaVA-Med : "LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day". (arXiv 2023). WeChat official account 「CVHub」: "Microsoft releases LLaVA-Med, a medical multimodal large model | Medical instruction tuning based on LLaVA".

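    • MedicalGPT : MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large models, covering continued pre-training, supervised fine-tuning, reward modeling, and reinforcement learning. WeChat official account 「KBQA沉思录」: "[Chinese medical large model] A source-code walkthrough of the full training pipeline".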

    • MedQA-ChatGLM : 🛰️ Fine-tunes ChatGLM on real medical dialogue data using LoRA, P-Tuning V2, Freeze, RLHF, and other methods; the project's ambitions go beyond medical question answering. www.wangrs.co/MedQA-ChatGLM/. "MedQA-ChatGLM: A Medical QA Model Fine-tuned on ChatGLM Using Multiple fine-tuning Method and Real Medical QA Data".

    • xhu248/AutoSAM : "How to Efficiently Adapt Large Segmentation Model(SAM) to Medical Images". (arXiv 2023).

    • DoctorGPT : DoctorGPT is an LLM that can pass the US Medical Licensing Exam. It works offline, it's cross-platform, & your health data stays private.

    • Zhongjing (仲景) : The first Chinese medical large model trained through the full pipeline from pre-training to RLHF. "Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue". (arXiv 2023).

  • Mental Health Field

    • MeChat : A Chinese mental-health support dialogue dataset (SmileChat) and large model (MeChat). "SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support". (arXiv 2023).

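    • SmartFlowAI/EmoLLM : EmoLLM is a series of mental-health large models, instruction-tuned from LLMs, that support the counseling chain of understanding the user, supporting the user, and helping the user. Keywords: mental-health large model, LLM, The Big Model of Mental Health, Finetune, InternLM2, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral.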

  • Legal Field

    • ChatLaw : ChatLaw, a legal large model. chatlaw.cloud/lawchat/

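    • LaWGPT : 🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese Legal knowledge. LaWGPT is a series of open-source large language models based on Chinese legal knowledge. Building on general Chinese base models (such as Chinese-LLaMA and ChatGLM), the series extends the vocabulary with legal-domain terms and pre-trains on a large-scale Chinese legal corpus, strengthening the models' basic semantic understanding of the legal domain. On top of that, it constructs legal-domain dialogue QA datasets and a Chinese judicial examination dataset for instruction tuning, improving the models' ability to understand and act on legal content.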

  • Financial Field

  • Math Field

  • Music Field

  • Speech and Audio Field

  • Humor Generation

  • Animation Field

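    • SaaRaaS-1300/InternLM2_horowag : 🍿InternLM2_Horowag🍿 🍏 A repo prepared specifically for the 2024 InternLM (书生·浦语) Large Model Challenge (spring season) 🍎 collecting fine-tuned models related to Holo (赫萝).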

  • Food Field

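    • SmartFlowAI/TheGodOfCookery : The God Of Cookery (食神). The project takes its name from the famous film 《食神》 (The God of Cookery) starring comedy master Stephen Chow. It aims to use AI to provide cooking advice and recipe recommendations, helping users learn and practice cooking skills, lowering the barrier to cooking, and realizing the film's message that "with dedication, anyone can become a God of Cookery".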

  • Tool Learning

  • Adversarial Attack Field

  • Multi-Agent Collaboration

    • MetaGPT : "MetaGPT: Meta Programming for Multi-Agent Collaborative Framework". (arXiv 2023).

  • AI Avatar and Digital Human

    • RealChar : 🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation with AI everywhere (mobile, web and terminal) using LLM OpenAI GPT3.5/4, Anthropic Claude2, Chroma Vector DB, Whisper Speech2Text, ElevenLabs Text2Speech🎙️🤖 RealChar.ai/

    • FaceChain : FaceChain is a deep-learning toolchain for generating your Digital-Twin. With a minimum of 1 portrait-photo, you can create a Digital-Twin of your own and start generating personal portraits in different settings (multiple styles now supported!). You may train your Digital-Twin model and generate photos via FaceChain's Python scripts, or via the familiar Gradio interface.

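    • VirtualWife : VirtualWife is a virtual streamer project that currently supports live streaming on Bilibili. Users can freely swap the VRM character model, and can treat the project as a starter demo for a virtual streamer and extend it with features of their own.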

    • GPT-vup : GPT-vup: Live2D digital-human live streaming. GPT-vup Bilibili | Douyin | AI | virtual streamer.

    • ChatVRM : ChatVRM is a demo application that lets you easily chat with a 3D character in the browser.

    • SillyTavern : LLM Frontend for Power Users. sillytavern.app

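    • HeyGen : Scale your video production with customizable AI avatars. WeChat official account 「DataLearner」: "The digital life project from 'The Wandering Earth 2' may soon become reality! HeyGen is about to release next-generation AI real-person video generation technology, so realistic it is almost impossible to tell apart!".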

    • VideoChat : Real-time voice-interactive digital human, supporting an end-to-end voice solution (GLM-4-Voice - THG) and a cascaded solution (ASR-LLM-TTS-THG). Appearance and voice are customizable without training, voice cloning is supported, and first-packet latency is as low as 3s.

  • GUI (Graphical User Interface)

    • Lobe Chat : 🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge management / RAG ), Multi-Modals (Vision/TTS/Plugins/Artifacts). One-click FREE deployment of your private ChatGPT/ Claude application. chat-preview.lobehub.com

    • ChatGPT-Next-Web : A well-designed cross-platform ChatGPT UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT app with one click.

    • ChatGPT-Admin-Web : A ChatGPT WebUI with user management and an admin backend. caw.sku.moe

    • lencx/ChatGPT : 🔮 ChatGPT Desktop Application (Mac, Windows and Linux). NoFWL.

    • Synaptrix/ChatGPT-Desktop : Fuel your productivity with ChatGPT-Desktop - Blazingly fast and supercharged!

    • Poordeveloper/chatgpt-app : A ChatGPT App for all platforms. Built with Rust + Tauri + Vue + Axum.

    • sonnylazuardi/chat-ai-desktop : Chat AI Desktop App. Unofficial ChatGPT desktop app for Mac & Windows menubar using Tauri & Rust.

    • 202252197/ChatGPT_JCM : OpenAI Manage Web. An OpenAI management interface that brings together all of OpenAI's APIs for operation through a UI.

    • m1guelpf/browser-agent : A browser AI agent, using GPT-4. docs.rs/browser-agent

    • sigoden/aichat : Using ChatGPT/GPT-3.5/GPT-4 in the terminal.

    • wieslawsoltes/ChatGPT : A ChatGPT C# client for graphical user interface runs on MacOS, Windows, Linux, Android, iOS and Browser. Powered by Avalonia UI framework. wieslawsoltes.github.io/ChatGPT/

    • GaiZhenbiao/ChuanhuChatGPT : GUI for ChatGPT API and any LLM. 川虎 Chat 🐯 Chuanhu Chat. Provides a light, fast, and easy-to-use web GUI for multiple LLMs including ChatGPT/ChatGLM/LLaMA/StableLM/MOSS.

    • amrrs/chatgpt-clone : Build Yo'own ChatGPT with OpenAI API & Gradio.

    • llama2-webui : Run Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supporting Llama-2-7B/13B/70B with 8-bit, 4-bit. Supporting GPU inference (6 GB VRAM) and CPU inference.

    • ricklamers/gpt-code-ui : An open source implementation of OpenAI's ChatGPT Code interpreter.

    • mckaywrigley/chatbot-ui : An open source ChatGPT UI. chatbotui.com

    • chieapp/chie : An extensive desktop app for ChatGPT and other LLMs. chie.app

    • LangUI : UI for your AI. Open Source Tailwind components tailored for your GPT, generative AI, and LLM projects.

    • AUTOMATIC1111/stable-diffusion-webui : Stable Diffusion web UI. A browser interface based on Gradio library for Stable Diffusion.

    • Mikubill/sd-webui-controlnet : ControlNet for Stable Diffusion WebUI. The WebUI extension for ControlNet and other injection-based SD controls.

    • oobabooga/text-generation-webui : Text generation web UI. A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.

    • SolidUI : AI-generated visualization prototyping and editing platform.

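    • AIdea : AIdea is an all-in-one app that supports GPT and Chinese large language models such as Tongyi Qianwen (通义千问) and ERNIE Bot (文心一言), along with Stable Diffusion text-to-image and image-to-image, SDXL 1.0, super-resolution, and image colorization.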

    • Chainlit : Build Python LLM apps in minutes ⚡️ Chainlit lets you create ChatGPT-like UIs on top of any Python code in minutes! docs.chainlit.io

Datasets

Blogs

Videos

Interview

Star History
