Awesome Mobile LLMs
Awesome Mobile LLMs is a curated list of Large Language Models (LLMs) and related studies focused on mobile and embedded hardware. The repository includes information on various LLM models, deployment frameworks, benchmarking efforts, applications, multimodal LLMs, surveys on efficient LLMs, training LLMs on device, mobile-related use-cases, industry announcements, and related repositories. It aims to be a valuable resource for researchers, engineers, and practitioners interested in mobile LLMs.
README:
A curated list of LLMs and related studies targeted at mobile and embedded hardware
Last update: 6th September 2024
If your publication/work is not included - and you think it should - please open an issue or reach out directly to @stevelaskaridis.
Let's try to make this list as useful as possible to researchers, engineers and practitioners all around the world.
- Mobile-First LLMs
- Infrastructure / Deployment of LLMs on Device
- Benchmarking LLMs on Device
- Applications
- Multimodal LLMs
- Surveys on Efficient LLMs
- Training LLMs on Device
- Mobile-Related Use-Cases
- Industry Announcements
- Related Awesome Repositories
Mobile-First LLMs
The following table shows sub-3B models designed for on-device deployment, sorted by year. (A minimal loading example in Python follows the table.)
Name | Year | Sizes | Primary Group/Affiliation | Publication | Code Repository | HF Repository |
---|---|---|---|---|---|---|
Gemma 2 | 2024 | 2B, ... | Google | blog | code | huggingface |
Apple Intelligence Foundation LMs | 2024 | 3B | Apple | paper | - | - |
Fox | 2024 | 1.6B | TensorOpera | blog | - | huggingface |
Qwen2 | 2024 | 500M, 1.5B, ... | Qwen Team | paper | code | huggingface |
OpenELM | 2024 | 270M, 450M, 1.08B, 3.04B | Apple | paper | code | huggingface |
Phi-3 | 2024 | 3.8B | Microsoft | whitepaper | - | huggingface |
OLMo | 2024 | 1B, ... | AllenAI | paper | code | huggingface |
Mobile LLMs | 2024 | 125M, 250M | Meta | paper | code | - |
Gemma | 2024 | 2B, ... | Google | website | code, gemma.cpp | huggingface |
MobiLlama | 2024 | 0.5B, 1B | MBZUAI | paper | code | huggingface |
Stable LM 2 (Zephyr) | 2024 | 1.6B | Stability AI | paper | - | huggingface |
TinyLlama | 2024 | 1.1B | Singapore University of Technology and Design | paper | code | huggingface |
Gemini-Nano | 2024 | 1.8B, 3.25B | Google | paper | - | - |
Stable LM (Zephyr) | 2023 | 3B | Stability AI | blog | code | huggingface |
OpenLM | 2023 | 11M, 25M, 87M, 160M, 411M, 830M, 1B, 3B, ... | OpenLM team | - | code | huggingface |
Phi-2 | 2023 | 2.7B | Microsoft | website | - | huggingface |
Phi-1.5 | 2023 | 1.3B | Microsoft | paper | - | huggingface |
Phi-1 | 2023 | 1.3B | Microsoft | paper | - | huggingface |
RWKV | 2023 | 169M, 430M, 1.5B, 3B, ... | EleutherAI | paper | code | huggingface |
Cerebras-GPT | 2023 | 111M, 256M, 590M, 1.3B, 2.7B ... | Cerebras | paper | code | huggingface |
LaMini-LM | 2023 | 61M, 77M, 111M, 124M, 223M, 248M, 256M, 590M, 738M, 774M, 783M, 1.3B, 1.5B, ... | MBZUAI | paper | code | huggingface |
Pythia | 2023 | 70M, 160M, 410M, 1B, 1.4B, 2.8B, ... | EleutherAI | paper | code | huggingface |
OPT | 2022 | 125M, 350M, 1.3B, 2.7B, ... | Meta | paper | code | huggingface |
Galactica | 2022 | 125M, 1.3B, ... | Meta | paper | code | huggingface |
BLOOM | 2022 | 560M, 1.1B, 1.7B, 3B, ... | BigScience | paper | code | huggingface |
XGLM | 2021 | 564M, 1.7B, 2.9B, ... | Meta | paper | code | huggingface |
GPT-Neo | 2021 | 125M, 350M, 1.3B, 2.7B | EleutherAI | - | code, gpt-neox | huggingface |
MobileBERT | 2020 | 15.1M, 25.3M | CMU, Google | paper | code | huggingface |
BART | 2019 | 140M, 400M | Meta | paper | code | huggingface |
DistilBERT | 2019 | 66M | HuggingFace | paper | code | huggingface |
T5 | 2019 | 60M, 220M, 770M, 3B, ... | Google | paper | code | huggingface |
TinyBERT | 2019 | 14.5M | Huawei | paper | code | huggingface |
Megatron-LM | 2019 | 336M, 1.3B, ... | Nvidia | paper | code | - |
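As a quick orientation, most models in the table can be pulled straight from the Hugging Face Hub. Below is a minimal sketch, assuming the transformers library is installed; the model id is an assumption (TinyLlama's chat checkpoint as a stand-in), so substitute any entry from the table.

```python
# Hedged sketch: load a sub-3B model from the table via Hugging Face
# transformers. The model id below is an assumption; swap in any entry.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed HF repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("On-device LLMs are useful because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```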
Infrastructure / Deployment of LLMs on Device
This section showcases frameworks and contributions for supporting LLM inference on mobile and edge devices.
- llama.cpp: Inference of Meta's LLaMA model (and others) in pure C/C++. Supports various platforms and builds on top of ggml (now the GGUF format). A minimal Python-binding sketch follows this sub-list.
- LLMFarm: iOS frontend for llama.cpp
- LLM.swift: iOS frontend for llama.cpp
- Sherpa: Android frontend for llama.cpp
- iAkashPaul/Portal: Wraps the example Android app with a tweaked UI, configs, and additional model support
- dusty-nv's llama.cpp: Containers for Jetson deployment of llama.cpp
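For a feel of the workflow, here is a minimal sketch using the llama-cpp-python binding (a separate project from the C/C++ core), assuming a GGUF model file has already been converted and quantized; the path is hypothetical.

```python
# Minimal llama.cpp usage sketch via the llama-cpp-python binding.
# The model path is hypothetical; any local GGUF file works.
from llama_cpp import Llama

llm = Llama(model_path="models/model-q4_k_m.gguf", n_ctx=2048)
out = llm("Q: Name one framework for on-device LLMs. A:", max_tokens=48)
print(out["choices"][0]["text"])
```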
- MLC-LLM: A machine learning compiler and high-performance deployment engine for large language models. Supports various platforms and builds on top of TVM. A hedged usage sketch follows this sub-list.
- Android App: MLC Android app
- iOS App: MLC iOS app
- dusty-nv's MLC: Containers for Jetson deployment of MLC
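A hedged sketch of MLC LLM's Python engine API (OpenAI-style chat completions); the pre-converted model id is an assumption.

```python
# Sketch of MLC LLM's engine API; the HF model id is an assumption
# (MLC consumes pre-compiled/quantized weights).
from mlc_llm import MLCEngine

model = "HF://mlc-ai/gemma-2b-it-q4f16_1-MLC"  # assumed repo id
engine = MLCEngine(model)
for chunk in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Hello from the edge!"}],
    model=model,
    stream=True,
):
    for choice in chunk.choices:
        print(choice.delta.content, end="", flush=True)
engine.terminate()
```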
- PyTorch ExecuTorch: A solution for enabling on-device inference capabilities across mobile and edge devices, including wearables, embedded devices, and microcontrollers. An export-flow sketch follows the sub-item below.
- TorchChat: Codebase showcasing the ability to run large language models (LLMs) seamlessly across iOS and Android
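The core ExecuTorch idea is ahead-of-time export: a PyTorch module is traced with torch.export, lowered to an edge dialect, and serialized as a .pte file that the on-device runtime executes. A sketch under those assumptions, with a toy module standing in for a real LLM:

```python
# Hedged sketch of the ExecuTorch export flow (toy module, not an LLM).
import torch
from executorch.exir import to_edge

class Tiny(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x)

# Trace to an ATen-dialect program, lower to edge, then serialize.
exported = torch.export.export(Tiny(), (torch.randn(1, 8),))
et_program = to_edge(exported).to_executorch()
with open("tiny.pte", "wb") as f:
    f.write(et_program.buffer)  # consumed by the on-device runtime
```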
- Google MediaPipe: A suite of libraries and tools for quickly applying artificial intelligence (AI) and machine learning (ML) techniques in your applications. Supports Android, iOS, Python, and the Web.
- Apple MLX: An array framework for machine learning research on Apple silicon, from Apple's machine learning research team. Built on lazy evaluation and a unified memory architecture; see the short example after this sub-list.
- MLX Swift: Swift API for MLX.
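To illustrate the lazy-evaluation point: MLX records operations and only computes when a result is forced, which pairs well with Apple silicon's unified memory (no host/device copies). A small example, assuming the mlx package is installed:

```python
# Lazy evaluation in MLX: ops build a graph; mx.eval() materializes it.
import mlx.core as mx

a = mx.random.normal((1024, 1024))
b = mx.random.normal((1024, 1024))
c = a @ b + 1.0   # recorded lazily; nothing computed yet
mx.eval(c)        # forces computation on the default device
print(c.shape)
```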
- Alibaba MNN: A lightweight deep learning framework that supports on-device inference and training of deep learning models.
- llama2.c: A more educational implementation of Llama 2 inference (see here for an Android port)
- tinygrad: Simple neural network framework from tinycorp and @geohot
- TinyChatEngine: Targeted at Nvidia, Apple M1 and RPi, from Song Han's (MIT) group.
- PowerInfer-2: Fast Large Language Model Inference on a Smartphone (paper, code)
- [MobiCom'24] Mobile Foundation Model as Firmware (paper, code)
- Merino: Entropy-driven Design for Generative Language Models on IoT Devices (paper)
- LLM as a System Service on Mobile Devices (paper)
- LLMCad: Fast and Scalable On-device Large Language Model Inference (paper)
- EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models (paper)
Benchmarking LLMs on Device
This section focuses on measurements and benchmarking efforts for assessing LLM performance when deployed on device. (A toy measurement sketch follows the list.)
- MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases (paper)
- [MobiCom'24] MELTing point: Mobile Evaluation of Language Transformers (paper, talk, code)
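For intuition, on-device benchmarks typically report time-to-first-token (prefill latency) and tokens per second (decode throughput). Here is a toy measurement in that spirit, reusing the llama-cpp-python binding from the previous section (model path hypothetical); this is a sketch, not any benchmark's actual harness.

```python
# Toy latency/throughput measurement; not an official benchmark harness.
import time
from llama_cpp import Llama

llm = Llama(model_path="models/model-q4_k_m.gguf", n_ctx=512)  # assumed path

start = time.perf_counter()
first_token_at = None
n_tokens = 0
for _ in llm("Benchmark prompt:", max_tokens=64, stream=True):
    if first_token_at is None:
        first_token_at = time.perf_counter()  # time-to-first-token mark
    n_tokens += 1
elapsed = time.perf_counter() - start
print(f"TTFT: {first_token_at - start:.3f}s, "
      f"overall: {n_tokens / elapsed:.1f} tok/s")
```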
Applications
- Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent (paper)
- Octopus v2: On-device language model for super agent (paper)
- Towards an On-device Agent for Text Rewriting (paper)
Multimodal LLMs
This section covers multimodal LLMs, which integrate vision or other modalities into their tasks.
- TinyLLaVA: A Framework of Small-scale Large Multimodal Models (paper, code)
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model (paper, code)
Surveys on Efficient LLMs
This section includes survey papers on LLM efficiency, a topic closely related to deployment on constrained devices. (A toy compression example follows the list.)
- On-Device Language Models: A Comprehensive Review (paper)
- A Survey of Resource-efficient LLM and Multimodal Foundation Models (paper)
- Efficient Large Language Models: A Survey (paper, code)
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems (paper)
- A Survey on Model Compression for Large Language Models (paper)
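As a concrete taste of the compression techniques these surveys cover, here is a toy post-training weight quantization (per-tensor symmetric int8). Real schemes (e.g., activation-aware or group-wise quantization) are more involved; this only shows the core round-to-scale idea.

```python
# Toy symmetric int8 post-training quantization of a weight tensor.
import torch

def quantize_int8(w: torch.Tensor):
    scale = w.abs().max() / 127.0                 # per-tensor scale
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4, 4)
q, s = quantize_int8(w)
print("max abs error:", (w - dequantize(q, s)).abs().max().item())
```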
Training LLMs on Device
This section covers papers that attempt to train or fine-tune LLMs on device, in a standalone or federated manner. (A conceptual forward-gradient sketch follows the list.)
- [MobiCom'23] Federated Few-Shot Learning for Mobile NLP (paper, code)
- FwdLLM: Efficient FedLLM using Forward Gradient (paper, code)
- [Electronics'24] Forward Learning of Large Language Models by Consumer Devices (paper)
- Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly (paper)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes (paper, code)
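Several of the works above replace backpropagation with forward gradients, which suits memory-constrained devices because no activation graph needs to be stored. A conceptual sketch of the idea (not any paper's implementation): sample a random direction v, compute the directional derivative with a single JVP, and step along v scaled by it; this is an unbiased gradient estimate for v ~ N(0, I).

```python
# Conceptual forward-gradient step using torch.func; illustrative only.
import torch
from torch.func import functional_call, jvp

model = torch.nn.Linear(8, 1)                  # toy model, not an LLM
params = dict(model.named_parameters())
x, y = torch.randn(16, 8), torch.randn(16, 1)

def loss_fn(p):
    pred = functional_call(model, p, (x,))
    return torch.nn.functional.mse_loss(pred, y)

v = {k: torch.randn_like(p) for k, p in params.items()}  # random tangent
loss, dir_deriv = jvp(loss_fn, (params,), (v,))  # <grad, v> in one pass

lr = 1e-2
with torch.no_grad():
    for k, p in model.named_parameters():
        p -= lr * dir_deriv * v[k]  # E[<g, v> v] = g for v ~ N(0, I)
```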
Mobile-Related Use-Cases
This section includes papers that are mobile-related but do not necessarily run on device.
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs (paper)
- Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception (paper, code)
- [NeurIPS'23] AndroidInTheWild: A Large-Scale Dataset For Android Device Control (paper, code)
- GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation (paper, code)
- [ACL'20] Mapping Natural Language Instructions to Mobile UI Action Sequences (paper)
Industry Announcements
- WWDC'24 - Apple Foundation Models
- PyTorch Executorch Alpha
- Google - LLMs On-Device with MediaPipe and TFLite
- Qualcomm - The future of AI is Hybrid
- ARM - Generative AI on mobile
Related Awesome Repositories
If you want to read more about related topics, here are some tangential awesome repositories to visit:
- NexaAI/Awesome-LLMs-on-device on LLMs on Device
- Hannibal046/Awesome-LLM on Large Language Models
- KennethanCeyer/awesome-llm on Large Language Models
- HuangOwen/Awesome-LLM-Compression on Large Language Model Compression
- csarron/awesome-emdl on Embedded and Mobile Deep Learning
Contributions welcome! Read the contribution guidelines first.