
deeppowers
An open-source, high-performance inference acceleration engine supporting large language models such as DeepSeek, GPT, Gemini, and Claude.
Stars: 183

DeepPowers is an open-source inference acceleration engine for large language models. It pairs a C++ core with Python bindings and provides a high-performance tokenizer (WordPiece and BPE), dynamic batching, quantization (INT8, INT4, and mixed precision), REST and gRPC service interfaces, and a hardware abstraction layer covering CUDA and ROCm backends.
README:
At its core, DeepPowers provides a high-performance tokenizer implementation with memory optimization and parallel processing. It delivers efficient text tokenization for large language models through WordPiece and BPE algorithms, memory pooling, and batch processing. Key features:
- Hardware Abstraction Layer (HAL)
- CUDA device management
- Basic tensor operations
- Kernel management system
- Request queue management
- Batch processing system
- Priority scheduling
- Error handling mechanism
- INT8 quantization support
- INT4 quantization support
- Mixed-precision quantization
- Calibration data management
- C++ API infrastructure
- Python bindings
- REST API infrastructure (a hypothetical client sketch follows this list)
- gRPC service infrastructure
- Authentication middleware
- Rate limiting middleware
- Logging middleware
- Monitoring middleware
- Error handling middleware
- Model operator fusion
- Weight pruning techniques
- KV-cache optimization
- Automatic optimization selection
- Performance profiling and benchmarking
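This README does not document the REST endpoints or request schema, so the following client sketch is purely hypothetical: it assumes a local /generate endpoint and bearer-token authentication, which is plausible given the authentication and rate-limiting middleware listed above but not confirmed by this README.

import requests  # third-party HTTP client, used here for illustration only

# Hypothetical request; endpoint path, port, and payload schema are assumptions.
response = requests.post(
    "http://localhost:8000/generate",             # hypothetical endpoint
    headers={"Authorization": "Bearer <token>"},  # auth middleware is listed as a feature
    json={"prompt": "Hello, how are you?", "max_length": 50},
    timeout=30,
)
print(response.json())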
The architecture follows a pipeline-based design with several key components:
Request Flow (a minimal illustrative sketch follows this section)
- User requests enter the system through a unified interface
- Requests are queued and prioritized in the Request Queue
- The batching system groups compatible requests for optimal processing
- The Execution Engine processes batches and generates results
- Output is post-processed and returned to users
Control Flow
- The Configuration Manager oversees system settings and runtime parameters
- The Graph Compiler optimizes computation graphs for execution
- The Hardware Abstraction Layer provides unified access to different hardware backends
Optimization Points
- Dynamic batching for throughput optimization
- Graph compilation for computation optimization
- Hardware-specific optimizations through HAL
- Configuration-based performance tuning
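As a mental model only (not code from this repository), the request flow above can be sketched in a few lines of Python; every name here is illustrative rather than the actual deeppowers internals.

import queue

# Illustrative sketch of the request flow; all names are hypothetical.
request_queue = queue.PriorityQueue()

def submit(priority, prompt):
    # requests enter through a unified interface and are queued by priority
    request_queue.put((priority, prompt))

def next_batch(max_batch_size=8):
    # the batching system groups compatible requests for one engine pass
    batch = []
    while not request_queue.empty() and len(batch) < max_batch_size:
        _, prompt = request_queue.get()
        batch.append(prompt)
    return batch

def postprocess(text):
    # stand-in for the output post-processing stage
    return text.strip()

def run(engine):
    # the execution engine processes the batch; results are post-processed and returned
    return [postprocess(out) for out in engine(next_batch())]

# Example with a dummy engine:
submit(0, "Hello!")
submit(1, "How are you?")
print(run(lambda prompts: [p.upper() for p in prompts]))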
The repository is organized as follows:
deeppowers/
├── src/
│   ├── core/                  # Core implementation
│   │   ├── hal/               # Hardware Abstraction Layer for device management
│   │   ├── request_queue/     # Request queue and management system
│   │   ├── batching/          # Batch processing and optimization
│   │   ├── execution/         # Execution engine and runtime
│   │   ├── distributed/       # Distributed computing support
│   │   ├── scheduling/        # Task scheduling and resource management
│   │   ├── monitoring/        # System monitoring and metrics
│   │   ├── config/            # Configuration management
│   │   ├── preprocessing/     # Input preprocessing pipeline
│   │   ├── postprocessing/    # Output postprocessing pipeline
│   │   ├── graph/             # Computation graph management
│   │   ├── api/               # Internal API implementations
│   │   ├── model/             # Base model architecture
│   │   ├── memory/            # Memory management system
│   │   ├── inference/         # Inference engine core
│   │   ├── models/            # Specific model implementations
│   │   ├── tokenizer/         # Tokenization implementations
│   │   └── utils/             # Utility components
│   ├── api/                   # External API implementations
│   └── common/                # Common utilities
├── tests/                     # Test suite
├── scripts/                   # Utility scripts
├── examples/                  # Example usage
├── docs/                      # Documentation
└── README.md                  # Project overview
The core module is organized into specialized components:
- HAL (Hardware Abstraction Layer): Manages hardware devices and provides a unified interface for different backends
- Request Queue: Handles incoming requests with priority management and load balancing
- Batching: Implements dynamic batching strategies for optimal throughput
- Execution: Core execution engine for model inference
- Distributed: Supports distributed computing and model parallelism
- Scheduling: Manages task scheduling and resource allocation
- Monitoring: System metrics collection and performance monitoring
- Config: Configuration management and validation
- Memory: Advanced memory management and optimization
- Preprocessing: Input data preparation and normalization
- Postprocessing: Output processing and formatting
- Graph: Computation graph optimization and management
- Inference: Core inference engine implementation
- Model: Base model architecture and interfaces
- Models: Specific model implementations (GPT, BERT, etc.)
- Tokenizer: Text tokenization algorithms and utilities
- API: Internal API implementations for core functionality
- Utils: Common utilities and helper functions
Requirements:
- C++17-compatible compiler
- CMake 3.15+
- Python 3.8+ (for Python bindings)
- ICU library for Unicode support
# Install system dependencies (Ubuntu)
sudo apt-get install build-essential cmake libicu-dev

# Clone the repository
git clone https://github.com/deeppowers/deeppowers.git
cd deeppowers

# Install Python dependencies
pip install -r requirements.txt

# Build from source
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc)

# Install the Python package (optional)
cd ../src/api/python
pip install -e .
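After building, a quick import check confirms the Python bindings are on the path (a minimal sanity check; it assumes only that the optional package was installed with pip install -e .):

import deeppowers as dp
print(dp.__name__)  # prints "deeppowers" if the bindings imported correctly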
import deeppowers as dp
# Method 1: Using Pipeline (Recommended)
# Initialize pipeline with pre-trained model
pipeline = dp.Pipeline.from_pretrained("deepseek-v3")
# Generate text
response = pipeline.generate(
    "Hello, how are you?",
    max_length=50,
    temperature=0.7,
    top_p=0.9
)
print(response)
# Batch processing
responses = pipeline.generate(
    ["Hello!", "How are you?"],
    max_length=50,
    temperature=0.7
)
# Save and load pipeline
pipeline.save("my_pipeline")
loaded_pipeline = dp.Pipeline.load("my_pipeline")
# Method 2: Using Tokenizer and Model separately
# Initialize tokenizer
tokenizer = dp.Tokenizer(model_name="deepseek-v3") # or use custom vocab
tokenizer.load("path/to/tokenizer.model")
# Initialize model
model = dp.Model.from_pretrained("deepseek-v3")
# Create pipeline manually
pipeline = dp.Pipeline(model=model, tokenizer=tokenizer)
# Initialize tokenizer with specific type
tokenizer = dp.Tokenizer(tokenizer_type=dp.TokenizerType.WORDPIECE)
# Train on custom data
texts = ["your", "training", "texts"]
tokenizer.train(texts, vocab_size=30000, min_frequency=2)
# Save and load
tokenizer.save("tokenizer.model")
tokenizer.load("tokenizer.model")
# Basic tokenization
tokens = tokenizer.encode("Hello, world!")
text = tokenizer.decode(tokens)
# Batch processing with parallel execution
texts = ["multiple", "texts", "for", "processing"]
tokens_batch = tokenizer.encode_batch(
    texts,
    add_special_tokens=True,
    padding=True,
    max_length=128
)
# Configure generation parameters
response = pipeline.generate(
    "Write a story about",
    max_length=200,           # Maximum length of generated text
    min_length=50,            # Minimum length of generated text
    temperature=0.7,          # Controls randomness (higher = more random)
    top_k=50,                 # Limits vocabulary to top k tokens
    top_p=0.9,                # Nucleus sampling threshold
    num_return_sequences=3,   # Number of different sequences to generate
    repetition_penalty=1.2    # Penalize repeated tokens
)
# Batch generation with multiple prompts
prompts = [
    "Write a story about",
    "Explain quantum physics",
    "Give me a recipe for"
]
responses = pipeline.generate(
    prompts,
    max_length=100,
    temperature=0.8
)
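The roadmap below lists streaming generation with callback support as implemented, but this README does not document that API, so the sketch here is an assumption: a hypothetical stream flag and callback parameter on generate().

# Hypothetical streaming usage; parameter names are assumptions, not documented here.
def on_token(token):
    print(token, end="", flush=True)  # handle each token as it arrives

pipeline.generate(
    "Write a story about",
    max_length=100,
    stream=True,        # hypothetical flag
    callback=on_token,  # hypothetical callback parameter
)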
# Load model
model = dp.load_model("deepseek-v3", device="cuda", dtype="float16")
# Apply automatic optimization
results = dp.optimize_model(model, optimization_type="auto", level="o2", enable_profiling=True)
print(f"Achieved speedup: {results['speedup']}x")
print(f"Memory reduction: {results['memory_reduction']}%")
# Apply specific optimization techniques
results = dp.optimize_model(model, optimization_type="fusion")
results = dp.optimize_model(model, optimization_type="pruning")
results = dp.optimize_model(model, optimization_type="caching")
# Quantize model to INT8 precision
results = dp.quantize_model(model, precision="int8")
print(f"INT8 quantization speedup: {results['speedup']}x")
print(f"Accuracy loss: {results['accuracy_loss']}%")
# Run benchmarks
benchmark_results = dp.benchmark_model(
    model,
    input_text="This is a test input for benchmarking.",
    num_runs=10,
    warmup_runs=3
)
print(f"Average latency: {benchmark_results['avg_latency_ms']} ms")
print(f"Throughput: {benchmark_results['throughput_tokens_per_sec']} tokens/sec")
# Configure memory pool
tokenizer.set_memory_pool_size(4096) # 4KB blocks
tokenizer.enable_string_pooling(True)
# Monitor memory usage
stats = tokenizer.get_memory_stats()
print(f"Memory pool usage: {stats['pool_usage']}MB")
print(f"String pool size: {stats['string_pool_size']}")
# Configure thread pool
tokenizer.set_num_threads(8)
tokenizer.set_batch_size(64)
# Process large datasets
with open("large_file.txt", "r") as f:
    texts = f.readlines()
tokens = tokenizer.encode_batch_parallel(texts)
DeepPowers includes several performance optimization features:
- Memory pooling and caching
- Dynamic batching
- Parallel processing
- Mixed-precision computation
- Distributed inference
- Model quantization (INT8, INT4, mixed precision)
- Operator fusion
- KV-cache optimization
- ✅ Hardware Abstraction Layer (HAL) - Basic CUDA and ROCm support
- ✅ Tokenizer Implementation - WordPiece and BPE algorithms
- ✅ Memory Management - Basic memory pooling system
- ✅ Request Queue Management - Basic request handling
- ✅ Configuration System - Basic config management
- ✅ Python Bindings - Basic API interface
- ✅ Monitoring System - Basic metrics collection
- ✅ Model Execution Framework - Core implementation
- ✅ Inference Pipeline - Basic pipeline structure
- ✅ Dynamic Batch Processing - Initial implementation
- ✅ Model Loading System - Support for ONNX, PyTorch, and TensorFlow formats
- ✅ Inference Engine - Complete implementation with text generation capabilities
- ✅ Inference Optimization - Operator fusion, weight pruning, and caching optimizations
- ✅ Quantization System - INT8, INT4, and mixed precision support
- ✅ Benchmarking Tools - Performance measurement and optimization metrics
- ✅ Streaming Generation - Real-time text generation with callback support
- ✅ Advanced Batching Strategies - Batch processing with parallel inference
- 🔄 Computation Graph System - Advanced graph optimizations
- 🔄 Distributed Computing Support - Multi-node inference
- 🔄 Auto-tuning System - Automatic performance optimization
- 🔄 Dynamic Shape Support - Flexible tensor dimensions handling
- 📋 Advanced Model Support
- Advanced LLM implementations (GPT, Gemini, Claude)
- More sophisticated model architecture support
- Model compression and distillation
- 📋 Performance Optimization
- Advanced memory management
- Kernel fusion optimizations
- Custom CUDA kernels for critical operations
- 📋 Advanced Features
- Multi-GPU parallelism
- Distributed inference across nodes
- Advanced caching system with prefetching
- Speculative decoding
- Custom operator implementation
The project includes comprehensive benchmarking tools:
# Run performance benchmark
python examples/optimize_and_benchmark.py --model your-model --benchmark
# Run with optimization
python examples/optimize_and_benchmark.py --model your-model --optimization auto --level o2 --benchmark
# Apply quantization
python examples/optimize_and_benchmark.py --model your-model --quantize --quantize-precision int8 --benchmark
# Generate text with optimized model
python examples/optimize_and_benchmark.py --model your-model --optimization auto --generate --prompt "Your prompt here"
We welcome contributions! Please see our Contributing Guidelines for details.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Special thanks to all contributors and the open-source community.
- GitHub Issues: open an issue for questions and bug reports