deeppowers

An open-source, high-performance inference accelerator engine supporting various large language models such as DEEPSEEK, GPT, GEMINI, and CLAUDE.

Stars: 183

Visit

Deeppowers is a powerful Python library for deep learning applications. It provides a wide range of tools and utilities to simplify the process of building and training deep neural networks. With Deeppowers, users can easily create complex neural network architectures, perform efficient training and optimization, and deploy models for various tasks. The library is designed to be user-friendly and flexible, making it suitable for both beginners and experienced deep learning practitioners.

README:

Deeppowers

An open-source, high-performance inference accelerator engine supporting various large language models such as DEEPSEEK, GPT, GEMINI, and CLAUDE.

Overview

DeepPowers is a high-performance tokenizer implementation with memory optimization and parallel processing capabilities. It provides efficient text tokenization for large language models with features like WordPiece and BPE algorithms, memory pooling, and batch processing.

Key Features

Core Architecture

Hardware Abstraction Layer (HAL)
CUDA device management
Basic tensor operations
Kernel management system

Request Processing

Request queue management
Batch processing system
Priority scheduling
Error handling mechanism

Quantization

INT8 quantization support
INT4 quantization support
Mixed-precision quantization
Calibration data management

API Interface

C++ API infrastructure
Python bindings
REST API infrastructure
gRPC service infrastructure

Middleware

Authentication middleware
Rate limiting middleware
Logging middleware
Monitoring middleware
Error handling middleware

Inference Optimization

Model operator fusion
Weight pruning techniques
KV-cache optimization
Automatic optimization selection
Performance profiling and benchmarking

Architecture

Technical Architecture Diagram

The architecture follows a pipeline-based design with several key components:

Request Flow
- User requests enter the system through a unified interface
- Requests are queued and prioritized in the Request Queue
- Batching system groups compatible requests for optimal processing
- Execution Engine processes batches and generates results
- Output is post-processed and returned to users
Control Flow
- Configuration Manager oversees system settings and runtime parameters
- Graph Compiler optimizes computation graphs for execution
- Hardware Abstraction Layer provides unified access to different hardware backends
Optimization Points
- Dynamic batching for throughput optimization
- Graph compilation for computation optimization
- Hardware-specific optimizations through HAL
- Configuration-based performance tuning

Directory Structure

deeppowers/
├── src/
│   ├── core/                      # Core implementation
│   │   ├── hal/                  # Hardware Abstraction Layer for device management
│   │   ├── request_queue/        # Request queue and management system
│   │   ├── batching/            # Batch processing and optimization
│   │   ├── execution/           # Execution engine and runtime
│   │   ├── distributed/         # Distributed computing support
│   │   ├── scheduling/          # Task scheduling and resource management
│   │   ├── monitoring/          # System monitoring and metrics
│   │   ├── config/             # Configuration management
│   │   ├── preprocessing/      # Input preprocessing pipeline
│   │   ├── postprocessing/     # Output postprocessing pipeline
│   │   ├── graph/              # Computation graph management
│   │   ├── api/               # Internal API implementations
│   │   ├── model/             # Base model architecture
│   │   ├── memory/            # Memory management system
│   │   ├── inference/         # Inference engine core
│   │   ├── models/            # Specific model implementations
│   │   ├── tokenizer/         # Tokenization implementations
│   │   └── utils/             # Utility components
│   ├── api/                   # External API implementations
│   └── common/                # Common utilities
├── tests/                     # Test suite
├── scripts/                   # Utility scripts
├── examples/                  # Example usage  
├── docs/                      # Documentation
└── README.md                  # Project overview

The core module is organized into specialized components:

Infrastructure Components

HAL (Hardware Abstraction Layer): Manages hardware devices and provides unified interface for different backends
Request Queue: Handles incoming requests with priority management and load balancing
Batching: Implements dynamic batching strategies for optimal throughput
Execution: Core execution engine for model inference
Distributed: Supports distributed computing and model parallelism

Resource Management

Scheduling: Manages task scheduling and resource allocation
Monitoring: System metrics collection and performance monitoring
Config: Configuration management and validation
Memory: Advanced memory management and optimization

Processing Pipeline

Preprocessing: Input data preparation and normalization
Postprocessing: Output processing and formatting
Graph: Computation graph optimization and management
Inference: Core inference engine implementation

Model Components

Model: Base model architecture and interfaces
Models: Specific model implementations (GPT, BERT, etc.)
Tokenizer: Text tokenization algorithms and utilities

Support Systems

API: Internal API implementations for core functionality
Utils: Common utilities and helper functions

Installation

Prerequisites

C++17 compiler
CMake 3.15+
Python 3.8+ (for Python bindings)
ICU library for Unicode support

# Install dependencies (Ubuntu)
sudo apt-get install build-essential cmake libicu-dev

# Clone and build
git clone https://github.com/deeppowers/deeppowers.git
cd deeppowers
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j

# Install Python package (optional)
cd ./src/api/python
pip install -e .

# Clone the repository
git clone https://github.com/deeppowers/deeppowers.git
cd deeppowers

# Install dependencies
pip install -r requirements.txt

# Build from source
mkdir build && cd build
cmake ..
make -j$(nproc)

Quick Start

Basic Usage

import deeppowers as dp

# Method 1: Using Pipeline (Recommended)
# Initialize pipeline with pre-trained model
pipeline = dp.Pipeline.from_pretrained("deepseek-v3")

# Generate text
response = pipeline.generate(
    "Hello, how are you?",
    max_length=50,
    temperature=0.7,
    top_p=0.9
)
print(response)

# Batch processing
responses = pipeline.generate(
    ["Hello!", "How are you?"],
    max_length=50,
    temperature=0.7
)

# Save and load pipeline
pipeline.save("my_pipeline")
loaded_pipeline = dp.Pipeline.load("my_pipeline")

# Method 2: Using Tokenizer and Model separately
# Initialize tokenizer
tokenizer = dp.Tokenizer(model_name="deepseek-v3")  # or use custom vocab
tokenizer.load("path/to/tokenizer.model")

# Initialize model
model = dp.Model.from_pretrained("deepseek-v3")

# Create pipeline manually
pipeline = dp.Pipeline(model=model, tokenizer=tokenizer)

Advanced Usage

Custom Tokenizer Training

# Initialize tokenizer with specific type
tokenizer = dp.Tokenizer(tokenizer_type=dp.TokenizerType.WORDPIECE)

# Train on custom data
texts = ["your", "training", "texts"]
tokenizer.train(texts, vocab_size=30000, min_frequency=2)

# Save and load
tokenizer.save("tokenizer.model")
tokenizer.load("tokenizer.model")

# Basic tokenization
tokens = tokenizer.encode("Hello, world!")
text = tokenizer.decode(tokens)

# Batch processing with parallel execution
texts = ["multiple", "texts", "for", "processing"]
tokens_batch = tokenizer.encode_batch(
    texts,
    add_special_tokens=True,
    padding=True,
    max_length=128
)

Advanced Generation Control

# Configure generation parameters
response = pipeline.generate(
    "Write a story about",
    max_length=200,          # Maximum length of generated text
    min_length=50,           # Minimum length of generated text
    temperature=0.7,         # Controls randomness (higher = more random)
    top_k=50,               # Limits vocabulary to top k tokens
    top_p=0.9,              # Nucleus sampling threshold
    num_return_sequences=3,  # Number of different sequences to generate
    repetition_penalty=1.2   # Penalize repeated tokens
)

# Batch generation with multiple prompts
prompts = [
    "Write a story about",
    "Explain quantum physics",
    "Give me a recipe for"
]
responses = pipeline.generate(
    prompts,
    max_length=100,
    temperature=0.8
)

Model Optimization

# Load model
model = dp.load_model("deepseek-v3", device="cuda", dtype="float16")

# Apply automatic optimization
results = dp.optimize_model(model, optimization_type="auto", level="o2", enable_profiling=True)
print(f"Achieved speedup: {results['speedup']}x")
print(f"Memory reduction: {results['memory_reduction']}%")

# Apply specific optimization techniques
results = dp.optimize_model(model, optimization_type="fusion")
results = dp.optimize_model(model, optimization_type="pruning")
results = dp.optimize_model(model, optimization_type="caching")

# Quantize model to INT8 precision
results = dp.quantize_model(model, precision="int8")
print(f"INT8 quantization speedup: {results['speedup']}x")
print(f"Accuracy loss: {results['accuracy_loss']}%")

# Run benchmarks
benchmark_results = dp.benchmark_model(
    model, 
    input_text="This is a test input for benchmarking.",
    num_runs=10, 
    warmup_runs=3
)
print(f"Average latency: {benchmark_results['avg_latency_ms']} ms")
print(f"Throughput: {benchmark_results['throughput_tokens_per_sec']} tokens/sec")

Performance Tuning

Memory Optimization

# Configure memory pool
tokenizer.set_memory_pool_size(4096)  # 4KB blocks
tokenizer.enable_string_pooling(True)

# Monitor memory usage
stats = tokenizer.get_memory_stats()
print(f"Memory pool usage: {stats['pool_usage']}MB")
print(f"String pool size: {stats['string_pool_size']}")

Parallel Processing

# Configure thread pool
tokenizer.set_num_threads(8)
tokenizer.set_batch_size(64)

# Process large datasets
with open("large_file.txt", "r") as f:
    texts = f.readlines()
tokens = tokenizer.encode_batch_parallel(texts)

Documentation

Performance Optimization

DeepPowers includes several performance optimization features:

Memory pooling and caching
Dynamic batching
Parallel processing
Mixed-precision computation
Distributed inference
Model quantization (INT8, INT4, mixed precision)
Operator fusion
KV-cache optimization

Roadmap

Implemented Features ✨

✅ Hardware Abstraction Layer (HAL) - Basic CUDA and ROCM support
✅ Tokenizer Implementation - WordPiece and BPE algorithms
✅ Memory Management - Basic memory pooling system
✅ Request Queue Management - Basic request handling
✅ Configuration System - Basic config management
✅ Python Bindings - Basic API interface
✅ Monitoring System - Basic metrics collection
✅ Model Execution Framework - Core implementation
✅ Inference Pipeline - Basic pipeline structure
✅ Dynamic Batch Processing - Initial implementation
✅ Model Loading System - Support for ONNX, PyTorch, and TensorFlow formats
✅ Inference Engine - Complete implementation with text generation capabilities
✅ Inference Optimization - Operator fusion, weight pruning, and caching optimizations
✅ Quantization System - INT8, INT4, and mixed precision support
✅ Benchmarking Tools - Performance measurement and optimization metrics
✅ Streaming Generation - Real-time text generation with callback support
✅ Advanced Batching Strategies - Batch processing with parallel inference

In Progress 🚧

🔄 Computation Graph System - Advanced graph optimizations
🔄 Distributed Computing Support - Multi-node inference
🔄 Auto-tuning System - Automatic performance optimization
🔄 Dynamic Shape Support - Flexible tensor dimensions handling

Planned Features 🎯

📋 Advanced Model Support
- Advanced LLM implementations (GPT, Gemini, Claude)
- More sophisticated model architecture support
- Model compression and distillation
📋 Performance Optimization
- Advanced memory management
- Kernel fusion optimizations
- Custom CUDA kernels for critical operations
📋 Advanced Features
- Multi-GPU parallelism
- Distributed inference across nodes
- Advanced caching system with prefetching
- Speculative decoding
- Custom operator implementation

Benchmarking Tools

The project includes comprehensive benchmarking tools:

# Run performance benchmark
python examples/optimize_and_benchmark.py --model your-model --benchmark

# Run with optimization
python examples/optimize_and_benchmark.py --model your-model --optimization auto --level o2 --benchmark

# Apply quantization
python examples/optimize_and_benchmark.py --model your-model --quantize --quantize-precision int8 --benchmark

# Generate text with optimized model
python examples/optimize_and_benchmark.py --model your-model --optimization auto --generate --prompt "Your prompt here"

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

Special thanks to all contributors and the open-source community.

Contact

GitHub Issues: Create an issue

For Tasks:

Click tags to check more tools for each tasks

image classification text generation object detection anomaly detection sentiment analysis

For Jobs:

data scientist machine learning engineer research scientist ai developer computer vision engineer

Alternative AI tools for deeppowers

Similar Open Source Tools

deeppowers

github

: 183

sciml.ai

SciML.ai is an open source software organization dedicated to unifying packages for scientific machine learning. It focuses on developing modular scientific simulation support software, including differential equation solvers, inverse problems methodologies, and automated model discovery. The organization aims to provide a diverse set of tools with a common interface, creating a modular, easily-extendable, and highly performant ecosystem for scientific simulations. The website serves as a platform to showcase SciML organization's packages and share news within the ecosystem. Pull requests are encouraged for contributions.

github

: 55

AI_Spectrum

AI_Spectrum is a versatile machine learning library that provides a wide range of tools and algorithms for building and deploying AI models. It offers a user-friendly interface for data preprocessing, model training, and evaluation. With AI_Spectrum, users can easily experiment with different machine learning techniques and optimize their models for various tasks. The library is designed to be flexible and scalable, making it suitable for both beginners and experienced data scientists.

github

: 161

ai-workshop-code

The ai-workshop-code repository contains code examples and tutorials for various artificial intelligence concepts and algorithms. It serves as a practical resource for individuals looking to learn and implement AI techniques in their projects. The repository covers a wide range of topics, including machine learning, deep learning, natural language processing, computer vision, and reinforcement learning. By exploring the code and following the tutorials, users can gain hands-on experience with AI technologies and enhance their understanding of how these algorithms work in practice.

github

: 375

atomic-agents

The Atomic Agents framework is a modular and extensible tool designed for creating powerful applications. It leverages Pydantic for data validation and serialization. The framework follows the principles of Atomic Design, providing small and single-purpose components that can be combined. It integrates with Instructor for AI agent architecture and supports various APIs like Cohere, Anthropic, and Gemini. The tool includes documentation, examples, and testing features to ensure smooth development and usage.

github

: 2.7k

God-Level-AI

A drill of scientific methods, processes, algorithms, and systems to build stories & models. An in-depth learning resource for humans. This repository is designed for individuals aiming to excel in the field of Data and AI, providing video sessions and text content for learning. It caters to those in leadership positions, professionals, and students, emphasizing the need for dedicated effort to achieve excellence in the tech field. The content covers various topics with a focus on practical application.

github

: 3.5k

stable-diffusion.cpp

The stable-diffusion.cpp repository provides an implementation for inferring stable diffusion in pure C/C++. It offers features such as support for different versions of stable diffusion, lightweight and dependency-free implementation, various quantization support, memory-efficient CPU inference, GPU acceleration, and more. Users can download the built executable program or build it manually. The repository also includes instructions for downloading weights, building from scratch, using different acceleration methods, running the tool, converting weights, and utilizing various features like Flash Attention, ESRGAN upscaling, PhotoMaker support, and more. Additionally, it mentions future TODOs and provides information on memory requirements, bindings, UIs, contributors, and references.

github

: 3.8k

open-ai

Open AI is a powerful tool for artificial intelligence research and development. It provides a wide range of machine learning models and algorithms, making it easier for developers to create innovative AI applications. With Open AI, users can explore cutting-edge technologies such as natural language processing, computer vision, and reinforcement learning. The platform offers a user-friendly interface and comprehensive documentation to support users in building and deploying AI solutions. Whether you are a beginner or an experienced AI practitioner, Open AI offers the tools and resources you need to accelerate your AI projects and stay ahead in the rapidly evolving field of artificial intelligence.

github

: 2.1k

nmed2024

Nmed2024 is a GitHub repository that contains code for a neural network model designed for medical image analysis. The repository includes scripts for training the model, as well as pre-trained weights for quick deployment. The model is specifically tailored for detecting abnormalities in medical images, such as tumors or fractures. It utilizes deep learning techniques to achieve high accuracy and can be easily integrated into existing medical imaging systems. Researchers and developers in the healthcare industry can leverage this tool to enhance the efficiency and accuracy of medical image analysis tasks.

github

: 73

AimRT

AimRT is a basic runtime framework for modern robotics, developed in modern C++ with lightweight and easy deployment. It integrates research and development for robot applications in various deployment scenarios, providing debugging tools and observability support. AimRT offers a plug-in development interface compatible with ROS2, HTTP, Grpc, and other ecosystems for progressive system upgrades.

github

: 562

chatmcp

Chatmcp is a chatbot framework for building conversational AI applications. It provides a flexible and extensible platform for creating chatbots that can interact with users in a natural language. With Chatmcp, developers can easily integrate chatbot functionality into their applications, enabling users to communicate with the system through text-based conversations. The framework supports various natural language processing techniques and allows for the customization of chatbot behavior and responses. Chatmcp simplifies the development of chatbots by providing a set of pre-built components and tools that streamline the creation process. Whether you are building a customer support chatbot, a virtual assistant, or a chat-based game, Chatmcp offers the necessary features and capabilities to bring your conversational AI ideas to life.

github

: 659

aiounifi

Aiounifi is a Python library that provides a simple interface for interacting with the Unifi Controller API. It allows users to easily manage their Unifi network devices, such as access points, switches, and gateways, through automated scripts or applications. With Aiounifi, users can retrieve device information, perform configuration changes, monitor network performance, and more, all through a convenient and efficient API wrapper. This library simplifies the process of integrating Unifi network management into custom solutions, making it ideal for network administrators, developers, and enthusiasts looking to automate and streamline their network operations.

github

: 62

simple-ai

Simple AI is a lightweight Python library for implementing basic artificial intelligence algorithms. It provides easy-to-use functions and classes for tasks such as machine learning, natural language processing, and computer vision. With Simple AI, users can quickly prototype and deploy AI solutions without the complexity of larger frameworks.

github

: 276

DAILA

DAILA is a unified interface for AI systems in decompilers, supporting various decompilers and AI systems. It allows users to utilize local and remote LLMs, like ChatGPT and Claude, and local models such as VarBERT. DAILA can be used as a decompiler plugin with GUI or as a scripting library. It also provides a Docker container for offline installations and supports tasks like summarizing functions and renaming variables in decompilation.

github

: 600

Main

This repository contains material related to the new book _Synthetic Data and Generative AI_ by the author, including code for NoGAN, DeepResampling, and NoGAN_Hellinger. NoGAN is a tabular data synthesizer that outperforms GenAI methods in terms of speed and results, utilizing state-of-the-art quality metrics. DeepResampling is a fast NoGAN based on resampling and Bayesian Models with hyperparameter auto-tuning. NoGAN_Hellinger combines NoGAN and DeepResampling with the Hellinger model evaluation metric.

github

: 66

AI-and-competition

This repository provides baselines for various competitions, a few top solutions for some competitions, and independent deep learning projects. Baselines serve as entry guides for competitions, suitable for beginners to make their first submission. Top solutions are more complex and refined versions of baselines, with limited quantity but enhanced quality. The repository is maintained by a single author, yunsuxiaozi, offering code improvements and annotations for better understanding. Users can support the repository by learning from it and providing feedback.

github

: 51

For similar tasks

LLMs4TS

LLMs4TS is a repository focused on the application of cutting-edge AI technologies for time-series analysis. It covers advanced topics such as self-supervised learning, Graph Neural Networks for Time Series, Large Language Models for Time Series, Diffusion models, Mixture-of-Experts architectures, and Mamba models. The resources in this repository span various domains like healthcare, finance, and traffic, offering tutorials, courses, and workshops from prestigious conferences. Whether you're a professional, data scientist, or researcher, the tools and techniques in this repository can enhance your time-series data analysis capabilities.

github

: 110

deeppowers

github

: 183

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. It provides a common API to deliver inference solutions on various platforms, including CPU, GPU, NPU, and heterogeneous devices. OpenVINO™ supports pre-trained models from Open Model Zoo and popular frameworks like TensorFlow, PyTorch, and ONNX. Key components of OpenVINO™ include the OpenVINO™ Runtime, plugins for different hardware devices, frontends for reading models from native framework formats, and the OpenVINO Model Converter (OVC) for adjusting models for optimal execution on target devices.

github

: 8.1k

djl-demo

The Deep Java Library (DJL) is a framework-agnostic Java API for deep learning. It provides a unified interface to popular deep learning frameworks such as TensorFlow, PyTorch, and MXNet. DJL makes it easy to develop deep learning applications in Java, and it can be used for a variety of tasks, including image classification, object detection, natural language processing, and speech recognition.

github

: 307

kaapana

Kaapana is an open-source toolkit for state-of-the-art platform provisioning in the field of medical data analysis. The applications comprise AI-based workflows and federated learning scenarios with a focus on radiological and radiotherapeutic imaging. Obtaining large amounts of medical data necessary for developing and training modern machine learning methods is an extremely challenging effort that often fails in a multi-center setting, e.g. due to technical, organizational and legal hurdles. A federated approach where the data remains under the authority of the individual institutions and is only processed on-site is, in contrast, a promising approach ideally suited to overcome these difficulties. Following this federated concept, the goal of Kaapana is to provide a framework and a set of tools for sharing data processing algorithms, for standardized workflow design and execution as well as for performing distributed method development. This will facilitate data analysis in a compliant way enabling researchers and clinicians to perform large-scale multi-center studies. By adhering to established standards and by adopting widely used open technologies for private cloud development and containerized data processing, Kaapana integrates seamlessly with the existing clinical IT infrastructure, such as the Picture Archiving and Communication System (PACS), and ensures modularity and easy extensibility.

github

: 176

MONAI

MONAI is a PyTorch-based, open-source framework for deep learning in healthcare imaging. It provides a comprehensive set of tools for medical image analysis, including data preprocessing, model training, and evaluation. MONAI is designed to be flexible and easy to use, making it a valuable resource for researchers and developers in the field of medical imaging.

github

: 6.2k

nnstreamer

NNStreamer is a set of Gstreamer plugins that allow Gstreamer developers to adopt neural network models easily and efficiently and neural network developers to manage neural network pipelines and their filters easily and efficiently.

github

: 724

For similar jobs

promptflow

**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

github

: 9.2k

deepeval

DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

github

: 5.8k

MegaDetector

MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".

github

: 106

leapfrogai

LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

github

: 255

llava-docker

This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

github

: 59

carrot

The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

github

: 17.1k

TrustLLM

TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.

github

: 535

AI-YinMei

AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.

github

: 529