rkllama

Ollama alternative for Rockchip NPU: an efficient solution for running AI and deep learning models on Rockchip devices with optimized NPU support (rkllm)

RKLLama is a server and client for running and interacting with LLMs optimized for the Rockchip RK3588(S) and RK3576 platforms. Unlike comparable tools, it runs models on the NPU. Features include partial Ollama API compatibility, pulling models from Hugging Face, a documented REST API, dynamic loading and unloading of models, streaming and non-streaming inference requests, simplified model naming, automatic CPU model detection, and an optional debug mode. It supports Python 3.9 to 3.12 and has been tested on Orange Pi 5 Pro and Orange Pi 5 Plus boards with specific OS versions.

README:

RKLLama: LLM Server and Client for Rockchip 3588/3576

Version: 0.0.44

Branches

Without Miniconda: This version runs without Miniconda.
Rkllama Docker: A fully isolated version running in a Docker container.
Support All Models: This branch ensures all models are tested before being merged into the main branch.
Docker Package

Overview

A server to run and interact with LLM models optimized for Rockchip RK3588(S) and RK3576 platforms. The difference from other software of this type like Ollama or Llama.cpp is that RKLLama allows models to run on the NPU.

Version Lib rkllm-runtime: V 1.2.1.
Version Lib rknn-runtime: V 2.3.2.

File Structure

./models: contains your rkllm models (with their rknn models if multimodal).
./lib: C++ rkllm and rknn libraries used for inference and fix_freqence_platform.
./app.py: API Rest server.
./client.py: Client to interact with the server.

Supported Python Versions:

Python 3.9 to 3.12

Tested Hardware and Environment

Hardware: Orange Pi 5 Pro (Rockchip RK3588S, NPU 6 TOPS), 16GB RAM.
Hardware: Orange Pi 5 Plus (Rockchip RK3588, NPU 6 TOPS), 16GB RAM.
Hardware: Orange Pi 5 Max (Rockchip RK3588, NPU 6 TOPS), 16GB RAM.
OS: Ubuntu 24.04 arm64.
OS: Armbian Linux 6.1.99-vendor-rk35xx (Debian stable bookworm), v25.2.2.

Main Features

Running models on NPU.
Ollama API compatibility - Support for:
- /api/chat
- /api/generate
- /api/ps
- /api/tags
- /api/embed (and legacy /api/embeddings)
- /api/version
- /api/pull
Partial OpenAI API compatibility - Support for:
- /v1/completions
- /v1/chat/completions
- /v1/embeddings
Tool/Function Calling - Complete support for tool calls with multiple LLM formats (Qwen, Llama 3.2+, others).
Pull models directly from Huggingface.
Includes a REST API with documentation.
Listing available models.
Multiple RKLLM models running in memory simultaneously (distinct models execute in parallel in streaming mode; non-streaming requests are served FIFO).
Dynamic loading and unloading of models:
- Load the model on a new request (if it is not already in memory)
- Unload a model when it expires after inactivity (default 30 min)
- Unload the oldest model in memory if a new model must be loaded and the server has no memory available
Inference requests with streaming and non-streaming modes.
Message history.
Simplified custom model naming - Use models with familiar names like "qwen2.5:3b".
CPU Model Auto-detection - Automatic detection of RK3588 or RK3576 platform.
Optional Debug Mode - Detailed debugging with --debug flag.
Multimodal Support - Use Qwen2 or Qwen2.5 vision models to ask questions about images (base64, local file, or image URL).
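
The dynamic loading and unloading rules in the list above (load on first request, expire after inactivity, evict the oldest model when memory runs out) can be sketched as a small registry. This is an illustrative sketch only, not RKLLama's implementation: `ModelCache`, `max_models`, and the injectable `clock` are hypothetical names, a fixed model count stands in for a real memory check, and "oldest" is assumed to mean least recently used.

```python
import time
from typing import Callable, Dict, List


class ModelCache:
    """Toy registry: load on demand, expire when idle, evict when full."""

    def __init__(self, max_models: int = 2, ttl_seconds: float = 30 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_models = max_models    # hypothetical stand-in for a memory check
        self.ttl_seconds = ttl_seconds  # 30 minutes, the default quoted above
        self.clock = clock              # injectable so the logic can be tested
        self.last_used: Dict[str, float] = {}  # model name -> last request time

    def expire_idle(self) -> None:
        """Unload every model that has been idle for longer than the TTL."""
        now = self.clock()
        idle = [name for name, used in self.last_used.items()
                if now - used > self.ttl_seconds]
        for name in idle:
            del self.last_used[name]

    def request(self, name: str) -> None:
        """Record a request, loading the model first if it is not in memory."""
        self.expire_idle()
        if name not in self.last_used and len(self.last_used) >= self.max_models:
            # Assumption: "oldest" is interpreted as least recently used.
            oldest = min(self.last_used, key=self.last_used.get)
            del self.last_used[oldest]
        self.last_used[name] = self.clock()

    def loaded(self) -> List[str]:
        """Names of the models currently held in memory, sorted."""
        return sorted(self.last_used)
```

Passing a fake clock (for example `clock=lambda: now[0]`) makes the expiry and eviction behaviour easy to exercise without waiting for real time to pass.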

Documentation

French version: click

Client : Installation guide.
API REST : English documentation
API REST : French documentation
Ollama API: Compatibility guide
Model Naming: Naming convention
Tool Calling: Tool/Function calling guide

Installation

Standard Installation

Clone the repository:

git clone https://github.com/notpunchnox/rkllama
cd rkllama

Install RKLLama:

chmod +x setup.sh
./setup.sh

Docker Installation

Pull the RKLLama Docker image:

docker pull ghcr.io/notpunchnox/rkllama:main

Run the server:

docker run -it --privileged -p 8080:8080 ghcr.io/notpunchnox/rkllama:main

Set up by: ichlaffterlalu

Docker Compose

Docker Compose simplifies the declaration of extra flags such as volumes:

docker compose up --detach --remove-orphans

Usage

Run Server

The conda environment is activated automatically, and the NPU frequency is set as well.

Start the server

rkllama serve

To enable debug mode:

rkllama serve --debug

Run Client

Command to start the client

rkllama

rkllama help

See the available models

rkllama list

Run a model

rkllama run <model_name>

Then start chatting ( verbose mode: display formatted history and statistics )

Tool Calling Quick Start

RKLLama supports advanced tool/function calling for enhanced AI interactions:

# Example: Weather tool call
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:3b",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"}
          },
          "required": ["location"]
        }
      }
    }]
  }'
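
The same request can be assembled from Python. The sketch below builds the JSON body used in the curl example and extracts tool calls from a reply. It assumes the server answers in Ollama's chat format, where tool calls appear under `message.tool_calls`; `build_chat_request` and `extract_tool_calls` are hypothetical helper names, not part of RKLLama.

```python
import json
from typing import Any, Dict, List


def build_chat_request(model: str, question: str) -> Dict[str, Any]:
    """Return the same JSON body as the curl example, as a Python dict."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"}
                },
                "required": ["location"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
    }


def extract_tool_calls(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect name/arguments pairs from an Ollama-style chat response."""
    calls = response.get("message", {}).get("tool_calls") or []
    result = []
    for call in calls:
        function = call.get("function", {})
        arguments = function.get("arguments", {})
        if isinstance(arguments, str):
            # Assumption: arguments may arrive as a JSON-encoded string.
            arguments = json.loads(arguments)
        result.append({"name": function.get("name"), "arguments": arguments})
    return result
```

The dict returned by `build_chat_request` can be serialized with `json.dumps` and posted to `http://localhost:8080/api/chat` with any HTTP client.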

Features:

🔧 Multiple model support (Qwen, Llama 3.2+, others)
🌊 Streaming & non-streaming modes
🎯 Robust JSON parsing with fallback methods
🔄 Auto format normalization
📋 Multiple tools in single request

For complete documentation: Tool Calling Guide

Adding a Model (`file.rkllm`)

Using the `rkllama pull` Command

You can download and install a model from the Hugging Face platform with the following command:

rkllama pull username/repo_id/model_file.rkllm/custom_model_name

Alternatively, you can run the command interactively:

rkllama pull
Repo ID ( example: punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4): <your response>
File ( example: TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm): <your response>
Custom Model Name ( example: tinyllama-chat:1.1b ): <your response>

This will automatically download the specified model file and prepare it for use with RKLLAMA.
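
To make the one-line form `username/repo_id/model_file.rkllm/custom_model_name` concrete, here is a minimal sketch that splits such a string into the three values the interactive prompt asks for. How `rkllama pull` parses its argument internally is not shown in this README, so `PullSpec` and `parse_pull_spec` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class PullSpec(NamedTuple):
    repo_id: str     # "username/repo_id" on Hugging Face
    filename: str    # the .rkllm file inside that repository
    model_name: str  # custom name used afterwards, e.g. "tinyllama-chat:1.1b"


def parse_pull_spec(spec: str) -> PullSpec:
    """Split 'username/repo_id/model_file.rkllm/custom_model_name'."""
    parts = spec.strip().split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError("expected username/repo_id/file.rkllm/custom_name")
    user, repo, filename, model_name = parts
    if not filename.endswith(".rkllm"):
        raise ValueError("third component must be a .rkllm file")
    return PullSpec(f"{user}/{repo}", filename, model_name)
```

With the example values from the interactive prompt, the first two components become the Repo ID, the third the File, and the fourth the Custom Model Name.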

Example with Qwen2.5 3b from c01zaut: https://huggingface.co/c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4

Manual Installation

Download the Model
- Download .rkllm models directly from Hugging Face.
- Alternatively, convert your GGUF models into .rkllm format (conversion tool coming soon on my GitHub).
Place the Model
- Navigate to the ~/RKLLAMA/models directory on your system.
- Make a directory with model name.
- Place the .rkllm files in this directory.
- Create a Modelfile with the following content:
```
 FROM="file.rkllm"

 HUGGINGFACE_PATH="huggingface_repository"

 SYSTEM="Your system prompt"

 TEMPERATURE=1.0

 TOKENIZER="path-to-tokenizer"
```
Example directory structure:
```
~/RKLLAMA/models/
    └── TinyLlama-1.1B-Chat-v1.0
        ├── Modelfile
        └── TinyLlama-1.1B-Chat-v1.0.rkllm
```
You must provide a link to a Hugging Face repository to retrieve the tokenizer and chat template. An internet connection is required to initialize the tokenizer (only once). You can use a repository different from the model's, as long as the tokenizer is compatible and the chat template meets your needs.
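
The Modelfile shown above is a list of `KEY=value` lines with optional double quotes. A minimal sketch of reading such a file into a dictionary follows. It is an assumption about the format based only on the example above, and `parse_modelfile` is a hypothetical helper rather than RKLLama's own loader.

```python
from typing import Dict


def parse_modelfile(text: str) -> Dict[str, str]:
    """Read KEY=value lines, ignoring blank lines and stripping double quotes."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not KEY=value
        key, value = line.split("=", 1)
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]  # drop the surrounding quotes
        settings[key.strip()] = value
    return settings
```

Values are returned as strings, so a numeric setting such as `TEMPERATURE` would still need `float(...)` before use.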

For Multimodal encoder model (.rknn) Installation

Download the Encoder Model .rknn
- Download .rknn models directly from Hugging Face.
- Alternatively, convert your ONNX models into .rknn format.
- Place the .rknn model in the same folder as the .rkllm model. RKLLama detects an encoder model present in the directory.

Example directory structure for multimodal:

~/RKLLAMA/models/
    └── qwen2-vision\:2b
        ├── Modelfile
        ├── Qwen2-VL-2B-Instruct.rkllm
        └── Qwen2-VL-2B-Instruct.rknn
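
Detecting an encoder by its presence in the model directory could look like the sketch below. It only illustrates the idea of pairing a `.rkllm` file with an optional `.rknn` file in the same folder; `classify_model_files` and `scan_model_dir` are hypothetical names, and RKLLama's actual detection logic may differ.

```python
import os
from typing import Iterable, Optional, Tuple


def classify_model_files(filenames: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Return (rkllm_file, rknn_file_or_None) from a directory listing."""
    names = sorted(filenames)
    rkllm = [name for name in names if name.endswith(".rkllm")]
    rknn = [name for name in names if name.endswith(".rknn")]
    if not rkllm:
        raise FileNotFoundError("no .rkllm file in model directory")
    # A model counts as multimodal here only when an encoder sits next to it.
    return rkllm[0], (rknn[0] if rknn else None)


def scan_model_dir(model_dir: str) -> Tuple[str, Optional[str]]:
    """Apply classify_model_files to a directory such as ~/RKLLAMA/models/<name>."""
    return classify_model_files(os.listdir(os.path.expanduser(model_dir)))
```

A directory holding only a Modelfile and a `.rkllm` file yields `None` for the encoder, which corresponds to a text-only model.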

Configuration

RKLLAMA uses a flexible configuration system that loads settings from multiple sources in priority order.

See the Configuration Documentation for complete details.

Uninstall

Go to the ~/RKLLAMA/ folder

cd ~/RKLLAMA/
cp ./uninstall.sh ../
cd ../ && chmod +x ./uninstall.sh && ./uninstall.sh

If you don't have the uninstall.sh file:

wget https://raw.githubusercontent.com/NotPunchnox/rkllama/refs/heads/main/uninstall.sh
chmod +x ./uninstall.sh
./uninstall.sh

New-Version

Ollama API Compatibility: RKLLAMA now implements key Ollama API endpoints, with primary focus on /api/chat and /api/generate, allowing integration with many Ollama clients. Additional endpoints are in various stages of implementation.
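
For clients that consume `/api/chat` in streaming mode, the sketch below shows one way to reassemble the reply. It assumes the server follows Ollama's streaming convention of newline-delimited JSON objects, each carrying a `message.content` fragment and a final object with `"done": true`; `collect_stream` is a hypothetical helper name.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[bytes]) -> str:
    """Join the text fragments of an Ollama-style NDJSON chat stream."""
    pieces = []
    for raw_line in lines:
        if not raw_line.strip():
            continue  # tolerate blank keep-alive lines
        chunk = json.loads(raw_line)
        pieces.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final object marks the end of the reply
    return "".join(pieces)
```

The function accepts any iterable of byte lines, so it can be fed the response object returned by `urllib.request.urlopen`, which yields lines when iterated.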

Enhanced Model Naming: Simplified model naming convention allows using models with familiar names like "qwen2.5:3b" or "llama3-instruct:8b" while handling the full file paths internally.

Improved Performance and Reliability: Enhanced streaming responses with better handling of completion signals and optimized token processing.

CPU Auto-detection: Automatic detection of RK3588 or RK3576 platform with fallback to interactive selection.
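
One common way to identify the SoC on Linux is to read the device tree `compatible` property. The sketch below illustrates that approach with a fallback signal for interactive selection. The README does not say how RKLLama performs its detection, so the use of `/proc/device-tree/compatible` and the names `detect_platform` and `read_compatible` are assumptions.

```python
from typing import Optional

SUPPORTED = ("rk3588", "rk3576")  # platforms named in this README


def detect_platform(compatible: bytes) -> Optional[str]:
    """Return 'rk3588' or 'rk3576' if found in a device tree compatible string."""
    text = compatible.decode("ascii", errors="ignore").lower()
    for soc in SUPPORTED:
        if soc in text:  # also matches variants such as rk3588s
            return soc
    return None  # caller should fall back to asking the user


def read_compatible(path: str = "/proc/device-tree/compatible") -> bytes:
    """Read the NUL-separated compatible property, or b'' if unavailable."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError:
        return b""
```

A caller would run `detect_platform(read_compatible())` and prompt the user to choose a platform when the result is `None`.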

Debug Mode: Optional debugging tools with detailed logs that can be enabled with the --debug flag.

Simplified Model Management:

Delete models with one command using the simplified name
Pull models directly from Hugging Face with automatic Modelfile creation
Custom model configurations through Modelfiles
Smart collision handling for models with similar names

If you have already downloaded models and do not wish to reinstall everything, please follow this guide: Rebuild Architecture

Upcoming Features

Add RKNN for onnx models (TTS, image classification/segmentation...)
GGUF/HF to RKLLM conversion software

Author

NotPunchnox

Contributors

ichlaffterlalu: Contributed with a pull request for Docker-Rkllama and fixed multiple errors.
TomJacobsUK: Contributed with pull requests for Ollama API compatibility and model naming improvements, and fixed CPU detection errors.
Yoann Vanitou: Contributed with Docker implementation improvements and fixed merge conflicts.
Daniel Ferreira: Contributed tool support, OpenAI API compatibility, loading of multiple RKLLM models in memory, and the multimodal support implementation, along with other improvements and fixes.
