
airi
💖🧸 アイリ, ultimate Neuro-sama like LLM powered Live2D/VRM living character life pod, near you.
Stars: 329

Airi is a VTuber project heavily inspired by Neuro-sama. It is capable of various functions such as playing Minecraft, chatting in Telegram and Discord, audio input from browser and Discord, client side speech recognition, VRM and Live2D model support with animations, and more. The project also includes sub-projects like unspeech, hfup, Drizzle ORM driver for DuckDB WASM, and various other tools. Airi uses models like whisper-large-v3-turbo from Hugging Face and is similar to projects like z-waif, amica, eliza, AI-Waifu-Vtuber, and AIVTuber. The project acknowledges contributions from various sources and implements packages to interact with LLMs and models.
README:
Heavily inspired by Neuro-sama
Unlike other AI-driven VTuber open source projects, アイリ VTuber was built from day one with first-class support for Web technologies such as WebGPU, WebAudio, Web Workers, WebAssembly, and WebSocket.
This means that アイリ VTuber can run in modern browsers and on modern devices, including mobile devices (already done with PWA support). This gives us (the developers) a lot of room to build and extend アイリ VTuber to the next level, while still leaving users the flexibility to enable features that require TCP connections or other non-Web technologies, such as connecting to a Discord voice channel, or playing Minecraft or Factorio with you and your friends.
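As a rough illustration of this browser-first approach (a hypothetical sketch, not code from the airi repository), a client can feature-detect what the current environment supports and only then enable the heavier, optional features:

```ts
// Hypothetical sketch (not actual airi code): feature-detect what the current
// browser can do before enabling heavier, optional features.
interface StageCapabilities {
  webgpu: boolean // local in-browser (WebGPU) inference
  audioInput: boolean // microphone capture for speech recognition
  workers: boolean // off-main-thread model execution
}

export async function detectCapabilities(): Promise<StageCapabilities> {
  // navigator.gpu is only present in WebGPU-capable browsers.
  const gpu = (navigator as Navigator & { gpu?: { requestAdapter: () => Promise<unknown | null> } }).gpu
  const webgpu = gpu !== undefined && (await gpu.requestAdapter()) !== null
  const audioInput = typeof navigator.mediaDevices?.getUserMedia === 'function'
  const workers = typeof Worker !== 'undefined'
  return { webgpu, audioInput, workers }
}
```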
[!NOTE]
We are still in the early stage of development, and we are seeking talented developers to join us and help make アイリ VTuber a reality.
It's OK if you are not familiar with Vue.js, TypeScript, or the devtools required for this project; you can join us as an artist or designer, or even help us launch our first live stream.
Even if you are a big fan of React, Svelte, or Solid, we welcome you: you can open a sub-directory to add features that you want to see in アイリ VTuber, or that you would like to experiment with.
Fields (and related projects) that we are looking for:
- Live2D modeller
- VRM modeller
- VRChat avatar designer
- Computer Vision
- Reinforcement Learning
- Speech Recognition
- Speech Synthesis
- ONNX Runtime
- Transformers.js
- vLLM
- WebGPU
- Three.js
- WebXR (check out another project we have under the @moeru-ai organization)
If you are interested, why not introduce yourself here? Would you like to join us in building Airi?
Capable of
- [x] Brain
  - [x] Play Minecraft
  - [x] Play Factorio (WIP, but PoC and demo available)
  - [x] Chat in Telegram
  - [x] Chat in Discord
  - [ ] Memory
    - [x] Pure in-browser database support (DuckDB WASM | pglite)
    - [ ] Memory Alaya (WIP)
  - [ ] Pure in-browser local (WebGPU) inference
- [x] Ears
  - [x] Audio input from browser
  - [x] Audio input from Discord
  - [x] Client side speech recognition (see the sketch after this list)
  - [x] Client side talking detection
- [x] Mouth
  - [x] ElevenLabs voice synthesis
- [x] Body
  - [x] VRM support
    - [x] Control VRM model
    - [x] VRM model animations
      - [x] Auto blink
      - [x] Auto look at
      - [x] Idle eye movement
  - [x] Live2D support
    - [x] Control Live2D model
    - [x] Live2D model animations
      - [x] Auto blink
      - [x] Auto look at
      - [x] Idle eye movement
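The client-side speech recognition listed above is the kind of thing Transformers.js makes possible entirely in the browser. Below is a minimal, hedged sketch using the whisper-large-v3-turbo weights mentioned earlier; the package name, model id, and options are assumptions, and this is not the exact pipeline used inside airi.

```ts
// Minimal sketch (not the exact airi pipeline): client-side speech recognition
// with Transformers.js. Model id and options are assumptions for illustration.
import { pipeline } from '@huggingface/transformers'

export async function transcribe(audio: Float32Array): Promise<string> {
  // Model files are downloaded on first use and cached by the library afterwards.
  const transcriber = await pipeline(
    'automatic-speech-recognition',
    'onnx-community/whisper-large-v3-turbo',
    { device: 'webgpu' },
  )
  // `audio` is expected to be 16 kHz mono PCM samples, e.g. decoded via WebAudio.
  const result = await transcriber(audio)
  return Array.isArray(result) ? result[0].text : result.text
}
```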
To get started with development:
pnpm i
pnpm dev
The following LLM API providers are supported (powered by xsai; a usage sketch follows this list)
- [x] OpenRouter
- [x] vLLM
- [x] SGLang
- [x] Ollama
- [x] Google Gemini
- [x] OpenAI
- [ ] Azure OpenAI API (PR welcome)
- [x] Anthropic Claude
- [ ] AWS Claude (PR welcome)
- [x] DeepSeek
- [x] Qwen
- [x] xAI
- [x] Groq
- [x] Mistral
- [x] Cloudflare Workers AI
- [x] Together.ai
- [x] Fireworks.ai
- [x] Novita
- [x] Zhipu
- [x] SiliconFlow
- [x] Stepfun
- [x] Baichuan
- [x] Minimax
- [x] Moonshot AI
- [x] Tencent Cloud
- [ ] Sparks (PR welcome)
- [ ] Volcano Engine (PR welcome)
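Since all of these providers are reached through xsai, calling one of them looks roughly like the sketch below. The import path, option names, and model id are written from memory and should be checked against the xsai documentation.

```ts
// Rough sketch of calling one of the providers above through xsai.
// Import path and option names are assumptions; check the xsai docs.
import { generateText } from 'xsai'

async function main() {
  const { text } = await generateText({
    baseURL: 'https://openrouter.ai/api/v1/', // or any OpenAI-compatible provider above
    apiKey: 'sk-or-...', // your provider API key
    model: 'openai/gpt-4o-mini',
    messages: [
      { role: 'system', content: 'You are アイリ, a cheerful VTuber.' },
      { role: 'user', content: 'Say hello to chat!' },
    ],
  })
  console.log(text)
}

main()
```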
Sub-projects born from this project:
- unspeech: Universal endpoint proxy server for /audio/transcriptions and /audio/speech, like LiteLLM but for any ASR and TTS (see the sketch after this list)
- hfup: Tools to help with deploying and bundling to HuggingFace Spaces
- @proj-airi/drizzle-duckdb-wasm: Drizzle ORM driver for DuckDB WASM
- @proj-airi/duckdb-wasm: Easy-to-use wrapper for @duckdb/duckdb-wasm
- @proj-airi/lobe-icons: Iconify JSON bundle for amazing AI & LLM icons from lobe-icons, with Tailwind and UnoCSS support
- Airi Factorio: Allow Airi to play Factorio
- Factorio RCON API: RESTful API wrapper for the Factorio headless server console
- autorio: Factorio automation library
- tstl-plugin-reload-factorio-mod: Reload the Factorio mod while developing
- 🥺 SAD: Documentation and notes for self-hosting and browser-running LLMs
- @velin-dev/ml: Use Vue SFC and Markdown to write easy-to-manage stateful prompts for LLMs
- demodel: Easily boost the speed of pulling your models and datasets from various inference runtimes
- inventory: Centralized model catalog and default provider configuration backend service
- MCP Launcher: Easy-to-use MCP builder & launcher for all possible MCP servers, just like Ollama for models
- @proj-airi/elevenlabs: TypeScript definitions for the ElevenLabs API
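As referenced in the unspeech item above, unspeech mirrors OpenAI-style /audio/transcriptions and /audio/speech endpoints, so a client can reach it with a plain multipart request. The base URL and model id in this sketch are placeholder assumptions, not unspeech defaults.

```ts
// Hedged sketch: unspeech exposes OpenAI-style audio endpoints, so a plain
// multipart request is enough. Base URL and model id below are placeholders.
export async function transcribeViaUnspeech(audioBlob: Blob, apiKey: string): Promise<string> {
  const baseURL = 'http://localhost:3000' // wherever your unspeech instance runs (assumption)

  const form = new FormData()
  form.append('file', audioBlob, 'clip.wav')
  form.append('model', 'openai/whisper-1') // provider-prefixed model id (assumed format)

  const res = await fetch(`${baseURL}/audio/transcriptions`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  })

  const { text } = await res.json()
  return text
}
```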
Project architecture overview:
%%{ init: { 'flowchart': { 'curve': 'catmullRom' } } }%%
flowchart TD
Core("Core")
Unspeech["unspeech"]
DBDriver["@proj-airi/drizzle-duckdb-wasm"]
MemoryDriver["[WIP] Memory Alaya"]
DB1["@proj-airi/duckdb-wasm"]
ICONS["@proj-airi/lobe-icons"]
UI("@proj-airi/stage-ui")
Stage("Stage")
F_AGENT("Factorio Agent")
F_API["Factorio RCON API"]
F_MOD1["autorio"]
SVRT["@proj-airi/server-runtime"]
MC_AGENT("Minecraft Agent")
XSAI["xsai"]
subgraph Airi
DB1 --> DBDriver --> MemoryDriver --> Memory --> Core
ICONS --> UI --> Stage --> Core
Core --> STT
Core --> SVRT
end
STT --> |Speaking|Unspeech
SVRT --> |Playing Factorio|F_AGENT
SVRT --> |Playing Minecraft|MC_AGENT
subgraph Factorio Agent
F_AGENT --> F_API -..- factorio-server
subgraph factorio-server-wrapper
subgraph factorio-server
F_MOD1
end
end
end
subgraph Minecraft Agent
MC_AGENT --> Mineflayer -..- minecraft-server
end
XSAI --> Core
XSAI --> F_AGENT
XSAI --> MC_AGENT
Deployment and bundling flow:
%%{ init: { 'flowchart': { 'curve': 'catmullRom' } } }%%
flowchart TD
subgraph deploy&bundle
direction LR
HFUP["hfup"]
HF[/"HuggingFace Spaces"\]
HFUP -...- UI -...-> HF
HFUP -...- whisper-webgpu -...-> HF
HFUP -...- moonshine-web -...-> HF
end
Similar projects:
- kimjammer/Neuro: A recreation of Neuro-Sama originally created in 7 days; a very well completed implementation
- SugarcaneDefender/z-waif: Great at gaming, autonomous, and prompt engineering
- semperai/amica: Great at VRM, WebXR
- elizaOS/eliza: Great examples and software engineering on how to integrate an agent into various systems and APIs
- ardha27/AI-Waifu-Vtuber: Great at Twitch API integrations
- InsanityLabs/AIVTuber: Nice UI and UX
- IRedDragonICY/vixevia
- t41372/Open-LLM-VTuber
- PeterH0323/Streamer-Sales
- https://clips.twitch.tv/WanderingCaringDeerDxCat-Qt55xtiGDSoNmDDr
- https://www.youtube.com/watch?v=8Giv5mupJNE
- https://clips.twitch.tv/TriangularAthleticBunnySoonerLater-SXpBk1dFso21VcWD
Acknowledgements:
- pixiv/ChatVRM
- josephrocca/ChatVRM-js: A JS conversion/adaptation of parts of the ChatVRM (TypeScript) code for standalone use in OpenCharacters and elsewhere
- The UI design and style were inspired by Cookard, UNBEATABLE, and Sensei! I like you so much!, as well as the artworks Ayame and Wish by Mercedes Bazan
- mallorbc/whisper_mic
- xsai: Implemented a decent amount of packages to interact with LLMs and models, similar to the Vercel AI SDK
Similar Open Source Tools


pro-chat
ProChat is a component library focused on quickly building large language model chat interfaces. It empowers developers to create rich, dynamic, and intuitive chat interfaces with features like automatic chat caching, streamlined conversations, message editing tools, auto-rendered Markdown, and programmatic controls. The tool also includes design evolution plans such as customized dialogue rendering, enhanced request parameters, personalized error handling, expanded documentation, and atomic component design.

kan-gpt
The KAN-GPT repository is a PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling. It provides a model for generating text based on prompts, with a focus on improving performance compared to traditional MLP-GPT models. The repository includes scripts for training the model, downloading datasets, and evaluating model performance. Development tasks include integrating with other libraries, testing, and documentation.

llama-api-server
This project aims to create a RESTful API server compatible with the OpenAI API using open-source backends like llama/llama2, so that various GPT tools and frameworks can work with your own model. Key features: compatibility with the OpenAI API (the server follows the OpenAI API structure, allowing seamless integration with existing tools and frameworks), support for multiple backends (both llama.cpp and pyllama, providing flexibility in model selection), customization options (model parameters such as temperature, top_p, and top_k to fine-tune the model's behavior), batch processing for embeddings (efficient handling of multiple inputs), and token authentication to secure access to the API. This tool is particularly useful for developers and researchers who want to integrate large language models into their applications or explore custom models without relying on proprietary APIs.

FluidFrames.RIFE
FluidFrames.RIFE is a Windows app powered by RIFE AI to create frame-generated and slow-motion videos. It is written in Python and utilizes external packages such as torch, onnxruntime-directml, customtkinter, OpenCV, moviepy, and Nuitka. The app features an elegant GUI, video frame generation at different speeds, video slow motion, video resizing, multiple GPU support, and compatibility with various video formats. Future versions aim to support different GPU types, enhance the GUI, include audio processing, optimize video processing speed, and introduce new features like saving AI-generated frames and supporting different RIFE AI models.

MiKaPo
MiKaPo is a web-based tool that allows users to pose MMD models in real-time using video input. It utilizes technologies such as Mediapipe for 3D key points detection, Babylon.js for 3D scene rendering, babylon-mmd for MMD model viewing, and Vite+React for the web framework. Users can upload videos and images, select different environments, and choose models for posing. MiKaPo also supports camera input and Ollama (electron version). The tool is open to feature requests and pull requests, with ongoing development to add VMD export functionality.

chatgpt-tarot-divination
ChatGPT Tarot Divination is a tool that offers AI fortune-telling and divination functionalities. Users can download the executable installation package, deploy it using Docker, and run it locally. The tool supports various divination methods such as Tarot cards, birth charts, name analysis, dream interpretation, naming suggestions, and more. It allows customization through setting API base URL and key, and provides a user-friendly interface for easy usage.

painting-droid
Painting Droid is an AI-powered, cross-platform painting app inspired by MS Paint, open and expandable with plugins. It utilizes various AI models, from paid providers to self-hosted open-source models, as well as some lightweight ones built into the app. Features include regular painting app features, AI-generated content filling and augmentation, filters and effects, image manipulation, plugin support, and cross-platform compatibility.

llama-assistant
Llama Assistant is a local AI assistant that respects your privacy. It is an AI-powered assistant that can recognize your voice, process natural language, and perform various actions based on your commands. It can help with tasks like summarizing text, rephrasing sentences, answering questions, writing emails, and more. The assistant runs offline on your local machine, ensuring privacy by not sending data to external servers. It supports voice recognition, natural language processing, and customizable UI with adjustable transparency. The project is a work in progress with new features being added regularly.

llama-assistant
Llama Assistant is an AI-powered assistant that helps with daily tasks, such as voice recognition, natural language processing, summarizing text, rephrasing sentences, answering questions, and more. It runs offline on your local machine, ensuring privacy by not sending data to external servers. The project is a work in progress with regular feature additions.

Qmedia
QMedia is an open-source multimedia AI content search engine designed specifically for content creators. It provides rich information extraction methods for text, image, and short video content. The tool integrates unstructured text, image, and short video information to build a multimodal RAG content Q&A system. Users can efficiently search for image/text and short video materials, analyze content, provide content sources, and generate customized search results based on user interests and needs. QMedia supports local deployment for offline content search and Q&A for private data. The tool offers features like content cards display, multimodal content RAG search, and pure local multimodal models deployment. Users can deploy different types of models locally, manage language models, feature embedding models, image models, and video models. QMedia aims to spark new ideas for content creation and share AI content creation concepts in an open-source manner.

RealScaler
RealScaler is a Windows app powered by RealESRGAN AI to enhance, upscale, and de-noise photos and videos. It provides an easy-to-use GUI for upscaling images and videos using multiple AI models. The tool supports automatic image tiling and merging to avoid GPU VRAM limitations, resizing images/videos before upscaling, interpolation between original and upscaled content, and compatibility with various image and video formats. RealScaler is written in Python and requires Windows 11/10, at least 8GB RAM, and a Directx12 compatible GPU with 4GB VRAM. Future versions aim to enhance performance, support more GPUs, offer a new GUI with Windows 11 style, include audio for upscaled videos, and provide features like metadata extraction and application from original to upscaled files.

DB-GPT
DB-GPT is a personal database administrator that can solve database problems by reading documents, using various tools, and writing analysis reports. It is currently undergoing an upgrade. The online demo lets you import documents into the knowledge base, use that knowledge for well-founded Q&A and diagnosis of abnormal alarms, send feedback to refine intermediate diagnosis results, edit diagnosis results, and browse all historical diagnoses with their metrics and detailed diagnosis processes. It supports English (default) and Chinese (add "language: zh" in config.yaml) and ships a new frontend combining knowledge base, chat Q&A, diagnosis, and report replay. An extreme-speed version for localized LLMs uses a 4-bit quantized LLM (reducing inference time by about a third), vLLM for fast inference (Qwen), and a tiny LLM. Document knowledge is extracted through multiple paths, including a vector database (ChromaDB) and a RESTful search engine (Elasticsearch), with expert prompt generation based on that knowledge. The LLM-based diagnosis mechanism has been upgraded to task dispatching, concurrent diagnosis, cross review, and report generation, with a synchronous concurrency mechanism during LLM inference. Monitoring and optimization tools are supported at multiple levels: monitoring metrics (Prometheus), code-level flame graphs, diagnosis knowledge retrieval (dbmind), logical query transformations (Calcite), index optimization algorithms and physical operator hints (for PostgreSQL), and backup with point-in-time recovery (Pigsty). Papers and experimental reports are continuously updated, and the project keeps evolving with new features.

ChatGPT-Next-Web
ChatGPT Next Web is a well-designed cross-platform ChatGPT web UI tool that supports Claude, GPT4, and Gemini Pro models. It allows users to deploy their private ChatGPT applications with ease. The tool offers features like one-click deployment, compact client for Linux/Windows/MacOS, compatibility with self-deployed LLMs, privacy-first approach with local data storage, markdown support, responsive design, fast loading speed, prompt templates, awesome prompts, chat history compression, multilingual support, and more.

evalscope
Eval-Scope is a framework designed to support the evaluation of large language models (LLMs) by providing pre-configured benchmark datasets, common evaluation metrics, model integration, automatic evaluation for objective questions, complex task evaluation using expert models, reports generation, visualization tools, and model inference performance evaluation. It is lightweight, easy to customize, supports new dataset integration, model hosting on ModelScope, deployment of locally hosted models, and rich evaluation metrics. Eval-Scope also supports various evaluation modes like single mode, pairwise-baseline mode, and pairwise (all) mode, making it suitable for assessing and improving LLMs.
For similar tasks


gpt-subtrans
GPT-Subtrans is an open-source subtitle translator that utilizes large language models (LLMs) as translation services. It supports translation between any language pairs that the language model supports. Note that GPT-Subtrans requires an active internet connection, as subtitles are sent to the provider's servers for translation, and their privacy policy applies.

basehub
JavaScript / TypeScript SDK for BaseHub, the first AI-native content hub. Features: it infers types from your BaseHub repository (so IDE autocompletion works great), has no dependency on graphql (so your bundle stays lightweight), and works everywhere `fetch` is supported (so you can use it anywhere).

novel
Novel is an open-source Notion-style WYSIWYG editor with AI-powered autocompletions. It allows users to easily create and edit content with the help of AI suggestions. The tool is built on a modern tech stack and supports cross-framework development. Users can deploy their own version of Novel to Vercel with one click and contribute to the project by reporting bugs or making feature enhancements through pull requests.

local-rag
Local RAG is an offline, open-source tool that allows users to ingest files for retrieval augmented generation (RAG) using large language models (LLMs) without relying on third parties or exposing sensitive data. It supports offline embeddings and LLMs, multiple sources including local files, GitHub repos, and websites, streaming responses, conversational memory, and chat export. Users can set up and deploy the app, learn how to use Local RAG, explore the RAG pipeline, check planned features, known bugs and issues, access additional resources, and contribute to the project.

Onllama.Tiny
Onllama.Tiny is a lightweight tool that allows you to easily run LLMs on your computer without the need for a dedicated graphics card. It simplifies the process of running LLMs, making them more accessible. The tool provides a user-friendly interface and streamlines the setup and configuration required to run LLMs on your machine. With Onllama.Tiny, users can quickly set up and start using LLMs for various applications and projects.

ComfyUI-BRIA_AI-RMBG
ComfyUI-BRIA_AI-RMBG is an unofficial implementation of the BRIA Background Removal v1.4 model for ComfyUI. The tool supports batch processing, including video background removal, and introduces a new mask output feature. Users can install the tool using ComfyUI Manager or manually by cloning the repository. The tool includes nodes for automatically loading the Removal v1.4 model and removing backgrounds. Updates include support for batch processing and the addition of a mask output feature.

enterprise-h2ogpte
Enterprise h2oGPTe - GenAI RAG is a repository containing code examples, notebooks, and benchmarks for the enterprise version of h2oGPTe, a powerful AI tool for generating text based on the RAG (Retrieval-Augmented Generation) architecture. The repository provides resources for leveraging h2oGPTe in enterprise settings, including implementation guides, performance evaluations, and best practices. Users can explore various applications of h2oGPTe in natural language processing tasks, such as text generation, content creation, and conversational AI.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud-native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI, and vLLM. BricksLLM aims to provide enterprise-level infrastructure that can power any LLM production use case. Typical use cases include setting LLM usage limits for users on different pricing tiers, tracking LLM usage per user and per organization, blocking or redacting requests containing PII, improving LLM reliability with failovers, retries, and caching, and distributing API keys with rate limits and cost limits for internal development/production use or for students.

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.