
airi
💖 アイリ, ultimate Neuro-sama like LLM powered Live2D/VRM living character life pod, near by you.
Stars: 167

Airi is a VTuber project heavily inspired by Neuro-sama. It is capable of various functions such as playing Minecraft, chatting in Telegram and Discord, audio input from browser and Discord, client side speech recognition, VRM and Live2D model support with animations, and more. The project also includes sub-projects like unspeech, hfup, Drizzle ORM driver for DuckDB WASM, and various other tools. Airi uses models like whisper-large-v3-turbo from Hugging Face and is similar to projects like z-waif, amica, eliza, AI-Waifu-Vtuber, and AIVTuber. The project acknowledges contributions from various sources and implements packages to interact with LLMs and models.
README:
Heavily inspired by Neuro-sama
Unlike other AI-driven open source VTuber projects, アイリ VTuber has been built on Web technologies such as WebGPU, WebAudio, Web Workers, WebAssembly, and WebSocket from day one.
This means that アイリ VTuber can run in modern browsers and on modern devices, including mobile devices (already done with PWA support). This opens up many possibilities for us (the developers) to build and extend アイリ VTuber to the next level, while still leaving users the flexibility to enable features that require TCP connections or other non-Web technologies, such as connecting to a Discord voice channel or playing Minecraft or Factorio with you and your friends.
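As a rough illustration of what relying on these Web technologies implies, the sketch below checks which of them a given runtime exposes. The helper name `detectWebCapabilities` and the `WebCapabilities` shape are hypothetical and not part of the project; only the standard global names (`navigator.gpu`, `AudioContext`, `Worker`, `WebAssembly`, `WebSocket`) are real.

```typescript
// Hypothetical helper for illustration only; not taken from the project.
interface WebCapabilities {
  webgpu: boolean
  webAudio: boolean
  webWorkers: boolean
  webAssembly: boolean
  webSocket: boolean
}

// `scope` defaults to the global object; passing a plain object keeps the check testable.
function detectWebCapabilities(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>,
): WebCapabilities {
  const nav = scope.navigator
  return {
    // WebGPU is exposed as `navigator.gpu` in supporting browsers.
    webgpu: typeof nav === 'object' && nav !== null && 'gpu' in nav,
    webAudio: typeof scope.AudioContext === 'function',
    webWorkers: typeof scope.Worker === 'function',
    webAssembly: typeof scope.WebAssembly === 'object' && scope.WebAssembly !== null,
    webSocket: typeof scope.WebSocket === 'function',
  }
}
```

A caller could use the result to decide, for example, whether to offer in-browser inference or fall back to a remote provider.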
[!NOTE]
We are still in the early stages of development and are looking for talented developers to join us and help make アイリ VTuber a reality.
It's OK if you are not familiar with Vue.js, TypeScript, or the devtools required for this project; you can join us as an artist or designer, or even help us launch our first live stream.
Even if you are a big fan of React, Svelte, or Solid, you are welcome: you can open a sub-directory to add features that you want to see in アイリ VTuber or would like to experiment with.
Fields (and related projects) that we are looking for:
- Live2D modeller
- VRM modeller
- VRChat avatar designer
- Computer Vision
- Reinforcement Learning
- Speech Recognition
- Speech Synthesis
- ONNX Runtime
- Transformers.js
- vLLM
- WebGPU
- Three.js
- WebXR (check out the other project we have under the @moeru-ai organization)
If you are interested, why not introduce yourself here? Would you like to join us in building Airi?
Capable of
- [x] Brain
- [x] Play Minecraft
- [x] Play Factorio (WIP, but PoC and demo available)
- [x] Chat in Telegram
- [x] Chat in Discord
- [ ] Memory
- [x] Pure in-browser database support (DuckDB WASM | pglite)
- [ ] Memory Alaya (WIP)
- [ ] Pure in-browser local (WebGPU) inference
- [x] Ears
- [x] Audio input from browser
- [x] Audio input from Discord
- [x] Client side speech recognition
- [x] Client side talking detection
- [x] Mouth
- [x] ElevenLabs voice synthesis
- [x] Body
- [x] VRM support
- [x] Control VRM model
- [x] VRM model animations
- [x] Auto blink
- [x] Auto look at
- [x] Idle eye movement
- [x] Live2D support
- [x] Control Live2D model
- [x] Live2D model animations
- [x] Auto blink
- [x] Auto look at
- [x] Idle eye movement
pnpm i
pnpm dev
Supports the following LLM API providers (powered by xsai)
- [x] OpenRouter
- [x] vLLM
- [x] SGLang
- [x] Ollama
- [x] Google Gemini
- [x] OpenAI
- [ ] Azure OpenAI API (PR welcome)
- [x] Anthropic Claude
- [ ] AWS Claude (PR welcome)
- [x] DeepSeek
- [x] Qwen
- [x] xAI
- [x] Groq
- [x] Mistral
- [x] Cloudflare Workers AI
- [x] Together.ai
- [x] Fireworks.ai
- [x] Novita
- [x] Zhipu
- [x] SiliconFlow
- [x] Stepfun
- [x] Baichuan
- [x] Minimax
- [x] Moonshot AI
- [x] Tencent Cloud
- [ ] Sparks (PR welcome)
- [ ] Volcano Engine (PR welcome)
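To make the provider list more concrete, here is a minimal sketch of how one request shape can target different providers by swapping the base URL. It assumes the chosen provider exposes an OpenAI-style `chat/completions` endpoint, which the text above does not state. `ProviderConfig` and `buildChatRequest` are hypothetical names, the sketch does not use the xsai API, and the base URL in the usage note is an example to verify against each provider's documentation.

```typescript
// Hypothetical types and helper for illustration; this is not the xsai API.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

interface ProviderConfig {
  baseURL: string // must end with a slash so relative resolution keeps the path
  apiKey?: string // optional: local servers often need no key
}

interface BuiltRequest {
  url: string
  init: { method: string, headers: Record<string, string>, body: string }
}

// Builds (but does not send) a chat completion request for the given provider.
function buildChatRequest(provider: ProviderConfig, model: string, messages: ChatMessage[]): BuiltRequest {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' }
  if (provider.apiKey)
    headers['Authorization'] = `Bearer ${provider.apiKey}`

  return {
    url: new URL('chat/completions', provider.baseURL).toString(),
    init: { method: 'POST', headers, body: JSON.stringify({ model, messages }) },
  }
}
```

The result can be passed to `fetch(request.url, request.init)`. For instance, a local server might be addressed with `{ baseURL: 'http://localhost:11434/v1/' }` and a hosted one with its own base URL plus an `apiKey`.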
- unspeech: Universal endpoint proxy server for /audio/transcriptions and /audio/speech, like LiteLLM but for any ASR and TTS
- hfup: tools to help on deploying, bundling to HuggingFace Spaces
- @proj-airi/drizzle-duckdb-wasm: Drizzle ORM driver for DuckDB WASM
- @proj-airi/duckdb-wasm: Easy to use wrapper for @duckdb/duckdb-wasm
- @proj-airi/lobe-icons: Iconify JSON bundle for amazing AI & LLM icons from lobe-icons, support Tailwind and UnoCSS
- @proj-airi/elevenlabs: TypeScript definitions for ElevenLabs API
- Airi Factorio: Allow Airi to play Factorio
- Factorio RCON API: RESTful API wrapper for Factorio headless server console
- autorio: Factorio automation library
- tstl-plugin-reload-factorio-mod: Reload Factorio mod when developing
- 🥺 SAD: Documentation and notes for self-host and browser running LLMs
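As a sketch of what talking to a proxy for `/audio/speech` might look like, the helper below builds a text-to-speech request. The body fields (`model`, `input`, `voice`) follow the common OpenAI-style speech request shape and are an assumption here, as are the helper name `buildSpeechRequest` and any host or port a caller passes in; check the unspeech documentation for the actual contract.

```typescript
// Hypothetical request shape; field names are assumed, not confirmed by the source.
interface SpeechRequestBody {
  model: string
  input: string // text to synthesize
  voice: string
}

interface SpeechRequest {
  url: string
  init: { method: string, headers: Record<string, string>, body: string }
}

// Builds (but does not send) a POST to `<baseURL>audio/speech`.
// `baseURL` must end with a slash, e.g. a hypothetical 'http://localhost:8080/v1/'.
function buildSpeechRequest(baseURL: string, body: SpeechRequestBody): SpeechRequest {
  return {
    url: new URL('audio/speech', baseURL).toString(),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  }
}
```

Keeping request construction separate from sending makes it easy to point the same client code at a different proxy address.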
%%{ init: { 'flowchart': { 'curve': 'catmullRom' } } }%%
flowchart TD
Core("Core")
Unspeech["unspeech"]
DBDriver["@proj-airi/drizzle-duckdb-wasm"]
MemoryDriver["[WIP] Memory Alaya"]
DB1["@proj-airi/duckdb-wasm"]
ICONS["@proj-airi/lobe-icons"]
UI("@proj-airi/stage-ui")
Stage("Stage")
F_AGENT("Factorio Agent")
F_API["Factorio RCON API"]
F_MOD1["autorio"]
SVRT["@proj-airi/server-runtime"]
MC_AGENT("Minecraft Agent")
XSAI["xsai"]
subgraph Airi
DB1 --> DBDriver --> MemoryDriver --> Memory --> Core
ICONS --> UI --> Stage --> Core
Core --> STT
Core --> SVRT
end
STT --> |Speaking|Unspeech
SVRT --> |Playing Factorio|F_AGENT
SVRT --> |Playing Minecraft|MC_AGENT
subgraph Factorio Agent
F_AGENT --> F_API -..- factorio-server
subgraph factorio-server-wrapper
subgraph factorio-server
F_MOD1
end
end
end
subgraph Minecraft Agent
MC_AGENT --> Mineflayer -..- minecraft-server
subgraph factorio-server-wrapper
subgraph factorio-server
F_MOD1
end
end
end
XSAI --> Core
XSAI --> F_AGENT
XSAI --> MC_AGENT
%%{ init: { 'flowchart': { 'curve': 'catmullRom' } } }%%
flowchart TD
subgraph deploy&bundle
direction LR
HFUP["hfup"]
HF[/"HuggingFace Spaces"\]
HFUP -...- UI -...-> HF
HFUP -...- whisper-webgpu -...-> HF
HFUP -...- moonshine-web -...-> HF
end
- kimjammer/Neuro: A recreation of Neuro-Sama originally created in 7 days. A very complete implementation.
- SugarcaneDefender/z-waif: Great at gaming, autonomous, and prompt engineering
- semperai/amica: Great at VRM, WebXR
- elizaOS/eliza: Great examples and software engineering on how to integrate agents into various systems and APIs
- ardha27/AI-Waifu-Vtuber: Great at Twitch API integrations
- InsanityLabs/AIVTuber: Nice UI and UX
- IRedDragonICY/vixevia
- t41372/Open-LLM-VTuber
- PeterH0323/Streamer-Sales
- https://clips.twitch.tv/WanderingCaringDeerDxCat-Qt55xtiGDSoNmDDr https://www.youtube.com/watch?v=8Giv5mupJNE
- https://clips.twitch.tv/TriangularAthleticBunnySoonerLater-SXpBk1dFso21VcWD
- pixiv/ChatVRM
- josephrocca/ChatVRM-js: A JS conversion/adaptation of parts of the ChatVRM (TypeScript) code for standalone use in OpenCharacters and elsewhere
- The UI design and style were inspired by Cookard, UNBEATABLE, and Sensei! I like you so much!, and artworks of Ayame by Mercedes Bazan with Wish by Mercedes Bazan
- mallorbc/whisper_mic
- xsai: Implements a decent number of packages to interact with LLMs and models, like Vercel AI SDK but much smaller.
Similar Open Source Tools


pro-chat
ProChat is a components library focused on quickly building large language model chat interfaces. It empowers developers to create rich, dynamic, and intuitive chat interfaces with features like automatic chat caching, streamlined conversations, message editing tools, auto-rendered Markdown, and programmatic controls. The tool also includes design evolution plans such as customized dialogue rendering, enhanced request parameters, personalized error handling, expanded documentation, and atomic component design.

kan-gpt
The KAN-GPT repository is a PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling. It provides a model for generating text based on prompts, with a focus on improving performance compared to traditional MLP-GPT models. The repository includes scripts for training the model, downloading datasets, and evaluating model performance. Development tasks include integrating with other libraries, testing, and documentation.

FluidFrames.RIFE
FluidFrames.RIFE is a Windows app powered by RIFE AI to create frame-generated and slow-motion videos. It is written in Python and utilizes external packages such as torch, onnxruntime-directml, customtkinter, OpenCV, moviepy, and Nuitka. The app features an elegant GUI, video frame generation at different speeds, video slow motion, video resizing, multiple GPU support, and compatibility with various video formats. Future versions aim to support different GPU types, enhance the GUI, include audio processing, optimize video processing speed, and introduce new features like saving AI-generated frames and supporting different RIFE AI models.

chatgpt-tarot-divination
ChatGPT Tarot Divination is a tool that offers AI fortune-telling and divination functionalities. Users can download the executable installation package, deploy it using Docker, and run it locally. The tool supports various divination methods such as Tarot cards, birth charts, name analysis, dream interpretation, naming suggestions, and more. It allows customization through setting API base URL and key, and provides a user-friendly interface for easy usage.

painting-droid
Painting Droid is an AI-powered cross-platform painting app inspired by MS Paint, expandable with plugins and open. It utilizes various AI models, from paid providers to self-hosted open-source models, as well as some lightweight ones built into the app. Features include regular painting app features, AI-generated content filling and augmentation, filters and effects, image manipulation, plugin support, and cross-platform compatibility.

llama-assistant
Llama Assistant is a local AI assistant that respects your privacy. It is an AI-powered assistant that can recognize your voice, process natural language, and perform various actions based on your commands. It can help with tasks like summarizing text, rephrasing sentences, answering questions, writing emails, and more. The assistant runs offline on your local machine, ensuring privacy by not sending data to external servers. It supports voice recognition, natural language processing, and customizable UI with adjustable transparency. The project is a work in progress with new features being added regularly.

llama-assistant
Llama Assistant is an AI-powered assistant that helps with daily tasks, such as voice recognition, natural language processing, summarizing text, rephrasing sentences, answering questions, and more. It runs offline on your local machine, ensuring privacy by not sending data to external servers. The project is a work in progress with regular feature additions.

feast
Feast is an open source feature store for machine learning, providing a fast path to manage infrastructure for productionizing analytic data. It allows ML platform teams to make features consistently available, avoid data leakage, and decouple ML from data infrastructure. Feast abstracts feature storage from retrieval, ensuring portability across different model training and serving scenarios.

Qmedia
QMedia is an open-source multimedia AI content search engine designed specifically for content creators. It provides rich information extraction methods for text, image, and short video content. The tool integrates unstructured text, image, and short video information to build a multimodal RAG content Q&A system. Users can efficiently search for image/text and short video materials, analyze content, provide content sources, and generate customized search results based on user interests and needs. QMedia supports local deployment for offline content search and Q&A for private data. The tool offers features like content cards display, multimodal content RAG search, and pure local multimodal models deployment. Users can deploy different types of models locally, manage language models, feature embedding models, image models, and video models. QMedia aims to spark new ideas for content creation and share AI content creation concepts in an open-source manner.

ChatGPT-Next-Web
ChatGPT Next Web is a well-designed cross-platform ChatGPT web UI tool that supports Claude, GPT4, and Gemini Pro models. It allows users to deploy their private ChatGPT applications with ease. The tool offers features like one-click deployment, compact client for Linux/Windows/MacOS, compatibility with self-deployed LLMs, privacy-first approach with local data storage, markdown support, responsive design, fast loading speed, prompt templates, awesome prompts, chat history compression, multilingual support, and more.

RPGMaker_LLM_Translator
This is an offline Japanese translator for RPGMaker games based on Mtool and the Sakura model, capable of providing high-quality offline Japanese translations. It is recommended to use the Sakura-13B-Galgame translation model, and the currently supported versions are Sakura v0.8/v0.9/v0.10pre0.

xtuner
XTuner is an efficient, flexible, and full-featured toolkit for fine-tuning large models. It supports various LLMs (InternLM, Mixtral-8x7B, Llama 2, ChatGLM, Qwen, Baichuan, ...), VLMs (LLaVA), and various training algorithms (QLoRA, LoRA, full-parameter fine-tune). XTuner also provides tools for chatting with pretrained / fine-tuned LLMs and deploying fine-tuned LLMs with any other framework, such as LMDeploy.

GPTSwarm
GPTSwarm is a graph-based framework for LLM-based agents that enables the creation of LLM-based agents from graphs and facilitates the customized and automatic self-organization of agent swarms with self-improvement capabilities. The library includes components for domain-specific operations, graph-related functions, LLM backend selection, memory management, and optimization algorithms to enhance agent performance and swarm efficiency. Users can quickly run predefined swarms or utilize tools like the file analyzer. GPTSwarm supports local LM inference via LM Studio, allowing users to run with a local LLM model. The framework has been accepted by ICML2024 and offers advanced features for experimentation and customization.

rllm
rLLM (relationLLM) is a PyTorch library for Relational Table Learning (RTL) with LLMs. It breaks down state-of-the-art GNNs, LLMs, and TNNs as standardized modules and facilitates novel model building in a 'combine, align, and co-train' way using these modules. The library is LLM-friendly, processes various graphs as multiple tables linked by foreign keys, introduces new relational table datasets, and is supported by students and teachers from Shanghai Jiao Tong University and Tsinghua University.

midjourney-proxy
Midjourney-proxy is a proxy for the Discord channel of MidJourney, enabling API-based calls for AI drawing. It supports Imagine instructions, adding image base64 as a placeholder, Blend and Describe commands, real-time progress tracking, Chinese prompt translation, prompt sensitive word pre-detection, user-token connection to WSS, multi-account configuration, and more. For more advanced features, consider using midjourney-proxy-plus, which includes Shorten, focus shifting, image zooming, local redrawing, nearly all associated button actions, Remix mode, seed value retrieval, account pool persistence, dynamic maintenance, /info and /settings retrieval, account settings configuration, Niji bot robot, InsightFace face replacement robot, and an embedded management dashboard.
For similar tasks


gpt-subtrans
GPT-Subtrans is an open-source subtitle translator that utilizes large language models (LLMs) as translation services. It supports translation between any language pairs that the language model supports. Note that GPT-Subtrans requires an active internet connection, as subtitles are sent to the provider's servers for translation, and their privacy policy applies.

basehub
JavaScript / TypeScript SDK for BaseHub, the first AI-native content hub. Features:
- ✨ Infers types from your BaseHub repository, meaning IDE autocompletion works great.
- 🏎️ No dependency on graphql, meaning your bundle is more lightweight.
- 🌐 Works everywhere `fetch` is supported, meaning you can use it anywhere.

novel
Novel is an open-source Notion-style WYSIWYG editor with AI-powered autocompletions. It allows users to easily create and edit content with the help of AI suggestions. The tool is built on a modern tech stack and supports cross-framework development. Users can deploy their own version of Novel to Vercel with one click and contribute to the project by reporting bugs or making feature enhancements through pull requests.

local-rag
Local RAG is an offline, open-source tool that allows users to ingest files for retrieval augmented generation (RAG) using large language models (LLMs) without relying on third parties or exposing sensitive data. It supports offline embeddings and LLMs, multiple sources including local files, GitHub repos, and websites, streaming responses, conversational memory, and chat export. Users can set up and deploy the app, learn how to use Local RAG, explore the RAG pipeline, check planned features, known bugs and issues, access additional resources, and contribute to the project.

Onllama.Tiny
Onllama.Tiny is a lightweight tool that allows you to easily run LLM on your computer without the need for a dedicated graphics card. It simplifies the process of running LLM, making it more accessible for users. The tool provides a user-friendly interface and streamlines the setup and configuration required to run LLM on your machine. With Onllama.Tiny, users can quickly set up and start using LLM for various applications and projects.

ComfyUI-BRIA_AI-RMBG
ComfyUI-BRIA_AI-RMBG is an unofficial implementation of the BRIA Background Removal v1.4 model for ComfyUI. The tool supports batch processing, including video background removal, and introduces a new mask output feature. Users can install the tool using ComfyUI Manager or manually by cloning the repository. The tool includes nodes for automatically loading the Removal v1.4 model and removing backgrounds. Updates include support for batch processing and the addition of a mask output feature.

enterprise-h2ogpte
Enterprise h2oGPTe - GenAI RAG is a repository containing code examples, notebooks, and benchmarks for the enterprise version of h2oGPTe, a powerful AI tool for generating text based on the RAG (Retrieval-Augmented Generation) architecture. The repository provides resources for leveraging h2oGPTe in enterprise settings, including implementation guides, performance evaluations, and best practices. Users can explore various applications of h2oGPTe in natural language processing tasks, such as text generation, content creation, and conversational AI.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM:
- Set LLM usage limits for users on different pricing tiers
- Track LLM usage on a per user and per organization basis
- Block or redact requests containing PIIs
- Improve LLM reliability with failovers, retries and caching
- Distribute API keys with rate limits and cost limits for internal development/production use cases
- Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.