
PoPo
Pose and animate MMD models with an LLM
Stars: 187

PoPo is an AI-powered MMD pose generator that transforms natural language descriptions into expressive 3D character animations. It uses MPL (MMD Pose Language) to generate anatomically correct poses, providing real-time rendering and precise pose control. The tool fine-tunes LLMs with MPL, resulting in better training convergence, consistent outputs, anatomically correct poses, and debuggable results. The technology stack includes Next.js, Babylon.js, MPL, fine-tuned GPT-4o-mini, and Vercel for deployment. By training on semantic MPL instead of raw quaternions, PoPo enables the AI to understand the 'grammar' of human movement.
README:
AI-powered MMD pose generator - Transform natural language into expressive 3D character animations
PoPo uses fine-tuned LLMs to generate MMD character poses from natural language descriptions. Instead of training on raw quaternions, we use MPL (MMD Pose Language) - a semantic, MMD-specific pose description language that helps AI understand and generate anatomically correct poses.
Live demo: popo.love
Demo model: 深空之眼 三相·梵天【无间玩伴】
- Natural Language Input: "wave right hand with big laugh, inviting me for dinner"
- LLM-Generated Poses: Fine-tuned models output semantic MPL code for precise pose control
- Real-time Rendering: Instant pose creation with smooth bone animations
- MMD-Specific: Built for anime characters with proper bone constraints and physics
PoPo fine-tunes LLMs on MPL, a semantic pose description language designed specifically for MMD. This approach provides:
- Better training convergence - Structured, human-readable pose descriptions
- Consistent outputs - Same prompt generates reliable pose code
- Anatomically correct - Built-in constraints prevent impossible movements
- Debuggable results - Generated MPL code can be read and modified
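To illustrate why readable pose code is easier to debug and constrain, here is a minimal Python sketch that parses MPL-like statements and clamps their angles. The grammar (`<bone> <action> <direction> <degrees>;`) is inferred from the single example in this README, and the `MAX_DEGREES` limits are hypothetical; neither is the official MPL specification.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumed statement shape, inferred from one README example:
#   <bone> <action> <direction> <degrees>
STATEMENT = re.compile(r"^(\w+)\s+(\w+)\s+(\w+)\s+(-?\d+(?:\.\d+)?)$")

# Hypothetical per-bone angle limits in degrees (not taken from MPL itself).
MAX_DEGREES = {"arm_l": 180.0, "arm_r": 180.0}


@dataclass(frozen=True)
class PoseStatement:
    bone: str
    action: str
    direction: str
    degrees: float


def parse_mpl(script: str) -> list[PoseStatement]:
    """Split a script on ';' and parse each non-empty statement."""
    statements = []
    for raw in script.split(";"):
        text = raw.strip()
        if not text:
            continue
        match = STATEMENT.match(text)
        if match is None:
            raise ValueError(f"cannot parse statement: {text!r}")
        bone, action, direction, degrees = match.groups()
        statements.append(PoseStatement(bone, action, direction, float(degrees)))
    return statements


def clamp(statement: PoseStatement) -> PoseStatement:
    """Limit the angle to the assumed range; unknown bones pass through unchanged."""
    limit = MAX_DEGREES.get(statement.bone)
    if limit is None:
        return statement
    degrees = max(-limit, min(limit, statement.degrees))
    return PoseStatement(statement.bone, statement.action, statement.direction, degrees)
```

Calling `parse_mpl("arm_l bend forward 40;arm_r bend forward 40;")` returns two `PoseStatement` values that can be inspected, edited, or clamped before they reach the renderer.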
Example training record (chat format):
{
  "messages": [
    { "role": "system", "content": "Generate MMD Pose Language (MPL) script from description." },
    { "role": "user", "content": "Description: arms down" },
    { "role": "assistant", "content": "arm_l bend forward 40;arm_r bend forward 40;" }
  ]
}
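Records like the one above can be generated from (description, MPL) pairs. The following Python sketch assumes one JSON object per line (JSONL), which is a common layout for chat fine-tuning data; the helper names `make_record` and `to_jsonl` are hypothetical and not part of PoPo.

```python
import json

# System prompt copied from the README example.
SYSTEM_PROMPT = "Generate MMD Pose Language (MPL) script from description."


def make_record(description: str, mpl: str) -> dict:
    """Build one chat-format record mirroring the README example."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Description: {description}"},
            {"role": "assistant", "content": mpl},
        ]
    }


def to_jsonl(pairs) -> str:
    """Serialize (description, mpl) pairs as one JSON object per line."""
    return "\n".join(
        json.dumps(make_record(description, mpl), ensure_ascii=False)
        for description, mpl in pairs
    )
```

For example, `to_jsonl([("arms down", "arm_l bend forward 40;arm_r bend forward 40;")])` produces a single line equivalent to the record shown above.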
- Frontend: Next.js, shadcn/ui, TypeScript
- 3D Engine: Babylon.js with babylon-mmd
- Pose Language: MPL (MMD Pose Language) for semantic pose description
- AI Model: Fine-tuned GPT-4o-mini for natural language → MPL generation
- Deployment: Vercel
- MiKaPo: Camera → MediaPipe → MMD bones (real-time capture)
- PoPo: Text → Fine-tuned LLM → MPL code → MMD bones (AI-generated poses)
By using semantic MPL as the training target instead of raw quaternions, we achieve better consistency and allow the AI to learn the "grammar" of human movement.
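To make the contrast with raw quaternions concrete, the Python sketch below turns one MPL-like statement into the unit quaternion a renderer would eventually need. The direction-to-axis table `AXES` is a hypothetical mapping chosen for illustration; PoPo's actual conversion from MPL to MMD bone rotations is not described here.

```python
from __future__ import annotations

import math

# Hypothetical mapping from a direction word to a unit rotation axis (x, y, z).
AXES = {
    "forward": (1.0, 0.0, 0.0),
    "sideways": (0.0, 0.0, 1.0),
    "twist": (0.0, 1.0, 0.0),
}


def to_quaternion(statement: str) -> tuple[str, tuple[float, float, float, float]]:
    """Convert '<bone> <action> <direction> <degrees>;' to (bone, (x, y, z, w))."""
    bone, _action, direction, degrees = statement.strip().rstrip(";").split()
    ax, ay, az = AXES[direction]
    # Axis-angle to quaternion: q = (axis * sin(theta / 2), cos(theta / 2)).
    half = math.radians(float(degrees)) / 2.0
    s = math.sin(half)
    return bone, (ax * s, ay * s, az * s, math.cos(half))
```

Under these assumptions, `to_quaternion("arm_l bend forward 40;")` yields roughly `(0.342, 0.0, 0.0, 0.940)` for `arm_l`. The text form states the intent directly, while the four floats do not, which is the readability gap the semantic training target is meant to close.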
GPL-3.0 License - see LICENSE for details.
Alternative AI tools for PoPo
Similar Open Source Tools

rigging
Rigging is a lightweight LLM framework designed to simplify the usage of language models in production code. It offers structured Pydantic models for text output, supports various models like LiteLLM and transformers, and provides features such as defining prompts as python functions, simple tool use, storing models as connection strings, async batching for large scale generation, and modern Python support with type hints and async capabilities. Rigging is developed by dreadnode and is suitable for tasks like building chat pipelines, running completions, tracking behavior with tracing, playing with generation parameters, and scaling up with iterating and batching.

tensorzero
TensorZero is an open-source platform that helps LLM applications graduate from API wrappers into defensible AI products. It enables a data & learning flywheel for LLMs by unifying inference, observability, optimization, and experimentation. The platform includes a high-performance model gateway, structured schema-based inference, observability, experimentation, and data warehouse for analytics. TensorZero Recipes optimize prompts and models, and the platform supports experimentation features and GitOps orchestration for deployment.

OpenManus-RL
OpenManus-RL is an open-source initiative focused on enhancing reasoning and decision-making capabilities of large language models (LLMs) through advanced reinforcement learning (RL)-based agent tuning. The project explores novel algorithmic structures, diverse reasoning paradigms, sophisticated reward strategies, and extensive benchmark environments. It aims to push the boundaries of agent reasoning and tool integration by integrating insights from leading RL tuning frameworks and continuously updating progress in a dynamic, live-streaming fashion.

chunkhound
ChunkHound is a modern tool for transforming your codebase into a searchable knowledge base for AI assistants. It utilizes semantic search via the cAST algorithm and regex search, integrating with AI assistants through the Model Context Protocol (MCP). With features like cAST Algorithm, Multi-Hop Semantic Search, Regex search, and support for 22 languages, ChunkHound offers a local-first approach to code analysis and discovery. It provides intelligent code discovery, universal language support, and real-time indexing capabilities, making it a powerful tool for developers looking to enhance their coding experience.

RealtimeSTT_LLM_TTS
RealtimeSTT is an easy-to-use, low-latency speech-to-text library for realtime applications. It listens to the microphone and transcribes voice into text, making it ideal for voice assistants and applications requiring fast and precise speech-to-text conversion. The library utilizes Voice Activity Detection, Realtime Transcription, and Wake Word Activation features. It supports GPU-accelerated transcription using PyTorch with CUDA support. RealtimeSTT offers various customization options for different parameters to enhance user experience and performance. The library is designed to provide a seamless experience for developers integrating speech-to-text functionality into their applications.

Dive
Dive is an open-source MCP Host Desktop Application that seamlessly integrates with any LLM supporting function calling capabilities. It offers universal LLM support, cross-platform compatibility, Model Context Protocol support for AI agent integration, OAP cloud integration, dual architecture for optimal performance, multi-language support, advanced API management, granular tool control, custom instructions, auto-update mechanism, and more. Dive provides a user-friendly interface for managing multiple AI models and tools, with recent updates introducing major architecture changes, new features, improvements, and platform availability. Users can easily download and install Dive on Windows, macOS, and Linux, and set up MCP tools through local servers or OAP cloud services.

MM-RLHF
MM-RLHF is a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. It includes a high-quality MLLM alignment dataset, a Critique-Based MLLM reward model, a novel alignment algorithm MM-DPO, and benchmarks for reward models and multimodal safety. The dataset covers image understanding, video understanding, and safety-related tasks with model-generated responses and human-annotated scores. The reward model generates critiques of candidate texts before assigning scores for enhanced interpretability. MM-DPO is an alignment algorithm that achieves performance gains with simple adjustments to the DPO framework. The project enables consistent performance improvements across 10 dimensions and 27 benchmarks for open-source MLLMs.

modern_ai_for_beginners
This repository provides a comprehensive guide to modern AI for beginners, covering both theoretical foundations and practical implementation. It emphasizes the importance of understanding both the mathematical principles and the code implementation of AI models. The repository includes resources on PyTorch, deep learning fundamentals, mathematical foundations, transformer-based LLMs, diffusion models, software engineering, and full-stack development. It also features tutorials on natural language processing with transformers, reinforcement learning, and practical deep learning for coders.

payload-ai
The Payload AI Plugin is an advanced extension that integrates modern AI capabilities into your Payload CMS, streamlining content creation and management. It offers features like text generation, voice and image generation, field-level prompt customization, prompt editor, document analyzer, fact checking, automated content workflows, internationalization support, editor AI suggestions, and AI chat support. Users can personalize and configure the plugin by setting environment variables. The plugin is actively developed and tested with Payload version v3.2.1, with regular updates expected.

llms-interview-questions
This repository contains a comprehensive collection of 63 must-know Large Language Models (LLMs) interview questions. It covers topics such as the architecture of LLMs, transformer models, attention mechanisms, training processes, encoder-decoder frameworks, differences between LLMs and traditional statistical language models, handling context and long-term dependencies, transformers for parallelization, applications of LLMs, sentiment analysis, language translation, conversation AI, chatbots, and more. The readme provides detailed explanations, code examples, and insights into utilizing LLMs for various tasks.

PromptX
PromptX is a leading AI agent context platform that revolutionizes interaction design, enabling AI agents to become industry experts. It offers core capabilities such as an AI role creation platform, intelligent tool development platform, and cognitive memory system. PromptX allows users to easily discover experts, summon them for assistance, and engage in professional dialogues through natural conversations. The platform's core philosophy emphasizes treating AI as a person, enabling users to communicate naturally without the need for complex commands. With Nuwa Creation Workshop, users can design custom AI roles using meta-prompt technology, transforming abstract needs into concrete executable AI expert roles in just minutes.

Apt
Apt. is a free and open-source AI productivity tool designed to enhance user productivity while ensuring privacy and data security. It offers efficient AI solutions such as built-in ChatGPT, batch image and video processing, and more. Key features include free and open-source code, privacy protection through local deployment, offline operation, no installation needed, and multi-language support. Integrated AI models cover ChatGPT for intelligent conversations, image processing features like super-resolution and color restoration, and video processing capabilities including super-resolution and frame interpolation. Future plans include integrating more AI models. The tool provides user guides and technical support via email and various platforms, with a user-friendly interface for easy navigation.

persistent-ai-memory
Persistent AI Memory System is a comprehensive tool that offers persistent, searchable storage for AI assistants. It includes features like conversation tracking, MCP tool call logging, and intelligent scheduling. The system supports multiple databases, provides enhanced memory management, and offers various tools for memory operations, schedule management, and system health checks. It also integrates with various platforms like LM Studio, VS Code, Koboldcpp, Ollama, and more. The system is designed to be modular, platform-agnostic, and scalable, allowing users to handle large conversation histories efficiently.

Lidar_AI_Solution
Lidar AI Solution is a highly optimized repository for self-driving 3D lidar, providing solutions for sparse convolution, BEVFusion, CenterPoint, OSD, and Conversion. It includes CUDA and TensorRT implementations for various tasks such as 3D sparse convolution, BEVFusion, CenterPoint, PointPillars, V2XFusion, cuOSD, cuPCL, and YUV to RGB conversion. The repository offers easy-to-use solutions, high accuracy, low memory usage, and quantization options for different tasks related to self-driving technology.

JamAIBase
JamAI Base is an open-source platform integrating SQLite and LanceDB databases with managed memory and RAG capabilities. It offers built-in LLM, vector embeddings, and reranker orchestration accessible through a spreadsheet-like UI and REST API. Users can transform static tables into dynamic entities, facilitate real-time interactions, manage structured data, and simplify chatbot development. The tool focuses on ease of use, scalability, flexibility, declarative paradigm, and innovative RAG techniques, making complex data operations accessible to users with varying technical expertise.
For similar tasks

lunary
Lunary is an open-source observability and prompt platform for Large Language Models (LLMs). It provides a suite of features to help AI developers take their applications into production, including analytics, monitoring, prompt templates, fine-tuning dataset creation, chat and feedback tracking, and evaluations. Lunary is designed to be usable with any model, not just OpenAI, and is easy to integrate and self-host.

VideoTuna
VideoTuna is a codebase for text-to-video applications that integrates multiple AI video generation models for text-to-video, image-to-video, and text-to-image generation. It provides comprehensive pipelines in video generation, including pre-training, continuous training, post-training, and fine-tuning. The models in VideoTuna include U-Net and DiT architectures for visual generation tasks, with upcoming releases of a new 3D video VAE and a controllable facial video generation model.
For similar jobs

Godot4ThirdPersonCombatPrototype
Godot4ThirdPersonCombatPrototype is a base project for third person combat, featuring player movement and camera controls with lock-on functionality. It includes setups for models, animations, AI behavior, state machines, audio, and custom resources. The project aims to provide a foundation for developers to create third-person combat mechanics in their games.

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use case. Here are some use cases for BricksLLM:
- Set LLM usage limits for users on different pricing tiers
- Track LLM usage on a per user and per organization basis
- Block or redact requests containing PII
- Improve LLM reliability with failovers, retries and caching
- Distribute API keys with rate limits and cost limits for internal development/production use cases
- Distribute API keys with rate limits and cost limits for students