ai-game-development-tools
Here we will keep track of the latest AI Game Development Tools, including LLM, Agent, Code, Writer, Image, Texture, Shader, 3D Model, Animation, Video, Audio, Music, Singing Voice and Analytics. 🔥
Stars: 312
README:
- Tool (AI LLM)
- Game (Agent)
- Code
- Writer
- Image
- Texture
- Shader
- 3D Model
- Avatar
- Animation
- Visual
- Video
- Audio
- Music
- Singing Voice
- Speech
- Analytics
- Video Tool
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AgentGPT | 🤖 Assemble, configure, and deploy autonomous AI Agents in your browser. | | | Tool |
AICommand | ChatGPT integration with Unity Editor. | | Unity | Tool |
AIOS | LLM Agent Operating System. | | | Tool |
Assistant CLI | A comfortable CLI tool to use ChatGPT service🔥 | | | Tool |
Auto-GPT | An experimental open-source attempt to make GPT-4 fully autonomous. | | | Tool |
BabyAGI | This Python script is an example of an AI-powered task management system. | | | Tool |
👶🤖🖥️ BabyAGI UI | BabyAGI UI is designed to make it easier to run and develop with babyagi in a web app, like ChatGPT. | | | Tool |
baichuan-7B | A large-scale 7B pretraining language model developed by Baichuan. | | | Tool |
Baichuan-13B | A 13B large language model developed by Baichuan Intelligent Technology. | | | Tool |
Baichuan 2 | A series of large language models developed by Baichuan Intelligent Technology. | | | Tool |
Bisheng | Bisheng is an open LLM devops platform for next generation AI applications. | | | Tool |
Character-LLM | A Trainable Agent for Role-Playing. | arXiv | | Tool |
ChatDev | Communicative Agents for Software Development. | arXiv | | Tool |
ChatGPT-API-unity | Binds ChatGPT chat completion API to pure C# on Unity. | | Unity | Tool |
ChatGPTForUnity | ChatGPT for Unity. | | Unity | Tool |
ChatRWKV | ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source. | | | Tool |
ChatYuan | Large Language Model for Dialogue in Chinese and English. | | | Tool |
Chinese-LLaMA-Alpaca-3 | Chinese Llama-3 LLMs developed from Meta Llama 3. | | | Tool |
Chrome-GPT | An AutoGPT agent that controls Chrome on your desktop. | | | Tool |
CogVLM | CogVLM, a powerful open-source visual language foundation model. | | | Tool |
CoreNet | A library for training deep neural networks. | | | Tool |
DBRX | DBRX is a large language model trained by Databricks. | | | Tool |
DemoGPT | Auto Gen-AI App Generator with the Power of Llama 2 | | | Tool |
Design2Code | Automating Front-End Engineering | | | Tool |
Devika | Devika is an Agentic AI Software Engineer. | | | Tool |
Devon | An open-source pair programmer. | | | Tool |
Dora | Generating powerful websites, one prompt at a time. | | | Tool |
Flowise | Drag & drop UI to build your customized LLM flow using LangchainJS. | | | Tool |
Gemini | Gemini is built from the ground up for multimodality, reasoning seamlessly across text, images, video, audio, and code. | | | Tool |
Gemma | Gemma is a family of lightweight, state-of-the-art open models built from research and technology used to create Google Gemini models. | | | Tool |
gemma.cpp | A lightweight, standalone C++ inference engine for Google's Gemma models. | | | Tool |
GPT4All | A chatbot trained on a massive collection of clean assistant data including code, stories and dialogue. | | | Tool |
GPT-4o | GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction. It accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. | | | Tool |
GPTScript | Develop LLM Apps in Natural Language. | | | Tool |
Grok-1 | The weights and architecture of our 314 billion parameter Mixture-of-Experts model, Grok-1. | | | Tool |
HuggingChat | Making the community's best AI chat models available to everyone. | | | Tool |
Hugging Face API Unity Integration | This Unity package provides an easy-to-use integration for the Hugging Face Inference API, allowing developers to access and use Hugging Face AI models within their Unity projects. | | Unity | Tool |
ImageBind | ImageBind: One Embedding Space to Bind Them All. | arXiv | | Tool |
InteractML-Unity | InteractML, an Interactive Machine Learning Visual Scripting framework for Unity3D. | | Unity | Tool |
InteractML-Unreal Engine | Bringing Machine Learning to Unreal Engine. | | Unreal Engine | Tool |
InternLM | InternLM has open-sourced a 7 billion parameter base model, a chat model tailored for practical scenarios and the training system. | arXiv | | Tool |
InternLM-XComposer | InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension. | arXiv | | Tool |
Jan | Bring AI to your Desktop. | | | Tool |
Lamini | Lamini allows any engineering team to outperform general purpose LLMs through RLHF and fine-tuning on their own data. | | | Tool |
LaMini-LM | LaMini-LM is a collection of small-sized, efficient language models distilled from ChatGPT and trained on a large-scale dataset of 2.58M instructions. | | | Tool |
LangChain | LangChain is a framework for developing applications powered by language models. | | | Tool |
LangFlow | ⛓️ LangFlow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows. | | | Tool |
LaVague | Automate automation with Large Action Model framework. | | | Tool |
Lemur | Open Foundation Models for Language Agents. | | | Tool |
Lepton AI | A Pythonic framework to simplify AI service building. | | | Tool |
Lit-LLaMA | Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. | | | Tool |
llama2-webui | Run Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). | | | Tool |
Llama 3 | The official Meta Llama 3 GitHub site. | | | Tool |
LLaSM | Large Language and Speech Model. | | | Tool |
LLM Answer Engine | Build a Perplexity-Inspired Answer Engine Using Next.js, Groq, Mixtral, Langchain, OpenAI, Brave & Serper. | | | Tool |
llm.c | LLM training in simple, raw C/CUDA. | | | Tool |
LLocalSearch | LLocalSearch is a completely locally running search engine using LLM Agents. | | | Tool |
LogicGamesSolver | A Python tool to solve logic games with AI, Deep Learning and Computer Vision. | | | Tool |
Large World Model (LWM) | Large World Model (LWM) is a general-purpose large-context multimodal autoregressive model. | | | Tool |
Lumina-T2X | Lumina-T2X is a unified framework for Text to Any Modality Generation. | | | Tool |
MetaGPT | The Multi-Agent Framework | | | Tool |
MiniCPM-2B | An end-side LLM that outperforms Llama2-13B. | | | Tool |
MiniGPT-4 | Enhancing Vision-language Understanding with Advanced Large Language Models. | | | Tool |
MiniGPT-5 | Interleaved Vision-and-Language Generation via Generative Vokens. | | | Tool |
Mixtral 8x7B | A high quality Sparse Mixture-of-Experts. | | | Tool |
Mistral 7B | The best 7B model to date, Apache 2.0. | | | Tool |
Mistral Large | Mistral Large is a new cutting-edge text generation model. It reaches top-tier reasoning capabilities. | | | Tool |
MLC LLM | Enable everyone to develop, optimize and deploy AI models natively on everyone's devices. | | | Tool |
MobiLlama | Towards Accurate and Lightweight Fully Transparent GPT. | | | Tool |
MoE-LLaVA | Mixture of Experts for Large Vision-Language Models. | arXiv | | Tool |
MOSS | An open-source tool-augmented conversational language model from Fudan University. | | | Tool |
mPLUG-Owl🦉 | Modularization Empowers Large Language Models with Multimodality. | | | Tool |
Nemotron-4 | A 15-billion-parameter large multilingual language model trained on 8 trillion text tokens. | | | Tool |
NExT-GPT | Any-to-Any Multimodal Large Language Model. | | | Tool |
OLMo | Open Language Model | | | Tool |
OmniLMM | Large multi-modal models for strong performance and efficient deployment. | | | Tool |
OneLLM | One Framework to Align All Modalities with Language. | | | Tool |
Open-Assistant | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. | | | Tool |
OpenDevin | An autonomous AI software engineer. | | | Tool |
Orion-14B | Orion-14B is a family of models that includes a 14B foundation LLM and a series of models. | | | Tool |
Panda | An open-source large language model from overseas Chinese developers, built by continued pre-training of Llama-7B, -13B, -33B, and -65B on Chinese-domain data. | | | Tool |
Perplexica | An AI-powered search engine. | | | Tool |
Pi | AI chatbot designed for personal assistance and emotional support. | | | Tool |
Qwen1.5 | Qwen1.5 is the improved version of Qwen. | | | Tool |
Qwen-7B | The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud. | | | Tool |
RepoAgent | RepoAgent is an Open-Source project driven by Large Language Models (LLMs) that aims to provide an intelligent way to document projects. | | | Tool |
Sanity AI Engine | Sanity AI Engine for the Unity Game Development Tool. | | Unity | Tool |
SearchGPT | 🌳 Connecting ChatGPT with the Internet | | | Tool |
ShareGPT4V | Improving Large Multi-Modal Models with Better Captions. | | | Tool |
Skywork | Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. | | | Tool |
StableLM | Stability AI Language Models. | | | Tool |
Stanford Alpaca | An Instruction-following LLaMA Model. | | | Tool |
Text generation web UI | A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, OPT, and GALACTICA. | | | Tool |
TinyChatEngine | On-Device LLM Inference Library. | | | Tool |
ToolBench | An open platform for training, serving, and evaluating large language models for tool learning. | | | Tool |
Unity ChatGPT | Unity ChatGPT Experiments. | | Unity | Tool |
Unity OpenAI-API Integration | Integrate OpenAI GPT-3 language model and ChatGPT API into a Unity project. | | Unity | Tool |
Unreal Engine 5 Llama LoRA | A proof-of-concept project that showcases the potential for using small, locally trainable LLMs to create next-generation documentation tools. | | Unreal Engine | Tool |
UnrealGPT | A collection of Unreal Engine 5 Editor Utility widgets powered by GPT3/4. | | Unreal Engine | Tool |
Video-LLaVA | Learning United Visual Representation by Alignment Before Projection. | | | Tool |
WebGPT | Run GPT model on the browser with WebGPU. | | | Tool |
Web3-GPT | Deploy smart contracts with AI | | | Tool |
WordGPT | 🤖 Bring the power of ChatGPT to Microsoft Word | | | Tool |
XAgent | An Autonomous LLM Agent for Complex Task Solving. | | | Tool |
Yi | A series of large language models trained from scratch by developers. | | | Tool |
01 Project | The open-source language model computer. | | | Tool |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AgentBench | A Comprehensive Benchmark to Evaluate LLMs as Agents. | arXiv | | Agent |
Agent Group Chat | An Interactive Group Chat Simulacra For Better Eliciting Collective Emergent Behavior. | arXiv | | Agent |
AgentScope | Start building LLM-empowered multi-agent applications in an easier way. | arXiv | | Agent |
AgentSims | An Open-Source Sandbox for Large Language Model Evaluation. | | | Agent |
AI Town | AI Town is a virtual town where AI characters live, chat and socialize. | | | Agent |
anime.gf | Local & Open Source Alternative to CharacterAI. | | | Game |
AutoAgents | A Framework for Automatic Agent Generation. | | | Agent |
AutoGen | Enable Next-Gen Large Language Model Applications. | | | Agent |
behaviac | Behaviac is a framework for game AI development. | | | Framework |
Biomes | Biomes is an open source sandbox MMORPG built for the web using web technologies such as Next.js, Typescript, React and WebAssembly. | | | Game |
Byzer-Agent | Easy, fast, and distributed agent framework for everyone. | | | Agent |
Cat Town | A C(h)atGPT-powered simulation with cats. | | | Agent |
CharacterGLM | Customizing Chinese Conversational AI Characters with Large Language Models. | arXiv | | Agent |
ChatDev | Communicative Agents for Software Development. | arXiv | | Agent |
CogAgent | CogAgent is an open-source visual language model improved based on CogVLM. | | | Agent |
Cradle | Towards General Computer Control. | | | Agent |
crewAI | Framework for orchestrating role-playing, autonomous AI agents. | | | Agent |
Dify | Dify is an open-source LLM app building platform. | | | Agent |
Digital Life Project | Autonomous 3D Characters with Social Intelligence. | arXiv | | Agent |
everything-ai | Your fully proficient, AI-powered and local chatbot assistant🤖. | | | Agent |
fabric | fabric is an open-source framework for augmenting humans using AI. | | | Agent |
FastGPT | FastGPT is a knowledge-based platform built on the LLM. | | | Agent |
fastRAG | Efficient Retrieval Augmentation and Generation Framework. | | | Agent |
GameAISDK | Image-based game AI automation framework. | | | Framework |
Generative Agents | Interactive Simulacra of Human Behavior. | arXiv | | Agent |
Genie | Generative Interactive Environments. | | | Game |
gigax | Runtime, LLM-powered NPCs. | | | Game |
Interactive LLM Powered NPCs | Interactive LLM Powered NPCs is an open-source project that completely transforms your interaction with non-player characters (NPCs) in any game! | | | Game |
KwaiAgents | A generalized information-seeking agent system with Large Language Models (LLMs). | arXiv | | Agent |
LangChain | Get your LLM application from prototype to production. | | | Agent |
Langflow | Langflow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows. | | | Agent |
LARP | Language-Agent Role Play for open-world games. | | | Agent |
LlamaIndex | LlamaIndex is a data framework for your LLM application. | | | Agent |
Moonlander.ai | Start building 3D games without any coding using generative AI. | | | Framework |
MuG Diffusion | MuG Diffusion is a charting AI for rhythm games based on Stable Diffusion (one of the most powerful AIGC models) with a large modification to incorporate audio waves. | | | Game |
OpenAgents | An Open Platform for Language Agents in the Wild. | | | Agent |
Opus | An AI app that turns text into a video game. | | | Game |
Qwen-Agent | Qwen-Agent is a framework for developing LLM applications based on the instruction following, tool usage, planning, and memory capabilities of Qwen. | | | Agent |
Ragas | Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. | | | Agent |
SIMA | A generalist AI agent for 3D virtual environments. | | | Agent |
StoryGames.ai | AI for Dreamers Make Games. | | | Game |
SWE-agent | Agent Computer Interfaces Enable Software Engineering Language Models. | | | Agent |
Video2Game | Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video. | arXiv | | Game |
V-IRL | Grounding Virtual Intelligence in Real Life. | arXiv | | Agent |
XAgent | An Autonomous LLM Agent for Complex Task Solving. | | | Agent |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AI Code Translator | Use AI to translate code from one language to another. | | | Code |
aiXcoder-7B | aiXcoder-7B Code Large Language Model. | | | Code |
bloop | bloop is a fast code search engine written in Rust. | | | Code |
Chapyter | ChatGPT Code Interpreter in Jupyter Notebooks. | | | Code |
CodeGeeX | An Open Multilingual Code Generation Model. | arXiv | | Code |
CodeGeeX2 | A More Powerful Multilingual Code Generation Model. | | | Code |
CodeGen | CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. | | | Code |
CodeGen2 | CodeGen2 models for program synthesis. | | | Code |
Code Llama | Code Llama is a family of large language models for code based on Llama 2. | | | Code |
CodeTF | One-stop Transformer Library for State-of-the-art Code LLM. | | | Code |
CodeT5 | Open Code LLMs for Code Understanding and Generation. | | | Code |
Cursor | Write, edit, and chat about your code with GPT-4 in a new type of editor. | | | Code |
OpenAI Codex | OpenAI Codex is a descendant of GPT-3. | | | Code |
PandasAI | Pandas AI is a Python library that integrates generative artificial intelligence capabilities into Pandas, making dataframes conversational. | | | Code |
RobloxScripterAI | RobloxScripterAI is an AI-powered code generation tool for Roblox. | | Roblox | Code |
Scikit-LLM | Seamlessly integrate powerful language models like ChatGPT into scikit-learn for enhanced text analysis tasks. | | | Code |
SoTaNa | The Open-Source Software Development Assistant. | | | Code |
Stable Code 3B | Coding on the Edge. | | | Code |
StarCoder | 💫 StarCoder is a language model (LM) trained on source code and natural language text. | | | Code |
StarCoder 2 | StarCoder2 is a family of code generation models (3B, 7B, and 15B), trained on 600+ programming languages from The Stack v2 and some natural language text such as Wikipedia, Arxiv, and GitHub issues. | | | Code |
UnityGen AI | UnityGen AI is an AI-powered code generation plugin for Unity. | | Unity | Code |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AI-Writer | AI writes novels, generates fantasy and romance web articles, etc. Chinese pre-trained generative model. | | | Writer |
Notebook.ai | Notebook.ai is a set of tools for writers, game designers, and roleplayers to create magnificent universes – and everything within them. | | | Writer |
Novel | Notion-style WYSIWYG editor with AI-powered autocompletions. | | | Writer |
NovelAI | Driven by AI, painlessly construct unique stories, thrilling tales, seductive romances, or just fool around. | | | Writer |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AnyDoor | Zero-shot Object-level Image Customization. | arXiv | | Image |
AnyText | Multilingual Visual Text Generation And Editing. | arXiv | | Image |
Blender-ControlNet | Using ControlNet right in Blender. | | Blender | Image |
BriVL | Bridging Vision and Language Model. | arXiv | | Image |
CLIPasso | A method for converting an image of an object to a sketch, allowing for varying levels of abstraction. | arXiv | | Image |
ClipDrop | Create stunning visuals in seconds. | | | Image |
ComfyUI | A powerful and modular stable diffusion GUI with a graph/nodes interface. | | | Image |
ConceptLab | Creative Generation using Diffusion Prior Constraints. | arXiv | | Image |
ControlNet | ControlNet is a neural network structure to control diffusion models by adding extra conditions. | arXiv | | Image |
DALL·E 2 | DALL·E 2 is an AI system that can create realistic images and art from a description in natural language. | | | Image |
Dashtoon Studio | Dashtoon Studio is an AI powered comic creation platform. | | | Comic |
DeepAI | DeepAI offers a suite of tools that use AI to enhance your creativity. | | | Image |
DeepFloyd IF | IF by DeepFloyd Lab at StabilityAI. | | | Image |
Depth map library and poser | Depth map library for use with the Control Net extension for Automatic1111/stable-diffusion-webui. | | | Image |
Diffuse to Choose | Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All. | arXiv | | Image |
Disco Diffusion | A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations. | | | Image |
DragGAN | Interactive Point-based Manipulation on the Generative Image Manifold. | arXiv | | Image |
Draw Things | AI-assisted image generation in Your Pocket. | | | Image |
DWPose | Effective Whole-body Pose Estimation with Two-stages Distillation. | arXiv | | Image |
EasyPhoto | Your Smart AI Photo Generator. | | | Image |
Follow-Your-Click | Open-domain Regional Image Animation via Short Prompts. | arXiv | | Image |
Fooocus | Focus on prompting and generating. | | | Image |
GIFfusion | Create GIFs and Videos using Stable Diffusion. | | | Image |
Grounded-Segment-Anything | Automatically Detect, Segment and Generate Anything with Image, Text, and Audio Inputs. | | | Image |
Hua | Hua is an AI image editor with Stable Diffusion (and more). | | | Image |
Hunyuan-DiT | A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding. | arXiv | | Image |
IC-Light | IC-Light is a project to manipulate the illumination of images. | | | Image |
Ideogram | Helping people become more creative. | | | Image |
Imagen | Imagen is an AI system that creates photorealistic images from input text. | | | Image |
img2img-turbo | One-Step Image-to-Image with SD-Turbo. | | | Image |
Img2Prompt | Get prompts from stable diffusion generated images. | | | Image |
InstantID | Zero-shot Identity-Preserving Generation in Seconds. | arXiv | | Image |
InternLM-XComposer2 | InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension. | | | Image |
KOALA | Self-Attention Matters in Knowledge Distillation of Latent Diffusion Models for Memory-Efficient and Fast Image Synthesis. | | | Image |
KREA | Generate images and videos with a delightful AI-powered design tool. | | | Image |
LaVi-Bridge | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation. | | | Image |
LayerDiffusion | Transparent Image Layer Diffusion using Latent Transparency. | | | Image |
Lexica | A Stable Diffusion prompts search engine. | | | Image |
MetaShoot | MetaShoot is a digital twin of a photo studio, developed as a plugin for Unreal Engine that gives any creator the ability to produce highly realistic renders in the easiest and quickest way. | | Unreal Engine | Image |
Midjourney | Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species. | | | Image |
Openpose Editor | Openpose Editor for AUTOMATIC1111's stable-diffusion-webui. | | | Image |
Outfit Anyone | Ultra-high quality virtual try-on for Any Clothing and Any Person. | | | Image |
PhotoMaker | Customizing Realistic Human Photos via Stacked ID Embedding. | | | Image |
Photoroom | AI Background Generator. | | | Image |
Plask | AI image generation in the cloud. | | | Image |
Prompt.Art | The Generators Hub. | | | Image |
PuLID | Pure and Lightning ID Customization via Contrastive Alignment. | | | Image |
Rich-Text-to-Image | Expressive Text-to-Image Generation with Rich Text. | | | Image |
RPG-DiffusionMaster | Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG). | | | Image |
Segment Anything | Segment Anything Model (SAM): a new AI model from Meta AI that can "cut out" any object, in any image, with a single click. | | | Image |
sd-webui-controlnet | WebUI extension for ControlNet. | | | Image |
SDXL-Lightning | Progressive Adversarial Diffusion Distillation. | | | Image |
SDXS | Real-Time One-Step Latent Diffusion Models with Image Conditions. | | | Image |
Stable.art | Photoshop plugin for Stable Diffusion with Automatic1111 as backend (locally or with Google Colab). | | | Image |
Stable Cascade | Stable Cascade consists of three models: Stage A, Stage B and Stage C, representing a cascade for generating images, hence the name "Stable Cascade". | | | Image |
Stable Diffusion | A latent text-to-image diffusion model. | | | Image |
stable-diffusion.cpp | Stable Diffusion in pure C/C++. | | | Image |
Stable Diffusion web UI | A browser interface based on Gradio library for Stable Diffusion. | | | Image |
Stable Diffusion web UI | Web-based UI for Stable Diffusion. | | | Image |
Stable Diffusion WebUI Chinese | Chinese version of stable-diffusion-webui. | | | Image |
Stable Diffusion XL | Generate images from text. | | | Image |
Stable Diffusion XL Turbo | Real-Time Text-to-Image Generation. | | | Image |
Stable Doodle | Stable Doodle is a sketch-to-image tool that converts a simple drawing into a dynamic image. | | | Image |
StableStudio | StableStudio by Stability AI | | | Image |
StreamDiffusion | A Pipeline-Level Solution for Real-Time Interactive Generation. | | | Image |
StyleDrop | Text-To-Image Generation in Any Style. | | | Image |
SyncDreamer | Generating Multiview-consistent Images from a Single-view Image. | | | Image |
Unity ML Stable Diffusion | Core ML Stable Diffusion on Unity. | | Unity | Image |
Vispunk Visions | Text-to-Image generation platform. | | | Image |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
CRM | Single Image to 3D Textured Mesh with Convolutional Reconstruction Model. | arXiv | | Texture |
DreamMat | High-quality PBR Material Generation with Geometry- and Light-aware Diffusion Models. | arXiv | | Texture |
DreamSpace | Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation. | | | Texture |
Dream Textures | Stable Diffusion built-in to Blender. Create textures, concept art, background assets, and more with a simple text prompt. | | Blender | Texture |
InstructHumans | Editing Animated 3D Human Textures with Instructions. | | | Texture |
InteX | Interactive Text-to-Texture Synthesis via Unified Depth-aware Inpainting. | | | Texture |
Neuralangelo | High-Fidelity Neural Surface Reconstruction. | | | Texture |
Paint-it | Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering. | | | Texture |
Polycam | Create your own 3D textures just by typing. | | | Texture |
TexFusion | Synthesizing 3D Textures with Text-Guided Image Diffusion Models. | | | Texture |
Text2Tex | Text-driven Texture Synthesis via Diffusion Models. | | | Texture |
Texture Lab | AI-generated textures. You can generate your own with a text prompt. | | | Texture |
With Poly | Create Textures With Poly. Generate 3D materials with AI in a free online editor, or search our growing community library. | | | Texture |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AI Shader | ChatGPT-powered shader generator for Unity. | | Unity | Shader |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
Anything-3D | Segment-Anything + 3D. Let's lift the anything to 3D. | arXiv | | Model |
BlenderGPT | Use commands in English to control Blender with OpenAI's GPT-4. | | Blender | Model |
Blender-GPT | An all-in-one Blender assistant powered by GPT3/4 + Whisper integration. | | Blender | Model |
Blockade Labs | Digital alchemy is real with Skybox Lab - the ultimate AI-powered solution for generating incredible 360° skybox experiences from text prompts. | | | Model |
chatGPT-maya | Simple Maya tool that utilizes OpenAI to perform basic tasks based on descriptive instructions. | | Maya | Model |
CityDreamer | Compositional Generative Model of Unbounded 3D Cities. | | | 3D |
CSM | Generate 3D worlds from images and videos. | | | 3D |
Dash | Your Copilot for World Building in Unreal Engine. | | Unreal Engine | 3D |
DUSt3R | Geometric 3D Vision Made Easy. | | | 3D |
GaussianDreamer | Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors. | | | 3D |
GenieLabs | Empower your game with AI-UGC. | | | 3D |
HiFA | High-fidelity Text-to-3D with Advanced Diffusion Guidance. | | | Model |
Infinigen | Infinite Photorealistic Worlds using Procedural Generation. | | | 3D |
Instruct-NeRF2NeRF | Editing 3D Scenes with Instructions. | | | Model |
Interactive3D | Create What You Want by Interactive 3D Generation. | | | 3D |
Isotropic3D | Image-to-3D Generation Based on a Single CLIP Embedding. | | | 3D |
LATTE3D | Large-scale Amortized Text-To-Enhanced3D Synthesis. | | | 3D |
LION | Latent Point Diffusion Models for 3D Shape Generation. | | | Model |
Luma AI | Capture in lifelike 3D. Unmatched photorealism, reflections, and details. The future of VFX is now, for everyone! | | | Model |
lumine AI | AI-Powered Creativity. | | | 3D |
Make-It-3D | High-Fidelity 3D Creation from A Single Image with Diffusion Prior. | | | Model |
Meshy | Create Stunning 3D Game Assets with AI. | | | 3D |
Mootion | Magical 3D AI Animation Maker. | | | 3D |
MVDream | Multi-view Diffusion for 3D Generation. | | | 3D |
NVIDIA Instant NeRF | Instant neural graphics primitives: lightning fast NeRF and more. | | | Model |
One-2-3-45 | Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization. | | | Model |
Paint3D | Paint Anything 3D with Lighting-Less Texture Diffusion Models. | | | 3D |
PAniC-3D | Stylized Single-view 3D Reconstruction from Portraits of Anime Characters. | | | Model |
Point·E | Point cloud diffusion for 3D model synthesis. | | | Model |
ProlificDreamer | High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. | | | Model |
Shap-E | Generate 3D objects conditioned on text or images. | | | Model |
Sloyd | 3D modelling has never been easier. | | | Model |
Spline AI | The power of AI is coming to the 3rd dimension. Generate objects, animations, and textures using prompts. | | | Model |
Stable Dreamfusion | A pytorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model. | | | Model |
SV3D | Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion. | | | 3D |
Tafi | AI text to 3D character engine. | | | Model |
3D-GPT | Procedural 3D Modeling with Large Language Models. | | | 3D |
3D-LLM | Injecting the 3D World into Large Language Models. | | | 3D |
3Dpresso | Extract a 3D model of an object, captured on a video. | | | Model |
3DTopia | Text-to-3D Generation within 5 Minutes. | | | 3D |
threestudio | A unified framework for 3D content generation. | | | Model |
TripoSR | A state-of-the-art open-source model for fast feedforward 3D reconstruction from a single image. | | | Model |
UnityGaussianSplatting | Toy Gaussian Splatting visualization in Unity. | | Unity | 3D |
ViVid-1-to-3 | Novel View Synthesis with Video Diffusion Models. | | | 3D |
Voxcraft | Crafting Ready-to-Use 3D Models with AI. | | | 3D |
Wonder3D | Single Image to 3D using Cross-Domain Diffusion. | | | 3D |
Zero-1-to-3 | Zero-shot One Image to 3D Object. | | | Model |
Source | Description | Game Engine | Type |
---|---|---|---|
AniPortrait | Audio-Driven Synthesis of Photorealistic Portrait Animations. | | Avatar |
CALM | Conditional Adversarial Latent Models for Directable Virtual Characters. | | Avatar |
ChatAvatar | Progressive Generation of Animatable 3D Faces under Text Guidance. | | Avatar |
ChatdollKit | ChatdollKit enables you to make your 3D model into a chatbot. | Unity | Avatar |
DreamTalk | When Expressive Talking Head Generation Meets Diffusion Probabilistic Models. | | Avatar |
EMOPortraits | Emotion-enhanced Multimodal One-shot Head Avatars. | | Avatar |
GeneAvatar | Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image. | | Avatar |
GeneFace++ | Generalized and Stable Real-Time 3D Talking Face Generation. | | Avatar |
HeadSculpt | Crafting 3D Head Avatars with Text. | | Avatar |
Linly-Talker | Digital Avatar Conversational System. | | Avatar |
MotionGPT | Human Motion as a Foreign Language, a unified motion-language generation model using LLMs. | | Avatar |
MuseTalk | Real-Time High Quality Lip Synchronization with Latent Space Inpainting. | | Avatar |
Ready Player Me | Integrate customizable avatars into your game or app in days. | | Avatar |
StyleAvatar3D | Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation. | | Avatar |
Text2Control3D | Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model. | | Avatar |
UnityAIWithChatGPT | A voice-interactive ChatGPT + UnityChan demo built on Unity. | Unity | Avatar |
Vid2Avatar | 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition. | | Avatar |
VLOGGER | Multimodal Diffusion for Embodied Avatar Synthesis. | | Avatar |
Wild2Avatar | Rendering Humans Behind Occlusions. | | Avatar |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
Animate Anyone | Consistent and Controllable Image-to-Video Synthesis for Character Animation. | arXiv | Animation | |
AnimateAnything | Fine-Grained Open Domain Image Animation with Motion Guidance. | Animation | ||
AnimateDiff | Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning. | Animation | ||
AnimateLCM | Let's Accelerate the Video Generation within 4 Steps! | Animation | ||
AnimateZero | Video Diffusion Models are Zero-Shot Image Animators. | Animation | ||
AnimationGPT | An AIGC tool for generating game combat motion assets. | Animation | ||
Deforum | Deforum leverages Stable Diffusion to generate evolving AI visuals. | Animation | ||
DreaMoving | A Human Video Generation Framework based on Diffusion Models. | Animation | ||
FaceFusion | Next generation face swapper and enhancer. | Animation | ||
FreeInit | Bridging Initialization Gap in Video Diffusion Models. | Animation | ||
GeneFace | Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis. | Animation | ||
ID-Animator | Zero-Shot Identity-Preserving Human Video Generation. | arXiv | Animation | |
MagicAnimate | Temporally Consistent Human Image Animation using Diffusion Model. | arXiv | Animation | |
NUWA | DragNUWA is an open-domain diffusion-based video generation model that takes text, image, and trajectory controls as inputs to achieve controllable video generation. | arXiv | Animation | |
NUWA-Infinity | NUWA-Infinity is a multimodal generative model that is designed to generate high-quality images and videos from given text, image or video input. | Animation | ||
NUWA-XL | A novel Diffusion over Diffusion architecture for eXtremely Long video generation. | Animation | ||
Omni Animation | AI Generated High Fidelity Animations. | Animation | ||
PIA | Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models. | arXiv | Animation | |
SadTalker | Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation. | arXiv | Animation | |
SadTalker-Video-Lip-Sync | This project is based on SadTalker's Wav2Lip for video lip synthesis. | Animation | ||
Stable Animation | A powerful text-to-animation tool for developers. | Animation | ||
TaleCrafter | An interactive story visualization tool that supports multiple characters. | arXiv | Animation | |
Wav2Lip | Accurately Lip-syncing Videos In The Wild. | Animation | ||
Wonder Studio | An AI tool that automatically animates, lights and composes CG characters into a live-action scene. | Animation |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
CogVLM2 | GPT4V-level open-source multi-modal model based on Llama3-8B. | Visual | ||
LLaVA++ | Extending Visual Capabilities with LLaMA-3 and Phi-3. | Visual | ||
MiniCPM-Llama3-V 2.5 | A GPT-4V Level MLLM on Your Phone. | Visual | ||
PLLaVA | Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning. | arXiv | Visual | |
Vitron | A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing. | Visual |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
360DVD | Controllable Panorama Video Generation with 360-Degree Video Diffusion Model. | arXiv | Video | |
Animate-A-Story | Retrieval-Augmented Video Generation for Telling a Story. | arXiv | Video | |
Anything in Any Scene | Photorealistic Video Object Insertion. | Video | ||
ART•V | Auto-Regressive Text-to-Video Generation with Diffusion Models. | arXiv | Video | |
Assistive | Meet the generative video platform that brings your ideas to life. | Video | ||
AtomoVideo | High Fidelity Image-to-Video Generation. | arXiv | Video | |
BackgroundRemover | Background Remover lets you Remove Background from images and video using AI with a simple command line interface that is free and open source. | Video | ||
Boximator | Generating Rich and Controllable Motions for Video Synthesis. | arXiv | Video | |
CoDeF | Content Deformation Fields for Temporally Consistent Video Processing. | arXiv | Video | |
CogVideo | Generate Videos from Text Descriptions. | Video | ||
CogVLM | CogVLM is a powerful open-source visual language model (VLM). | Visual | ||
CoNR | Generate vivid dancing videos from hand-drawn anime character sheets (ACS). | arXiv | Video | |
Decohere | Create what can't be filmed. | Video | ||
Descript | Descript is the simple, powerful, and fun way to edit. | Video | ||
dolphin | General video interaction platform based on LLMs. | Video | ||
DomoAI | Amplify Your Creativity with DomoAI. | Video | ||
DynamiCrafter | Animating Open-domain Images with Video Diffusion Priors. | arXiv | Video | |
EDGE | We introduce EDGE, a powerful method for editable dance generation that is capable of creating realistic, physically-plausible dances while remaining faithful to arbitrary input music. | arXiv | Video | |
EMO | Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions. | arXiv | Video | |
Emu Video | Factorizing Text-to-Video Generation by Explicit Image Conditioning. | Video | ||
Etna | Etna can generate corresponding video content based on short text descriptions. | Video | ||
Fairy | Fast Parallelized Instruction-Guided Video-to-Video Synthesis. | Video | ||
Follow Your Pose | Pose-Guided Text-to-Video Generation using Pose-Free Videos. | arXiv | Video | |
FullJourney | Your complete suite of AI Creation tools at your fingertips. | Video | ||
Gen-2 | A multi-modal AI system that can generate novel videos with text, images, or video clips. | Video | ||
Generative Dynamics | Generative Image Dynamics. | Video | ||
Genie | Generative Interactive Environments. | arXiv | Video | |
Genmo | Magically make videos with AI. | Video | ||
GenTron | Diffusion Transformers for Image and Video Generation. | Video | ||
HiGen | Hierarchical Spatio-temporal Decoupling for Text-to-Video generation. | Video | ||
Hotshot-XL | Hotshot-XL is an AI text-to-GIF model trained to work alongside Stable Diffusion XL. | Video | ||
Imagen Video | Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models. | Video | ||
InstructVideo | Instructing Video Diffusion Models with Human Feedback. | arXiv | Video | |
I2VGen-XL | High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models. | arXiv | Video | |
LaVie | High-Quality Video Generation with Cascaded Latent Diffusion Models. | arXiv | Video | |
LTX Studio | LTX Studio is a holistic, AI-driven filmmaking platform for creators, marketers, filmmakers and studios. | Video | ||
Lumiere | A Space-Time Diffusion Model for Video Generation. | arXiv | Video | |
LVDM | Latent Video Diffusion Models for High-Fidelity Long Video Generation. | arXiv | Video | |
MagicVideo | Efficient Video Generation With Latent Diffusion Models. | arXiv | Video | |
MagicVideo-V2 | Multi-Stage High-Aesthetic Video Generation. | arXiv | Video | |
Magic Hour | AI Video for Creators made simple. | Video | ||
MAGVIT-v2 | Tokenizer is key to visual generation. | Video | ||
MAGVIT | Masked Generative Video Transformer. | Video | ||
Make-A-Video | Make-A-Video is a state-of-the-art AI system that generates videos from text. | Video | ||
Make Pixels Dance | High-Dynamic Video Generation. | Video | ||
Make-Your-Video | Customized Video Generation Using Textual and Structural Guidance. | Video | ||
MicroCinema | A Divide-and-Conquer Approach for Text-to-Video Generation. | Video | ||
Mini-Gemini | Mining the Potential of Multi-modality Vision Language Models. | Visual | ||
MobileVidFactory | Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text. | Video | ||
MoneyPrinterTurbo | Use large models to generate short videos with one click. | Video | ||
Moonvalley | Moonvalley is a groundbreaking new text-to-video generative AI model. | Video | ||
Mora | More like Sora for Generalist Video Generation. | Video | ||
Morph Studio | With our Text-to-Video AI Magic, manifest your creativity through your prompt. | Video | ||
MotionCtrl | A Unified and Flexible Motion Controller for Video Generation. | Video | ||
MotionDirector | Motion Customization of Text-to-Video Diffusion Models. | Video | ||
Motionshop | An application of replacing the characters in video with 3D avatars. | Video | ||
Mov2mov | Mov2mov plugin for Automatic1111/stable-diffusion-webui. | Video | ||
MovieFactory | Automatic Movie Creation from Text using Large Generative Models for Language and Images. | Video | ||
Neural Frames | Discover the synthesizer for the visual world. | Video | ||
NeverEnds | Create your world. | Video | ||
Open-Sora | Democratizing Efficient Video Production for All. | Video | ||
Open-Sora | Open-Sora Plan. | Video | ||
Phenaki | A model for generating videos from text, with prompts that can change over time, and videos that can be as long as multiple minutes. | Video | ||
Pika Labs | Pika Labs is revolutionizing video-making experience with AI. | Video | ||
Pixeling | Pixeling empowers our customers to create highly precise, ultra-realistic, and extremely controllable visual content including images, videos and 3D models. | Video | ||
PixVerse | Create breath-taking videos with AI. | Video | ||
Pollinations | Creating gets easy, fast, and fun. | Video | ||
Reuse and Diffuse | Iterative Denoising for Text-to-Video Generation. | Video | ||
ShortGPT | An experimental AI framework for automated short/video content creation. | Video | ||
Show-1 | Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation. | Video | ||
Snap Video | Scaled Spatiotemporal Transformers for Text-to-Video Synthesis. | Video | ||
Sora | Creating video from text. | Video | ||
SoraWebui | SoraWebui is an open-source Sora web client, enabling users to easily create videos from text with OpenAI's Sora model. | Video | ||
StableVideo | Text-driven Consistency-aware Diffusion Video Editing. | Video | ||
Stable Video Diffusion | Stable Video Diffusion (SVD) Image-to-Video. | Video | ||
StoryDiffusion | Consistent Self-Attention for Long-Range Image and Video Generation. | Video | ||
StreamingT2V | Consistent, Dynamic, and Extendable Long Video Generation from Text. | Video | ||
StyleCrafter | Enhancing Stylized Text-to-Video Generation with Style Adapter. | Video | ||
TATS | Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer. | Video | ||
Text2Video-Zero | Text-to-Image Diffusion Models are Zero-Shot Video Generators. | Video | ||
TF-T2V | A Recipe for Scaling up Text-to-Video Generation with Text-free Videos. | Video | ||
Track-Anything | Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything and XMem. | Video | ||
Tune-A-Video | One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. | Video | ||
TwelveLabs | Multimodal AI that understands videos like humans. | Video | ||
UniVG | Towards UNIfied-modal Video Generation. | Video | ||
VGen | A holistic video generation ecosystem built on diffusion models. | Video | ||
Video-ChatGPT | Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. | Video | ||
VideoComposer | Compositional Video Synthesis with Motion Controllability. | Video | ||
VideoCrafter1 | Open Diffusion Models for High-Quality Video Generation. | Video | ||
VideoCrafter2 | Overcoming Data Limitations for High-Quality Video Diffusion Models. | Video | ||
VideoDrafter | Content-Consistent Multi-Scene Video Generation with LLM. | Video | ||
VideoElevator | Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models. | Video | ||
VideoFactory | Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation. | Video | ||
VideoGen | A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation. | Video | ||
VideoLCM | Video Latent Consistency Model. | Video | ||
Video LDMs | Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models. | Video | ||
Video-LLaVA | Learning United Visual Representation by Alignment Before Projection. | Video | ||
VideoMamba | State Space Model for Efficient Video Understanding. | Video | ||
VideoPoet | A large language model for zero-shot video generation. | Video | ||
Vispunk Motion | Create realistic videos using just text. | Video | ||
VisualRWKV | VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks. | Visual | ||
V-JEPA | Video Joint Embedding Predictive Architecture. | Video | ||
W.A.L.T | Photorealistic Video Generation with Diffusion Models. | Video | ||
Zeroscope | Zeroscope Text-to-Video. | Video |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AcademiCodec | An Open Source Audio Codec Model for Academic Research. | Audio | ||
Amphion | An Open-Source Audio, Music, and Speech Generation Toolkit. | arXiv | Audio | |
ArchiSound | Audio generation using diffusion models, in PyTorch. | Audio | ||
Audiobox | Unified Audio Generation with Natural Language Prompts. | Audio | ||
AudioEditing | Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion. | arXiv | Audio | |
Audiogen Codec | A low compression 48 kHz stereo neural audio codec for general audio, optimizing for audio fidelity 🎵. | Audio | ||
AudioGPT | Understanding and Generating Speech, Music, Sound, and Talking Head. | arXiv | Audio | |
AudioLDM | Text-to-Audio Generation with Latent Diffusion Models. | arXiv | Audio | |
AudioLDM 2 | Learning Holistic Audio Generation with Self-supervised Pretraining. | arXiv | Audio | |
Auffusion | Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation. | arXiv | Audio | |
CTAG | Creative Text-to-Audio Generation via Synthesizer Programming. | Audio | ||
MAGNeT | Masked Audio Generation using a Single Non-Autoregressive Transformer. | Audio | ||
Make-An-Audio | Text-To-Audio Generation with Prompt-Enhanced Diffusion Models. | arXiv | Audio | |
NeuralSound | Learning-based Modal Sound Synthesis with Acoustic Transfer. | arXiv | Audio | |
OptimizerAI | Sounds for Creators, Game makers, Artists, Video makers. | Audio | ||
SoundStorm | Efficient Parallel Audio Generation. | arXiv | Audio | |
Stable Audio | Fast Timing-Conditioned Latent Audio Diffusion. | Audio | ||
TANGO | Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model. | Audio | ||
WavJourney | Compositional Audio Creation with Large Language Models. | arXiv | Audio |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
AIVA | The Artificial Intelligence composing emotional soundtrack music. | Music | ||
Amper Music | Custom music generation technology powered by Amper. | Music | ||
Boomy | Create generative music. Share it with the world. | Music | ||
ChatMusician | Fostering Intrinsic Musical Abilities Into LLM. | Music | ||
Chord2Melody | Automatic Music Generation AI. | Music | ||
GPTAbleton | Draft script for processing GPT response and sending the MIDI notes into the Ableton clips with AbletonOSC and python-osc. | Music | ||
HeyMusic.AI | AI Music Generator. | Music | ||
Image to Music | AI Image to Music Generator is a tool that uses artificial intelligence to convert images into music. | Music | ||
JEN-1 | Text-Guided Universal Music Generation with Omnidirectional Diffusion Models. | Music | ||
Jukebox | A Generative Model for Music. | arXiv | Music | |
Magenta | Magenta is a research project exploring the role of machine learning in the process of creating art and music. | Music | ||
MeLoDy | Efficient Neural Music Generation. | Music | ||
Mubert | AI Generative Music. | Music | ||
MuseNet | A deep neural network that can generate 4-minute musical compositions with 10 different instruments, and can combine styles from country to Mozart to the Beatles. | Music | ||
MusicGen | Simple and Controllable Music Generation. | Music | ||
MusicLDM | Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies. | Music | ||
MusicLM | Generating Music From Text. | arXiv | Music | |
Riffusion App | Riffusion is an app for real-time music generation with stable diffusion. | Music | ||
SoundRaw | AI music generator for creators. | Music |
Source | Description | Paper | Game Engine | Type |
---|---|---|---|---|
DiffSinger | Singing Voice Synthesis via Shallow Diffusion Mechanism. | arXiv | Singing Voice | |
Retrieval-based-Voice-Conversion-WebUI | An easy-to-use SVC framework based on VITS. | Singing Voice | ||
so-vits-svc | SoftVC VITS Singing Voice Conversion. | Singing Voice | ||
VI-SVS | Use VITS and Opencpop to develop singing voice synthesis; different from VISinger. | Singing Voice | ||
Source | Description | Game Engine | Type |
---|---|---|---|
Applio | Ultimate voice cloning tool, meticulously optimized for unrivaled power, modularity, and user-friendly experience. | Speech | |
Audyo | Text in. Audio out. | Speech | |
Bark | Text-Prompted Generative Audio Model. | Speech | |
Bert-VITS2 | VITS2 Backbone with multilingual bert. | Speech | |
ChatTTS | ChatTTS is a generative speech model for daily dialogue. | Speech | |
CLAPSpeech | Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training. | Speech | |
EmotiVoice | A Multi-Voice and Prompt-Controlled TTS Engine. | Speech | |
Fliki | Turn text into videos with AI voices. | Speech | |
Glow-TTS | A Generative Flow for Text-to-Speech via Monotonic Alignment Search. | Speech | |
GPT-SoVITS | A Powerful Few-shot Voice Conversion and Text-to-Speech WebUI. | Speech | |
LOVO | LOVO is the go-to AI Voice Generator & Text to Speech platform for thousands of creators. | Speech | |
MahaTTS | An Open-Source Large Speech Generation Model. | Speech | |
Matcha-TTS | A fast TTS architecture with conditional flow matching. | Speech | |
MeloTTS | High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean. | Speech | |
MetaVoice-1B | AI for human-level speech intelligence. | Speech | |
Narakeet | Easily Create Voiceovers Using Realistic Text to Speech. | Speech | |
One-Shot-Voice-Cloning | One Shot Voice Cloning based on Unet-TTS. | Speech | |
OpenVoice | Instant voice cloning by MyShell. | Speech | |
OverFlow | Putting flows on top of neural transducers for better TTS. | Speech | |
RealtimeTTS | RealtimeTTS is a state-of-the-art text-to-speech (TTS) library designed for real-time applications. | Speech | |
SpeechGPT | Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities. | Speech | |
speech-to-text-gpt3-unity | A repo that uses the Whisper and ChatGPT APIs from OpenAI in Unity. | Unity | Speech |
Stable Speech | Stability AI's Text-to-Speech model. | Speech | |
StableTTS | Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3. | Speech | |
StyleTTS 2 | Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. | Speech | |
TorToiSe-TTS | A multi-voice TTS system trained with an emphasis on quality. | Speech | |
TTS Generation WebUI | TTS Generation WebUI (Bark, MusicGen, Tortoise, RVC, Vocos, Demucs). | Speech | |
VALL-E | Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers. | Speech | |
VALL-E X | Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling. | Speech | |
Vocode | Vocode is an open-source library for building voice-based LLM applications. | Speech | |
Voicebox | Text-Guided Multilingual Universal Speech Generation at Scale. | Speech | |
VoiceCraft | Zero-Shot Speech Editing and Text-to-Speech in the Wild. | Speech | |
Whisper | Whisper is a general-purpose speech recognition model. | Speech | |
WhisperSpeech | An Open Source text-to-speech system built by inverting Whisper. | Speech | |
X-E-Speech | Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion. | Speech | |
XTTS | XTTS is a library for advanced Text-to-Speech generation. | Speech | |
YourTTS | Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone. | Speech | |
ZMM-TTS | Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations. | Speech |
Source | Description | Game Engine | Type |
---|---|---|---|
Ludo.ai | Assistant for game research and design. | Analytics |
Source | Description | Game Engine | Type |
---|---|---|---|
CoTracker | It is Better to Track Together. | Video Tool | |
FaceHi | It is Better to Track Together. | Video Tool | |
LGVI | Towards Language-Driven Video Inpainting via Multimodal Large Language Models. | Video Tool | |
MaskViT | Masked Visual Pre-Training for Video Prediction. | Video Tool |
Alternative AI tools for ai-game-development-tools
Similar Open Source Tools
ai-reference-models
The Intel® AI Reference Models repository contains links to pre-trained models, sample scripts, best practices, and tutorials for popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors and Intel® Data Center GPUs. The purpose is to quickly replicate complete software environments showcasing the AI capabilities of Intel platforms. It includes optimizations for popular deep learning frameworks like TensorFlow and PyTorch, with additional plugins/extensions for improved performance. The repository is licensed under Apache License Version 2.0.
models
The Intel® AI Reference Models repository contains links to pre-trained models, sample scripts, best practices, and tutorials for popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors and Intel® Data Center GPUs. It aims to replicate the best-known performance of target model/dataset combinations in optimally-configured hardware environments. The repository will be deprecated upon the publication of v3.2.0 and will no longer be maintained or published.
Cool-GenAI-Fashion-Papers
Cool-GenAI-Fashion-Papers is a curated list of resources related to GenAI-Fashion, including papers, workshops, companies, and products. It covers a wide range of topics such as fashion design synthesis, outfit recommendation, fashion knowledge extraction, trend analysis, and more. The repository provides valuable insights and resources for researchers, industry professionals, and enthusiasts interested in the intersection of AI and fashion.
Top-AI-Tools
Top AI Tools is a comprehensive, community-curated directory that aims to catalog and showcase the most outstanding AI-powered products. This index is not exhaustive, but rather a compilation of our research and contributions from the community.
Awesome_LLM_System-PaperList
Since the emergence of ChatGPT in 2022, accelerating Large Language Models has become increasingly important. Here is a list of papers on LLM inference and serving.
cool-ai-stuff
This repository contains an uncensored list of free-to-use APIs and sites for several AI models. This list is mainly managed by @zukixa, the queen of zukijourney, so any decisions may have bias! Scroll down for the sites; APIs come first.
WARNING: We are not endorsing any of the listed services! Some of them might be considered controversial. We are not responsible for any legal, technical or any other damage caused by using the listed services. Data is provided without warranty of any kind. Use these at your own risk!
APIs Table of Contents:
- Overview of Existing APIs
- Overview of Existing APIs: Top LLM Models Available
- Overview of Existing APIs: Top Image Models Available
- Overview of Existing APIs: Top Other Features & Models Available
- Overview of Existing APIs: Available Donator Perks
API List: This list solely covers all providers I (@zukixa) was able to collect metrics in. Any mistakes are not my responsibility, as I am either banned, or not aware of x API. User counts last updated 4/14/24.
Overview of APIs:
Service | # of Users | Link | Stability | NSFW Ok? | Open Source? | Owner(s) | Other Notes |
---|---|---|---|---|---|---|---|
zukijourney | 4441 | D | High | On /unf/, not /v1/ | ✅, Here | @zukixa | Largest & Oldest GPT-4 API still continuously around. Offers other popular AI-related Bots too. |
Hyzenberg | 1234 | D | High | Forbidden | ❌ | @thatlukinhasguy & @voidiii | Experimental sister API to Zukijourney. Successor to HentAI |
NagaAI | 2883 | D | High | Forbidden | ❌ | @zentixua | Honorary successor to ChimeraGPT, the largest API in history (15k users). |
WebRaftAI | 993 | D | High | Forbidden | ❌ | @ds_gamer | Largest API by model count. Provides a lot of service/hosting related stuff too. |
KrakenAI | 388 | D | High | Discouraged | ❌ | @paninico | It is an API of all time. |
ShuttleAI | 3585 | D | Medium | Generally Permitted | ❌ | @xtristan | Faked GPT-4 Before 1, 2 |
Mandrill | 931 | D | Medium | Enterprise-Tier-Only | ❌ | @fredipy | DALL-E-3 access pioneering API. Has some issues with speed & stability nowadays. |
oxygen | 742 | D | Medium | Donator-Only | ❌ | @thesketchubuser | Bri'ish 🤮 & Fren'sh 🤮 |
Skailar | 399 | D | Medium | Forbidden | ❌ | @aquadraws | Service is the personification of the word 'feature creep'. Lots of things announced, not much operational. |
LLM-PlayLab
LLM-PlayLab is a repository containing various projects related to LLM (Large Language Models) fine-tuning, generative AI, time-series forecasting, and crash courses. It includes projects for text generation, sentiment analysis, data analysis, chat assistants, image captioning, and more. The repository offers a wide range of tools and resources for exploring and implementing advanced AI techniques.
LLM-Agent-Survey
Autonomous agents are designed to achieve specific objectives through self-guided instructions. With the emergence and growth of large language models (LLMs), there is a growing trend in utilizing LLMs as fundamental controllers for these autonomous agents. This repository conducts a comprehensive survey study on the construction, application, and evaluation of LLM-based autonomous agents. It explores essential components of AI agents, application domains in natural sciences, social sciences, and engineering, and evaluation strategies. The survey aims to be a resource for researchers and practitioners in this rapidly evolving field.
RAGHub
RAGHub is a community-driven project focused on cataloging new and emerging frameworks, projects, and resources in the Retrieval-Augmented Generation (RAG) ecosystem. It aims to help users stay ahead of changes in the field by providing a platform for the latest innovations in RAG. The repository includes information on RAG frameworks, evaluation frameworks, optimization frameworks, citation frameworks, engines, search reranker frameworks, projects, resources, and real-world use cases across industries and professions.
AudioLLM
AudioLLMs is a curated collection of research papers focusing on developing, implementing, and evaluating language models for audio data. The repository aims to provide researchers and practitioners with a comprehensive resource to explore the latest advancements in AudioLLMs. It includes models for speech interaction, speech recognition, speech translation, audio generation, and more. Additionally, it covers methodologies like multitask audioLLMs and segment-level Q-Former, as well as evaluation benchmarks like AudioBench and AIR-Bench. Adversarial attacks such as VoiceJailbreak are also discussed.
open-llms
Open LLMs is a repository containing various Large Language Models licensed for commercial use. It includes models like T5, GPT-NeoX, UL2, Bloom, Cerebras-GPT, Pythia, Dolly, and more. These models are designed for tasks such as transfer learning, language understanding, chatbot development, code generation, and more. The repository provides information on release dates, checkpoints, papers/blogs, parameters, context length, and licenses for each model. Contributions to the repository are welcome, and it serves as a resource for exploring the capabilities of different language models.
kumo-search
Kumo search is an end-to-end search engine framework that supports full-text search, inverted index, forward index, sorting, caching, hierarchical indexing, intervention system, feature collection, offline computation, storage system, and more. It runs on the EA (Elastic automic infrastructure architecture) platform, enabling engineering automation, service governance, real-time data, service degradation, and disaster recovery across multiple data centers and clusters. The framework aims to provide a ready-to-use search engine framework to help users quickly build their own search engines. Users can write business logic in Python using the AOT compiler in the project, which generates C++ code and binary dynamic libraries for rapid iteration of the search engine.
For similar tasks
LayaAir
LayaAir engine, under the Layabox brand, is a 3D engine that supports full-platform publishing. It can be applied in various fields such as games, education, advertising, marketing, digital twins, metaverse, AR guides, VR scenes, architectural design, industrial design, etc.
ComfyUI-BlenderAI-node
ComfyUI-BlenderAI-node is an addon for Blender that allows users to convert ComfyUI nodes into Blender nodes seamlessly. It offers features such as converting nodes, editing launch arguments, drawing masks with Grease pencil, and more. Users can queue batch processing, use node tree presets, and model preview images. The addon enables users to input or replace 3D models in Blender and output controlnet images using composite. It provides a workflow showcase with presets for camera input, AI-generated mesh import, composite depth channel, character bone editing, and more.
ai-collective-tools
ai-collective-tools is an open-source community dedicated to creating a comprehensive collection of AI tools for developers, researchers, and enthusiasts. The repository provides a curated selection of AI tools and resources across various categories such as 3D, Agriculture, Art, Audio Editing, Avatars, Chatbots, Code Assistant, Cooking, Copywriting, Crypto, Customer Support, Dating, Design Assistant, Design Generator, Developer, E-Commerce, Education, Email Assistant, Experiments, Fashion, Finance, Fitness, Fun Tools, Gaming, General Writing, Gift Ideas, HealthCare, Human Resources, Image Classification, Image Editing, Image Generator, Interior Designing, Legal Assistant, Logo Generator, Low Code, Models, Music, Paraphraser, Personal Assistant, Presentations, Productivity, Prompt Generator, Psychology, Real Estate, Religion, Research, Resume, Sales, Search Engine, SEO, Shopping, Social Media, Spreadsheets, SQL, Startup Tools, Story Teller, Summarizer, Testing, Text to Speech, Text to Image, Transcriber, Travel, Video Editing, Video Generator, Weather, Writing Generator, and Other Resources.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
onnxruntime-genai
ONNX Runtime Generative AI is a library that provides the generative AI loop for ONNX models, including inference with ONNX Runtime, logits processing, search and sampling, and KV cache management. Users can call a high level `generate()` method, or run each iteration of the model in a loop. It supports greedy/beam search and TopP, TopK sampling to generate token sequences, has built in logits processing like repetition penalties, and allows for easy custom scoring.
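The generation loop described above (logits processing, then search or sampling, repeated one token at a time) can be illustrated with a minimal pure-Python sketch. This is not the onnxruntime-genai API: every name here (`toy_model`, `sample_next`, `generate`, and the parameter names) is a hypothetical stand-in chosen for illustration, and the "model" simply returns fixed logits over a five-token vocabulary.

```python
import math
import random


def toy_model(tokens):
    # Hypothetical stand-in for a real model: fixed logits over 5 tokens.
    return [2.0, 1.0, 0.5, 0.1, -1.0]


def apply_repetition_penalty(logits, generated, penalty=1.0):
    """Lower the logits of tokens that were already generated (penalty > 1)."""
    out = list(logits)
    for tok in set(generated):
        # Divide positive logits and multiply negative ones so both decrease.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=0, top_p=1.0):
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p, and renormalize the result."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    kept, cumulative = [], 0.0
    for tok in ranked:
        kept.append(tok)
        cumulative += probs[tok]
        if cumulative >= top_p:
            break
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}


def sample_next(logits, generated, rng, top_k=0, top_p=1.0, penalty=1.0):
    """One iteration: process logits, filter, then sample a token id."""
    logits = apply_repetition_penalty(logits, generated, penalty)
    dist = filter_top_k_top_p(softmax(logits), top_k, top_p)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


def generate(model, prompt, max_new_tokens, rng, **search):
    """High-level loop: call the model and append one sampled token per step."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(sample_next(model(tokens), tokens, rng, **search))
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)
    # top_k=1 always picks the most likely token, i.e. greedy search.
    print(generate(toy_model, [0], 5, rng, top_k=1))
    # Nucleus sampling with a repetition penalty.
    print(generate(toy_model, [0], 5, rng, top_p=0.9, penalty=1.3))
```

Setting `top_k=1` reduces the sampler to greedy search, which is why one function can cover both modes in this sketch. Beam search and KV cache management, which the library also handles, are left out to keep the example short.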
mistral.rs
Mistral.rs is a fast LLM inference platform written in Rust. It supports inference on a variety of devices, quantization, and easy application integration through an OpenAI API-compatible HTTP server and Python bindings.
For similar jobs
SillyTavern
SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. SillyTavern is a fork of TavernAI 1.2.8 which is under more active development and has added many major features. At this point, they can be thought of as completely independent programs.
better-genshin-impact
BetterGI is a computer-vision-based project that aims to make Genshin Impact more convenient to play. It can automatically pick up items, skip dialogue, select dialogue options, submit items, and close pop-up pages. When talking to Katheryne, it can automatically claim "Daily Commission" rewards and re-dispatch expeditions. Automatic hangout selection takes effect only when the automatic story function is turned on, and then chooses hangout options automatically. Automatic fishing uses AI recognition to cast the line, reel in when a fish bites, and complete the fishing progress bar. Automatic Genius Invokation TCG helps you complete character invitations, weekly guest challenges, and other PVE content. Automatic woodcutting uses the "Boon of the Elder Tree" gadget with the `Z` key and relies on wood respawning after logging out and back in, so it can fill a backpack with wood unattended. Combat scripts let the team fight automatically according to your strategy. Fully automatic Domain farming spends Original Resin unattended: it enters the Domain, activates the key, fights, walks to the petrified tree, and claims the rewards. Quick teleport automatically clicks a teleport point when you select it on the map or in the list that appears after clicking. A configurable hotkey, when held, rotates the camera horizontally and continuously (it can also be used to spin Nahida). Quick artifact enhancement switches between the "Details" and "Enhance" pages to skip the enhancement result screen and reach +20 quickly. Quick purchase buys the full available quantity of a shop item, which is handy for clearing event exchanges and Serenitea Pot shop exchanges.
agnai
Agnaistic is an AI roleplay chat tool that allows users to interact with personalized characters using their favorite AI services. It supports multiple AI services, persona schema formats, and features such as group conversations, user authentication, and memory/lore books. Agnaistic can be self-hosted or run using Docker, and it provides a range of customization options through its settings.json file. The tool is designed to be user-friendly and accessible, making it suitable for both casual users and developers.
mage
XMage is an open-source, cross-platform application that allows users to play the collectible card game Magic: The Gathering online against other players or computer opponents. It supports over 25,000 unique cards and more than 65,000 reprints from different editions, including custom sets like Star Wars. XMage supports single matches and tournaments with dozens of game modes, including duel, multiplayer, standard, modern, commander, pauper, oathbreaker, historic, freeform, and richman. It also features a deck editor, a player rating system, and support for special formats like Commander, Oathbreaker, Cube, Tiny Leaders, Super Standard, and Historic Standard.
RisuAI
RisuAI, or Risu for short, is a cross-platform AI chatting software/web application with powerful features such as multiple API support, assets in the chat, regex functions, and much more.
chatdev
ChatDev IDE is a tool for building AI agents. Whether you want NPCs in games or powerful agent tools, you can design them on this platform. It accelerates prompt engineering through **JavaScript Support**, which allows implementing complex prompting techniques.
AgentKit
AgentKit is a framework for constructing complex human thought processes from simple natural language prompts. It offers a unified way to represent and execute these processes as graphs, making it easy to design and tune agents without any programming experience. AgentKit can be used for a variety of tasks, including generating text, answering questions, and making decisions.