![aigcpanel](/statics/github-mark.png)
aigcpanel
AigcPanel is a simple, easy-to-use all-in-one AI digital human system. It supports video synthesis, voice synthesis, and voice cloning, and simplifies local model management with one-click import and use of AI models.
Stars: 656
![screenshot](/screenshots_githubs/modstart-lib-aigcpanel.jpg)
AigcPanel is a simple and easy-to-use all-in-one AI digital human system that even beginners can use. It supports video synthesis, voice synthesis, and voice cloning, simplifies local model management, and allows one-click import and use of AI models. Use of this product for illegal activities is prohibited; users must comply with the laws and regulations of the People's Republic of China.
README:
AigcPanel
A simple, easy-to-use all-in-one AI digital human system that even beginners can use.
It supports video synthesis, voice synthesis, and voice cloning, and simplifies local model management with one-click import and use of AI models.
Use of this product for illegal business is prohibited; please comply with the laws and regulations of the People's Republic of China when using this software.
- Supports digital human video synthesis, with lip-sync matching of the video picture to the audio
- Supports speech synthesis and voice cloning, with multiple configurable voice parameters
- Supports importing multiple models, one-click launch, model settings, and model log viewing
- Supports internationalization: Simplified Chinese and English
- Supports one-click launch packages for multiple models: MuseTalk, cosyvoice
See the video files in the demo for examples.
- Visit https://aigcpanel.com to download the Windows installer and install it with one click.
After installation, open the application and download a model one-click launch package to start using it.
If you have a third-party one-click launch model, it can be integrated as follows.
For the model folder format, you only need to write two files: config.json and server.js.
|- model-folder/
|-|- config.json - model configuration file
|-|- server.js - model integration file
|-|- xxx - other model files; it is recommended to place model files in a model folder
config.json example:
{
  "name": "server-xxx", // model name
  "version": "0.1.0", // model version
  "title": "Voice model", // model title
  "description": "Model description", // model description
  "platformName": "win", // supported system: win, osx, linux
  "platformArch": "x86", // supported architecture: x86, arm64
  "entry": "server/main", // entry file of the one-click launch package
  "functions": [
    "videoGen", // supports video generation
    "soundTTS", // supports speech synthesis
    "soundClone" // supports voice cloning
  ],
  "settings": [ // model settings, shown on the model configuration page
    {
      "name": "port",
      "type": "text",
      "title": "Server port",
      "default": "",
      "placeholder": "Leave empty to detect and use a random port"
    }
  ]
}
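The inline comments above are explanatory only. Standard JSON parsers do not accept comments, so if AigcPanel loads the file with a strict parser, the file on disk would need to be plain JSON; a minimal comment-free sketch with placeholder values might look like this:
{
  "name": "server-xxx",
  "version": "0.1.0",
  "title": "Voice model",
  "description": "Model description",
  "platformName": "win",
  "platformArch": "x86",
  "entry": "server/main",
  "functions": ["videoGen", "soundTTS", "soundClone"],
  "settings": [
    {
      "name": "port",
      "type": "text",
      "title": "Server port",
      "default": "",
      "placeholder": "Leave empty to detect and use a random port"
    }
  ]
}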
The following uses MuseTalk as an example of server.js:
const serverRuntime = {
    port: 0,
}
let shellController = null

module.exports = {
    ServerApi: null,
    // Base URL of the model's local HTTP (Gradio) service
    _url() {
        return `http://localhost:${serverRuntime.port}/`
    },
    // Connect a Gradio client to the model service
    async _client() {
        return await this.ServerApi.GradioClient.connect(this._url());
    },
    // Send a status event back to the panel
    _send(serverInfo, type, data) {
        this.ServerApi.event.sendChannel(serverInfo.eventChannelName, {type, data})
    },
    // Model initialization
    async init(ServerApi) {
        this.ServerApi = ServerApi;
    },
    // Start the model
    async start(serverInfo) {
        console.log('start', JSON.stringify(serverInfo))
        this._send(serverInfo, 'starting', serverInfo)
        let command = []
        if (serverInfo.setting?.['port']) {
            serverRuntime.port = serverInfo.setting.port
        } else if (!serverRuntime.port || !await this.ServerApi.app.isPortAvailable(serverRuntime.port)) {
            serverRuntime.port = await this.ServerApi.app.availablePort(50617)
        }
        if (serverInfo.setting?.['startCommand']) {
            command.push(serverInfo.setting.startCommand)
        } else {
            //command.push(`"${serverInfo.localPath}/server/main"`)
            command.push(`"${serverInfo.localPath}/server/.ai/python.exe"`)
            command.push('-u')
            command.push(`"${serverInfo.localPath}/server/run.py"`)
            if (serverInfo.setting?.['gpuMode'] === 'cpu') {
                command.push('--gpu_mode=cpu')
            }
        }
        shellController = await this.ServerApi.app.spawnShell(command, {
            cwd: `${serverInfo.localPath}/server`,
            env: {
                GRADIO_SERVER_PORT: serverRuntime.port,
                PATH: [
                    process.env.PATH,
                    `${serverInfo.localPath}/server`,
                    `${serverInfo.localPath}/server/.ai/ffmpeg/bin`,
                ].join(';')
            },
            stdout: (data) => {
                this.ServerApi.file.appendText(serverInfo.logFile, data)
            },
            stderr: (data) => {
                this.ServerApi.file.appendText(serverInfo.logFile, data)
            },
            success: (data) => {
                this._send(serverInfo, 'success', serverInfo)
            },
            error: (data, code) => {
                this.ServerApi.file.appendText(serverInfo.logFile, data)
                this._send(serverInfo, 'error', serverInfo)
            },
        })
    },
    // Model startup check: the model service is expected to respond at /info
    async ping(serverInfo) {
        try {
            const res = await this.ServerApi.request(`${this._url()}info`)
            return true
        } catch (e) {
        }
        return false
    },
    // Stop the model
    async stop(serverInfo) {
        this._send(serverInfo, 'stopping', serverInfo)
        try {
            shellController.stop()
            shellController = null
        } catch (e) {
            console.log('stop error', e)
        }
        this._send(serverInfo, 'stopped', serverInfo)
    },
    // Model configuration
    async config() {
        return {
            "code": 0,
            "msg": "ok",
            "data": {
                "httpUrl": shellController ? this._url() : null,
                "functions": {
                    "videoGen": {
                        "param": [
                            {
                                name: "box",
                                type: "inputNumber",
                                title: "Mouth openness",
                                defaultValue: -7,
                                placeholder: "",
                                tips: 'Controls how wide the mouth opens in the generated video',
                                min: -9,
                                max: 9,
                                step: 1,
                            }
                        ]
                    },
                }
            }
        }
    },
    // Video generation
    async videoGen(serverInfo, data) {
        console.log('videoGen', serverInfo, data)
        const client = await this._client()
        const resultData = {
            // success, querying, retry
            type: 'success',
            start: 0,
            end: 0,
            jobId: '',
            data: {
                filePath: null
            }
        }
        resultData.start = Date.now()
        const result = await client.predict("/predict", [
            this.ServerApi.GradioHandleFile(data.videoFile),
            this.ServerApi.GradioHandleFile(data.soundFile),
            parseInt(data.param.box)
        ]);
        // console.log('videoGen.result', JSON.stringify(result))
        resultData.end = Date.now()
        resultData.data.filePath = result.data[0].value.video.path
        return {
            code: 0,
            msg: 'ok',
            data: resultData
        }
    },
}
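The ServerApi object above is injected by AigcPanel at runtime, so server.js is not meant to run on its own. As a rough illustration of the lifecycle the panel drives (init, then start/ping/stop and the declared functions such as videoGen), here is a hypothetical local harness; the fakeServerApi stub and the test-server.js name are assumptions for illustration, not part of AigcPanel's API:
// test-server.js - a minimal, hypothetical harness for poking at server.js locally.
// It only stubs the single ServerApi method that ping() uses; the real object
// provided by AigcPanel is much richer (app, file, event, GradioClient, ...).
const server = require('./server.js')

const fakeServerApi = {
    // bare HTTP GET, used by ping(); relies on the global fetch available in Node 18+
    request: (url) => fetch(url).then(res => res.text()),
}

async function main() {
    await server.init(fakeServerApi)
    // config() returns static metadata, so it works without the model process running
    console.log('config:', JSON.stringify(await server.config(), null, 2))
    // ping() returns false here because no model service is listening yet
    console.log('ping:', await server.ping({}))
}

main()
In the actual application, start() is presumably invoked by the panel when the user launches the model, with serverInfo describing the installed package (localPath, setting, logFile, eventChannelName).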
Built with Electron, Vue 3, and TypeScript.
Only tested on Node 20.
# Install dependencies
npm install
# Run in development mode
npm run dev
# Build
npm run build
Community groups: WeChat group | QQ group
License: AGPL-3.0
Alternative AI tools for aigcpanel
Similar Open Source Tools
![go-anthropic Screenshot](/screenshots_githubs/liushuangls-go-anthropic.jpg)
go-anthropic
Go-anthropic is an unofficial API wrapper for Anthropic Claude in Go. It supports completions, streaming completions, messages, streaming messages, vision, and tool use. Users can interact with the Anthropic Claude API to generate text completions, analyze messages, process images, and utilize specific tools for various tasks.
![Senparc.AI Screenshot](/screenshots_githubs/Senparc-Senparc.AI.jpg)
Senparc.AI
Senparc.AI is an AI extension package for the Senparc ecosystem, focusing on LLM (Large Language Models) interaction. It provides modules for standard interfaces and basic functionalities, as well as interfaces using SemanticKernel for plug-and-play capabilities. The package also includes a library for supporting the 'PromptRange' ecosystem, compatible with various systems and frameworks. Users can configure different AI platforms and models, define AI interface parameters, and run AI functions easily. The package offers examples and commands for dialogue, embedding, and DallE drawing operations.
![python-genai Screenshot](/screenshots_githubs/googleapis-python-genai.jpg)
python-genai
The Google Gen AI SDK is a Python library that provides access to Google AI and Vertex AI services. It allows users to create clients for different services, work with parameter types, models, generate content, call functions, handle JSON response schemas, stream text and image content, perform async operations, count and compute tokens, embed content, generate and upscale images, edit images, work with files, create and get cached content, tune models, distill models, perform batch predictions, and more. The SDK supports various features like automatic function support, manual function declaration, JSON response schema support, streaming for text and image content, async methods, tuning job APIs, distillation, batch prediction, and more.
![orch Screenshot](/screenshots_githubs/guywaldman-orch.jpg)
orch
orch is a library for building language model powered applications and agents for the Rust programming language. It can be used for tasks such as text generation, streaming text generation, structured data generation, and embedding generation. The library provides functionalities for executing various language model tasks and can be integrated into different applications and contexts. It offers flexibility for developers to create language model-powered features and applications in Rust.
![airtable Screenshot](/screenshots_githubs/mehanizm-airtable.jpg)
airtable
A simple Golang package to access the Airtable API. It provides functionalities to interact with Airtable such as initializing client, getting tables, listing records, adding records, updating records, deleting records, and bulk deleting records. The package is compatible with Go 1.13 and above.
![RagaAI-Catalyst Screenshot](/screenshots_githubs/raga-ai-hub-RagaAI-Catalyst.jpg)
RagaAI-Catalyst
RagaAI Catalyst is a comprehensive platform designed to enhance the management and optimization of LLM projects. It offers features such as project management, dataset management, evaluation management, trace management, prompt management, synthetic data generation, and guardrail management. These functionalities enable efficient evaluation and safeguarding of LLM applications.
![acte Screenshot](/screenshots_githubs/j66n-acte.jpg)
acte
Acte is a framework designed to build GUI-like tools for AI Agents. It aims to address the issues of cognitive load and freedom degrees when interacting with multiple APIs in complex scenarios. By providing a graphical user interface (GUI) for Agents, Acte helps reduce cognitive load and constraints interaction, similar to how humans interact with computers through GUIs. The tool offers APIs for starting new sessions, executing actions, and displaying screens, accessible via HTTP requests or the SessionManager class.
![langchain-rust Screenshot](/screenshots_githubs/Abraxas-365-langchain-rust.jpg)
langchain-rust
LangChain Rust is a library for building applications with Large Language Models (LLMs) through composability. It provides a set of tools and components that can be used to create conversational agents, document loaders, and other applications that leverage LLMs. LangChain Rust supports a variety of LLMs, including OpenAI, Azure OpenAI, Ollama, and Anthropic Claude. It also supports a variety of embeddings, vector stores, and document loaders. LangChain Rust is designed to be easy to use and extensible, making it a great choice for developers who want to build applications with LLMs.
![herc.ai Screenshot](/screenshots_githubs/Bes-js-herc.ai.jpg)
herc.ai
Herc.ai is a powerful library for interacting with the Herc.ai API. It offers free access to users and supports all languages. Users can benefit from Herc.ai's features unlimitedly with a one-time subscription and API key. The tool provides functionalities for question answering and text-to-image generation, with support for various models and customization options. Herc.ai can be easily integrated into CLI, CommonJS, TypeScript, and supports beta models for advanced usage. Developed by FiveSoBes and Luppux Development.
![json-translator Screenshot](/screenshots_githubs/mololab-json-translator.jpg)
json-translator
The json-translator repository provides a free tool to translate JSON/YAML files or JSON objects into different languages using various translation modules. It supports CLI usage and package support, allowing users to translate words, sentences, JSON objects, and JSON files. The tool also offers multi-language translation, ignoring specific words, and safe translation practices. Users can contribute to the project by updating CLI, translation functions, JSON operations, and more. The roadmap includes features like Libre Translate option, Argos Translate option, Bing Translate option, and support for additional translation modules.
![aio-scrapy Screenshot](/screenshots_githubs/ConlinH-aio-scrapy.jpg)
aio-scrapy
Aio-scrapy is an asyncio-based web crawling and web scraping framework inspired by Scrapy. It supports distributed crawling/scraping, implements compatibility with scrapyd, and provides options for using redis queue and rabbitmq queue. The framework is designed for fast extraction of structured data from websites. Aio-scrapy requires Python 3.9+ and is compatible with Linux, Windows, macOS, and BSD systems.
![agents-flex Screenshot](/screenshots_githubs/agents-flex-agents-flex.jpg)
agents-flex
Agents-Flex is a LLM Application Framework like LangChain base on Java. It provides a set of tools and components for building LLM applications, including LLM Visit, Prompt and Prompt Template Loader, Function Calling Definer, Invoker and Running, Memory, Embedding, Vector Storage, Resource Loaders, Document, Splitter, Loader, Parser, LLMs Chain, and Agents Chain.
![aiotdlib Screenshot](/screenshots_githubs/pylakey-aiotdlib.jpg)
aiotdlib
aiotdlib is a Python asyncio Telegram client based on TDLib. It provides automatic generation of types and functions from tl schema, validation, good IDE type hinting, and high-level API methods for simpler work with tdlib. The package includes prebuilt TDLib binaries for macOS (arm64) and Debian Bullseye (amd64). Users can use their own binary by passing `library_path` argument to `Client` class constructor. Compatibility with other versions of the library is not guaranteed. The tool requires Python 3.9+ and users need to get their `api_id` and `api_hash` from Telegram docs for installation and usage.
For similar tasks
![InvokeAI Screenshot](/screenshots_githubs/invoke-ai-InvokeAI.jpg)
InvokeAI
InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.
![Open-Sora-Plan Screenshot](/screenshots_githubs/PKU-YuanGroup-Open-Sora-Plan.jpg)
Open-Sora-Plan
Open-Sora-Plan is a project that aims to create a simple and scalable repo to reproduce Sora (OpenAI, but we prefer to call it "ClosedAI"). The project is still in its early stages, but the team is working hard to improve it and make it more accessible to the open-source community. The project is currently focused on training an unconditional model on a landscape dataset, but the team plans to expand the scope of the project in the future to include text2video experiments, training on video2text datasets, and controlling the model with more conditions.
![comflowyspace Screenshot](/screenshots_githubs/6174-comflowyspace.jpg)
comflowyspace
Comflowyspace is an open-source AI image and video generation tool that aims to provide a more user-friendly and accessible experience than existing tools like SDWebUI and ComfyUI. It simplifies the installation, usage, and workflow management of AI image and video generation, making it easier for users to create and explore AI-generated content. Comflowyspace offers features such as one-click installation, workflow management, multi-tab functionality, workflow templates, and an improved user interface. It also provides tutorials and documentation to lower the learning curve for users. The tool is designed to make AI image and video generation more accessible and enjoyable for a wider range of users.
![Rewind-AI-Main Screenshot](/screenshots_githubs/aidanmcintosh07-Rewind-AI-Main.jpg)
Rewind-AI-Main
Rewind AI is a free and open-source AI-powered video editing tool that allows users to easily create and edit videos. It features a user-friendly interface, a wide range of editing tools, and support for a variety of video formats. Rewind AI is perfect for beginners and experienced video editors alike.
![MoneyPrinterTurbo Screenshot](/screenshots_githubs/harry0703-MoneyPrinterTurbo.jpg)
MoneyPrinterTurbo
MoneyPrinterTurbo is a tool that can automatically generate video content based on a provided theme or keyword. It can create video scripts, materials, subtitles, and background music, and then compile them into a high-definition short video. The tool features a web interface and an API interface, supporting AI-generated video scripts, customizable scripts, multiple HD video sizes, batch video generation, customizable video segment duration, multilingual video scripts, multiple voice synthesis options, subtitle generation with font customization, background music selection, access to high-definition and copyright-free video materials, and integration with various AI models like OpenAI, moonshot, Azure, and more. The tool aims to simplify the video creation process and offers future plans to enhance voice synthesis, add video transition effects, provide more video material sources, offer video length options, include free network proxies, enable real-time voice and music previews, support additional voice synthesis services, and facilitate automatic uploads to YouTube platform.
![Dough Screenshot](/screenshots_githubs/banodoco-Dough.jpg)
Dough
Dough is a tool for crafting videos with AI, allowing users to guide video generations with precision using images and example videos. Users can create guidance frames, assemble shots, and animate them by defining parameters and selecting guidance videos. The tool aims to help users make beautiful and unique video creations, providing control over the generation process. Setup instructions are available for Linux and Windows platforms, with detailed steps for installation and running the app.
![ragdoll-studio Screenshot](/screenshots_githubs/bennyschmidt-ragdoll-studio.jpg)
ragdoll-studio
Ragdoll Studio is a platform offering web apps and libraries for interacting with Ragdoll, enabling users to go beyond fine-tuning and create flawless creative deliverables, rich multimedia, and engaging experiences. It provides various modes such as Story Mode for creating and chatting with characters, Vector Mode for producing vector art, Raster Mode for producing raster art, Video Mode for producing videos, Audio Mode for producing audio, and 3D Mode for producing 3D objects. Users can export their content in various formats and share their creations on the community site. The platform consists of a Ragdoll API and a front-end React application for seamless usage.
![Whisper-TikTok Screenshot](/screenshots_githubs/MatteoFasulo-Whisper-TikTok.jpg)
Whisper-TikTok
Discover Whisper-TikTok, an innovative AI-powered tool that leverages the prowess of Edge TTS, OpenAI-Whisper, and FFMPEG to craft captivating TikTok videos. Whisper-TikTok effortlessly generates accurate transcriptions from audio files and integrates Microsoft Edge Cloud Text-to-Speech API for vibrant voiceovers. The program orchestrates the synthesis of videos using a structured JSON dataset, generating mesmerizing TikTok content in minutes.
For similar jobs
![Thor Screenshot](/screenshots_githubs/AIDotNet-Thor.jpg)
Thor
Thor is a powerful AI model management tool designed for unified management and usage of various AI models. It offers features such as user, channel, and token management, data statistics preview, log viewing, system settings, external chat link integration, and Alipay account balance purchase. Thor supports multiple AI models including OpenAI, Kimi, Starfire, Claudia, Zhilu AI, Ollama, Tongyi Qianwen, AzureOpenAI, and Tencent Hybrid models. It also supports various databases like SqlServer, PostgreSql, Sqlite, and MySql, allowing users to choose the appropriate database based on their needs.
![Qwen-TensorRT-LLM Screenshot](/screenshots_githubs/Tlntin-Qwen-TensorRT-LLM.jpg)
Qwen-TensorRT-LLM
Qwen-TensorRT-LLM is a project developed for the NVIDIA TensorRT Hackathon 2023, focusing on accelerating inference for the Qwen-7B-Chat model using TRT-LLM. The project offers various functionalities such as FP16/BF16 support, INT8 and INT4 quantization options, Tensor Parallel for multi-GPU parallelism, web demo setup with gradio, Triton API deployment for maximum throughput/concurrency, fastapi integration for openai requests, CLI interaction, and langchain support. It supports models like qwen2, qwen, and qwen-vl for both base and chat models. The project also provides tutorials on Bilibili and blogs for adapting Qwen models in NVIDIA TensorRT-LLM, along with hardware requirements and quick start guides for different model types and quantization methods.
![dl_model_infer Screenshot](/screenshots_githubs/yhwang-hub-dl_model_infer.jpg)
dl_model_infer
This project is a c++ version of the AI reasoning library that supports the reasoning of tensorrt models. It provides accelerated deployment cases of deep learning CV popular models and supports dynamic-batch image processing, inference, decode, and NMS. The project has been updated with various models and provides tutorials for model exports. It also includes a producer-consumer inference model for specific tasks. The project directory includes implementations for model inference applications, backend reasoning classes, post-processing, pre-processing, and target detection and tracking. Speed tests have been conducted on various models, and onnx downloads are available for different models.
![joliGEN Screenshot](/screenshots_githubs/jolibrain-joliGEN.jpg)
joliGEN
JoliGEN is an integrated framework for training custom generative AI image-to-image models. It implements GAN, Diffusion, and Consistency models for various image translation tasks, including domain and style adaptation with conservation of semantics. The tool is designed for real-world applications such as Controlled Image Generation, Augmented Reality, Dataset Smart Augmentation, and Synthetic to Real transforms. JoliGEN allows for fast and stable training with a REST API server for simplified deployment. It offers a wide range of options and parameters with detailed documentation available for models, dataset formats, and data augmentation.
![ai-edge-torch Screenshot](/screenshots_githubs/google-ai-edge-ai-edge-torch.jpg)
ai-edge-torch
AI Edge Torch is a Python library that supports converting PyTorch models into a .tflite format for on-device applications on Android, iOS, and IoT devices. It offers broad CPU coverage with initial GPU and NPU support, closely integrating with PyTorch and providing good coverage of Core ATen operators. The library includes a PyTorch converter for model conversion and a Generative API for authoring mobile-optimized PyTorch Transformer models, enabling easy deployment of Large Language Models (LLMs) on mobile devices.
![awesome-RK3588 Screenshot](/screenshots_githubs/choushunn-awesome-RK3588.jpg)
awesome-RK3588
RK3588 is a flagship 8K SoC chip by Rockchip, integrating Cortex-A76 and Cortex-A55 cores with NEON coprocessor for 8K video codec. This repository curates resources for developing with RK3588, including official resources, RKNN models, projects, development boards, documentation, tools, and sample code.
![cl-waffe2 Screenshot](/screenshots_githubs/hikettei-cl-waffe2.jpg)
cl-waffe2
cl-waffe2 is an experimental deep learning framework in Common Lisp, providing fast, systematic, and customizable matrix operations, reverse mode tape-based Automatic Differentiation, and neural network model building and training features accelerated by a JIT Compiler. It offers abstraction layers, extensibility, inlining, graph-level optimization, visualization, debugging, systematic nodes, and symbolic differentiation. Users can easily write extensions and optimize their networks without overheads. The framework is designed to eliminate barriers between users and developers, allowing for easy customization and extension.