MoneyPrinterTurbo
利用AI大模型,一键生成高清短视频 Generate short videos with one click using AI LLM.
Stars: 15278
MoneyPrinterTurbo is a tool that can automatically generate video content based on a provided theme or keyword. It can create video scripts, materials, subtitles, and background music, and then compile them into a high-definition short video. The tool features a web interface and an API interface, supporting AI-generated video scripts, customizable scripts, multiple HD video sizes, batch video generation, customizable video segment duration, multilingual video scripts, multiple voice synthesis options, subtitle generation with font customization, background music selection, access to high-definition and copyright-free video materials, and integration with various AI models like OpenAI, moonshot, Azure, and more. The tool aims to simplify the video creation process and offers future plans to enhance voice synthesis, add video transition effects, provide more video material sources, offer video length options, include free network proxies, enable real-time voice and music previews, support additional voice synthesis services, and facilitate automatic uploads to YouTube platform.
README:
由于该项目的 部署 和 使用,对于一些小白用户来说,还是 有一定的门槛,在此特别感谢
录咖(AI智能 多媒体服务平台) 网站基于该项目,提供的免费AI视频生成器
服务,可以不用部署,直接在线使用,非常方便。
感谢佐糖 https://picwish.cn 对该项目的支持和赞助,使得该项目能够持续的更新和维护。
佐糖专注于图像处理领域,提供丰富的图像处理工具,将复杂操作极致简化,真正实现让图像处理更简单。
- [x] 完整的 MVC架构,代码 结构清晰,易于维护,支持
API
和Web界面
- [x] 支持视频文案 AI自动生成,也可以自定义文案
- [x] 支持多种 高清视频 尺寸
- [x] 竖屏 9:16,
1080x1920
- [x] 横屏 16:9,
1920x1080
- [x] 竖屏 9:16,
- [x] 支持 批量视频生成,可以一次生成多个视频,然后选择一个最满意的
- [x] 支持 视频片段时长 设置,方便调节素材切换频率
- [x] 支持 中文 和 英文 视频文案
- [x] 支持 多种语音 合成,可 实时试听 效果
- [x] 支持 字幕生成,可以调整
字体
、位置
、颜色
、大小
,同时支持字幕描边
设置 - [x] 支持 背景音乐,随机或者指定音乐文件,可设置
背景音乐音量
- [x] 视频素材来源 高清,而且 无版权,也可以使用自己的 本地素材
- [x] 支持 OpenAI、Moonshot、Azure、gpt4free、one-api、通义千问、Google Gemini、Ollama、
DeepSeek、 文心一言 等多种模型接入
- 中国用户建议使用 DeepSeek 或 Moonshot 作为大模型提供商(国内可直接访问,不需要VPN。注册就送额度,基本够用)
- [ ] GPT-SoVITS 配音支持
- [ ] 优化语音合成,利用大模型,使其合成的声音,更加自然,情绪更加丰富
- [ ] 增加视频转场效果,使其看起来更加的流畅
- [ ] 增加更多视频素材来源,优化视频素材和文案的匹配度
- [ ] 增加视频长度选项:短、中、长
- [ ] 支持更多的语音合成服务商,比如 OpenAI TTS
- [ ] 自动上传到YouTube平台
|
更真实的合成声音 |
|
---|---|---|
|
|
---|---|
- 建议最低 CPU 4核或以上,内存 8G 或以上,显卡非必须
- Windows 10 或 MacOS 11.0 以上系统
下载一键启动包,解压直接使用(路径不要有 中文、特殊字符、空格)
- 百度网盘(1.2.1 最新版本): https://pan.baidu.com/s/1pSNjxTYiVENulTLm6zieMQ?pwd=g36q 提取码: g36q
下载后,建议先双击执行 update.bat
更新到最新代码,然后双击 start.bat
启动
启动后,会自动打开浏览器(如果打开是空白,建议换成 Chrome 或者 Edge 打开)
还没有制作一键启动包,看下面的 安装部署 部分,建议使用 docker 部署,更加方便。
- 尽量不要使用 中文路径,避免出现一些无法预料的问题
- 请确保你的 网络 是正常的,VPN需要打开
全局流量
模式
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
- 将
config.example.toml
文件复制一份,命名为config.toml
- 按照
config.toml
文件中的说明,配置好pexels_api_keys
和llm_provider
,并根据 llm_provider 对应的服务商,配置相关的 API Key
如果未安装 Docker,请先安装 https://www.docker.com/products/docker-desktop/
如果是Windows系统,请参考微软的文档:
- https://learn.microsoft.com/zh-cn/windows/wsl/install
- https://learn.microsoft.com/zh-cn/windows/wsl/tutorials/wsl-containers
cd MoneyPrinterTurbo
docker-compose up
打开浏览器,访问 http://0.0.0.0:8501
打开浏览器,访问 http://0.0.0.0:8080/docs 或者 http://0.0.0.0:8080/redoc
视频教程
- 完整的使用演示:https://v.douyin.com/iFhnwsKY/
- 如何在Windows上部署:https://v.douyin.com/iFyjoW3M
建议使用 conda 创建 python 虚拟环境
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
cd MoneyPrinterTurbo
conda create -n MoneyPrinterTurbo python=3.10
conda activate MoneyPrinterTurbo
pip install -r requirements.txt
-
Windows:
- 下载 https://imagemagick.org/script/download.php 选择Windows版本,切记一定要选择 静态库 版本,比如 ImageMagick-7.1.1-32-Q16-x64-static.exe
- 安装下载好的 ImageMagick,注意不要修改安装路径
- 修改
配置文件 config.toml
中的imagemagick_path
为你的 实际安装路径
-
MacOS:
brew install imagemagick
-
Ubuntu
sudo apt-get install imagemagick
-
CentOS
sudo yum install ImageMagick
注意需要到 MoneyPrinterTurbo 项目 根目录
下执行以下命令
conda activate MoneyPrinterTurbo
webui.bat
conda activate MoneyPrinterTurbo
sh webui.sh
启动后,会自动打开浏览器(如果打开是空白,建议换成 Chrome 或者 Edge 打开)
python main.py
启动后,可以查看 API文档
http://127.0.0.1:8080/docs 或者 http://127.0.0.1:8080/redoc 直接在线调试接口,快速体验。
所有支持的声音列表,可以查看:声音列表
2024-04-16 v1.1.2 新增了9种Azure的语音合成声音,需要配置API KEY,该声音合成的更加真实。
当前支持2种字幕生成方式:
-
edge: 生成
速度快
,性能更好,对电脑配置没有要求,但是质量可能不稳定 -
whisper: 生成
速度慢
,性能较差,对电脑配置有一定要求,但是质量更可靠
。
可以修改 config.toml
配置文件中的 subtitle_provider
进行切换
建议使用 edge
模式,如果生成的字幕质量不好,再切换到 whisper
模式
注意:
- whisper 模式下需要到 HuggingFace 下载一个模型文件,大约 3GB 左右,请确保网络通畅
- 如果留空,表示不生成字幕。
由于国内无法访问 HuggingFace,可以使用以下方法下载
whisper-large-v3
的模型文件
下载地址:
- 百度网盘: https://pan.baidu.com/s/11h3Q6tsDtjQKTjUu3sc5cA?pwd=xjs9
- 夸克网盘:https://pan.quark.cn/s/3ee3d991d64b
模型下载后解压,整个目录放到 .\MoneyPrinterTurbo\models
里面,
最终的文件路径应该是这样: .\MoneyPrinterTurbo\models\whisper-large-v3
MoneyPrinterTurbo
├─models
│ └─whisper-large-v3
│ config.json
│ model.bin
│ preprocessor_config.json
│ tokenizer.json
│ vocabulary.json
用于视频的背景音乐,位于项目的 resource/songs
目录下。
当前项目里面放了一些默认的音乐,来自于 YouTube 视频,如有侵权,请删除。
用于视频字幕的渲染,位于项目的 resource/fonts
目录下,你也可以放进去自己的字体。
OpenAI宣布ChatGPT里面3.5已经免费了,有开发者将其封装成了API,可以直接调用
确保你安装和启动了docker服务,执行以下命令启动docker服务
docker run -p 3040:3040 missuo/freegpt35
启动成功后,修改 config.toml
中的配置
-
llm_provider
设置为openai
-
openai_api_key
随便填写一个即可,比如 '123456' -
openai_base_url
改为http://localhost:3040/v1/
-
openai_model_name
改为gpt-3.5-turbo
注意:该方式稳定性较差
这个问题是由于大模型没有返回正确的回复导致的。
大概率是网络原因, 使用 VPN,或者设置 openai_base_url
为你的代理 ,应该就可以解决了。
同时建议使用 Moonshot 或 DeepSeek 作为大模型提供商,这两个服务商在国内访问速度更快,更加稳定。
通常情况下,ffmpeg 会被自动下载,并且会被自动检测到。 但是如果你的环境有问题,无法自动下载,可能会遇到如下错误:
RuntimeError: No ffmpeg exe could be found.
Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
此时你可以从 https://www.gyan.dev/ffmpeg/builds/ 下载ffmpeg,解压后,设置 ffmpeg_path
为你的实际安装路径即可。
[app]
# 请根据你的实际路径设置,注意 Windows 路径分隔符为 \\
ffmpeg_path = "C:\\Users\\harry\\Downloads\\ffmpeg.exe"
可以在ImageMagick的配置文件policy.xml中找到这些策略。
这个文件通常位于 /etc/ImageMagick-X
/ 或 ImageMagick 安装目录的类似位置。
修改包含pattern="@"
的条目,将rights="none"
更改为rights="read|write"
以允许对文件的读写操作。
这个问题是由于系统打开文件数限制导致的,可以通过修改系统的文件打开数限制来解决。
查看当前限制
ulimit -n
如果过低,可以调高一些,比如
ulimit -n 10240
LocalEntryNotfoundEror: Cannot find an appropriate cached snapshotfolderfor the specified revision on the local disk and outgoing trafic has been disabled. To enablerepo look-ups and downloads online, pass 'local files only=False' as input.
或者
An error occured while synchronizing the model Systran/faster-whisper-large-v3 from the Hugging Face Hub: An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again. Trying to load the model directly from the local cache, if it exists.
解决方法:点击查看如何从网盘手动下载模型
- 可以提交 issue 或者 pull request。
该项目基于 https://github.com/FujiwaraChoki/MoneyPrinter 重构而来,做了大量的优化,增加了更多的功能。 感谢原作者的开源精神。
点击查看 LICENSE
文件
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for MoneyPrinterTurbo
Similar Open Source Tools
MoneyPrinterTurbo
MoneyPrinterTurbo is a tool that can automatically generate video content based on a provided theme or keyword. It can create video scripts, materials, subtitles, and background music, and then compile them into a high-definition short video. The tool features a web interface and an API interface, supporting AI-generated video scripts, customizable scripts, multiple HD video sizes, batch video generation, customizable video segment duration, multilingual video scripts, multiple voice synthesis options, subtitle generation with font customization, background music selection, access to high-definition and copyright-free video materials, and integration with various AI models like OpenAI, moonshot, Azure, and more. The tool aims to simplify the video creation process and offers future plans to enhance voice synthesis, add video transition effects, provide more video material sources, offer video length options, include free network proxies, enable real-time voice and music previews, support additional voice synthesis services, and facilitate automatic uploads to YouTube platform.
MarkMap-OpenAi-ChatGpt
MarkMap-OpenAi-ChatGpt is a Vue.js-based mind map generation tool that allows users to generate mind maps by entering titles or content. The application integrates the markmap-lib and markmap-view libraries, supports visualizing mind maps, and provides functions for zooming and adapting the map to the screen. Users can also export the generated mind map in PNG, SVG, JPEG, and other formats. This project is suitable for quickly organizing ideas, study notes, project planning, etc. By simply entering content, users can get an intuitive mind map that can be continuously expanded, downloaded, and shared.
Tianji
Tianji is a free, non-commercial artificial intelligence system developed by SocialAI for tasks involving worldly wisdom, such as etiquette, hospitality, gifting, wishes, communication, awkwardness resolution, and conflict handling. It includes four main technical routes: pure prompt, Agent architecture, knowledge base, and model training. Users can find corresponding source code for these routes in the tianji directory to replicate their own vertical domain AI applications. The project aims to accelerate the penetration of AI into various fields and enhance AI's core competencies.
Code-Interpreter-Api
Code Interpreter API is a project that combines a scheduling center with a sandbox environment, dedicated to creating the world's best code interpreter. It aims to provide a secure, reliable API interface for remotely running code and obtaining execution results, accelerating the development of various AI agents, and being a boon to many AI enthusiasts. The project innovatively combines Docker container technology to achieve secure isolation and execution of Python code. Additionally, the project supports storing generated image data in a PostgreSQL database and accessing it through API endpoints, providing rich data processing and storage capabilities.
AirPower4T
AirPower4T is a development base library based on Vue3 TypeScript Element Plus Vite, using decorators, object-oriented, Hook and other front-end development methods. It provides many common components and some feedback components commonly used in background management systems, and provides a lot of enums and decorators.
AI-Drug-Discovery-Design
AI-Drug-Discovery-Design is a repository focused on Artificial Intelligence-assisted Drug Discovery and Design. It explores the use of AI technology to accelerate and optimize the drug development process. The advantages of AI in drug design include speeding up research cycles, improving accuracy through data-driven models, reducing costs by minimizing experimental redundancies, and enabling personalized drug design for specific patients or disease characteristics.
airda
airda(Air Data Agent) is a multi-agent system for data analysis, which can understand data development and data analysis requirements, understand data, and generate SQL and Python code for data query, data visualization, machine learning and other tasks.
chatgpt-webui
ChatGPT WebUI is a user-friendly web graphical interface for various LLMs like ChatGPT, providing simplified features such as core ChatGPT conversation and document retrieval dialogues. It has been optimized for better RAG retrieval accuracy and supports various search engines. Users can deploy local language models easily and interact with different LLMs like GPT-4, Azure OpenAI, and more. The tool offers powerful functionalities like GPT4 API configuration, system prompt setup for role-playing, and basic conversation features. It also provides a history of conversations, customization options, and a seamless user experience with themes, dark mode, and PWA installation support.
chatgpt-web-sea
ChatGPT Web Sea is an open-source project based on ChatGPT-web for secondary development. It supports all models that comply with the OpenAI interface standard, allows for model selection, configuration, and extension, and is compatible with OneAPI. The tool includes a Chinese ChatGPT tuning guide, supports file uploads, and provides model configuration options. Users can interact with the tool through a web interface, configure models, and perform tasks such as model selection, API key management, and chat interface setup. The project also offers Docker deployment options and instructions for manual packaging.
gzm-design
Gzm Design is a free and open-source poster designer developed using the latest mainstream technologies such as Vue3, Vite4, TypeScript, etc. It provides features like PSD import, JSON import, multiple pages support, shortcut key support, template import, layer management, ruler tool, pen tool, element editing, preview, file download, canvas zooming and dragging, border stroke, filling, blending modes, text formatting, group handling, canvas size modification, rich text support, masking, shadow effects, undo/redo functionality, QR code tool, barcode tool, and ruler line npm package encapsulation.
hugging-llm
HuggingLLM is a project that aims to introduce ChatGPT to a wider audience, particularly those interested in using the technology to create new products or applications. The project focuses on providing practical guidance on how to use ChatGPT-related APIs to create new features and applications. It also includes detailed background information and system design introductions for relevant tasks, as well as example code and implementation processes. The project is designed for individuals with some programming experience who are interested in using ChatGPT for practical applications, and it encourages users to experiment and create their own applications and demos.
SQLAgent
DataAgent is a multi-agent system for data analysis, capable of understanding data development and data analysis requirements, understanding data, and generating SQL and Python code for tasks such as data query, data visualization, and machine learning.
99AI
99AI is a commercializable AI web application based on NineAI 2.4.2 (no authorization, no backdoors, no piracy, integrated front-end and back-end integration packages, supports Docker rapid deployment). The uncompiled source code is temporarily closed. Compared with the stable version, the development version is faster.
paper-ai
Paper-ai is a tool that helps you write papers using artificial intelligence. It provides features such as AI writing assistance, reference searching, and editing and formatting tools. With Paper-ai, you can quickly and easily create high-quality papers.
MouseTooltipTranslator
MouseTooltipTranslator is a Chrome extension that allows users to translate any text on a webpage by simply hovering over it. It supports both Google Translate and Bing Translate, and can also be used to listen to the pronunciation of words and phrases. Additionally, the extension can be used to translate text in input boxes and highlighted text, and to display translated tooltips for PDFs and YouTube videos. It also supports OCR, allowing users to translate text in images by holding down the left shift key and hovering over the image.
ChatMemOllama
ChatMemOllama is a personal WeChat public account chatbot that combines a local AI model (provided by Ollama) and mem0 memory management functionality. The project aims to provide an intelligent, personalized chat experience. It features a local AI model for conversation, memory management through mem0 for a coherent dialogue experience, support for multiple users simultaneously (with logic issues in the test version), and quick responses within 5 seconds to users with timeout prompts. It allows or prohibits other users from calling AI, with ongoing development tasks including debugging multiple user handling logic and keyword replies, and completed tasks such as basic conversation and tool calling. The ultimate goal is to wait for pre-task testing completion.
For similar tasks
MoneyPrinterTurbo
MoneyPrinterTurbo is a tool that can automatically generate video content based on a provided theme or keyword. It can create video scripts, materials, subtitles, and background music, and then compile them into a high-definition short video. The tool features a web interface and an API interface, supporting AI-generated video scripts, customizable scripts, multiple HD video sizes, batch video generation, customizable video segment duration, multilingual video scripts, multiple voice synthesis options, subtitle generation with font customization, background music selection, access to high-definition and copyright-free video materials, and integration with various AI models like OpenAI, moonshot, Azure, and more. The tool aims to simplify the video creation process and offers future plans to enhance voice synthesis, add video transition effects, provide more video material sources, offer video length options, include free network proxies, enable real-time voice and music previews, support additional voice synthesis services, and facilitate automatic uploads to YouTube platform.
InvokeAI
InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.
Open-Sora-Plan
Open-Sora-Plan is a project that aims to create a simple and scalable repo to reproduce Sora (OpenAI, but we prefer to call it "ClosedAI"). The project is still in its early stages, but the team is working hard to improve it and make it more accessible to the open-source community. The project is currently focused on training an unconditional model on a landscape dataset, but the team plans to expand the scope of the project in the future to include text2video experiments, training on video2text datasets, and controlling the model with more conditions.
comflowyspace
Comflowyspace is an open-source AI image and video generation tool that aims to provide a more user-friendly and accessible experience than existing tools like SDWebUI and ComfyUI. It simplifies the installation, usage, and workflow management of AI image and video generation, making it easier for users to create and explore AI-generated content. Comflowyspace offers features such as one-click installation, workflow management, multi-tab functionality, workflow templates, and an improved user interface. It also provides tutorials and documentation to lower the learning curve for users. The tool is designed to make AI image and video generation more accessible and enjoyable for a wider range of users.
Rewind-AI-Main
Rewind AI is a free and open-source AI-powered video editing tool that allows users to easily create and edit videos. It features a user-friendly interface, a wide range of editing tools, and support for a variety of video formats. Rewind AI is perfect for beginners and experienced video editors alike.
Dough
Dough is a tool for crafting videos with AI, allowing users to guide video generations with precision using images and example videos. Users can create guidance frames, assemble shots, and animate them by defining parameters and selecting guidance videos. The tool aims to help users make beautiful and unique video creations, providing control over the generation process. Setup instructions are available for Linux and Windows platforms, with detailed steps for installation and running the app.
ragdoll-studio
Ragdoll Studio is a platform offering web apps and libraries for interacting with Ragdoll, enabling users to go beyond fine-tuning and create flawless creative deliverables, rich multimedia, and engaging experiences. It provides various modes such as Story Mode for creating and chatting with characters, Vector Mode for producing vector art, Raster Mode for producing raster art, Video Mode for producing videos, Audio Mode for producing audio, and 3D Mode for producing 3D objects. Users can export their content in various formats and share their creations on the community site. The platform consists of a Ragdoll API and a front-end React application for seamless usage.
Whisper-TikTok
Discover Whisper-TikTok, an innovative AI-powered tool that leverages the prowess of Edge TTS, OpenAI-Whisper, and FFMPEG to craft captivating TikTok videos. Whisper-TikTok effortlessly generates accurate transcriptions from audio files and integrates Microsoft Edge Cloud Text-to-Speech API for vibrant voiceovers. The program orchestrates the synthesis of videos using a structured JSON dataset, generating mesmerizing TikTok content in minutes.
For similar jobs
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
daily-poetry-image
Daily Chinese ancient poetry and AI-generated images powered by Bing DALL-E-3. GitHub Action triggers the process automatically. Poetry is provided by Today's Poem API. The website is built with Astro.
exif-photo-blog
EXIF Photo Blog is a full-stack photo blog application built with Next.js, Vercel, and Postgres. It features built-in authentication, photo upload with EXIF extraction, photo organization by tag, infinite scroll, light/dark mode, automatic OG image generation, a CMD-K menu with photo search, experimental support for AI-generated descriptions, and support for Fujifilm simulations. The application is easy to deploy to Vercel with just a few clicks and can be customized with a variety of environment variables.
SillyTavern
SillyTavern is a user interface you can install on your computer (and Android phones) that allows you to interact with text generation AIs and chat/roleplay with characters you or the community create. SillyTavern is a fork of TavernAI 1.2.8 which is under more active development and has added many major features. At this point, they can be thought of as completely independent programs.
Twitter-Insight-LLM
This project enables you to fetch liked tweets from Twitter (using Selenium), save it to JSON and Excel files, and perform initial data analysis and image captions. This is part of the initial steps for a larger personal project involving Large Language Models (LLMs).
AISuperDomain
Aila Desktop Application is a powerful tool that integrates multiple leading AI models into a single desktop application. It allows users to interact with various AI models simultaneously, providing diverse responses and insights to their inquiries. With its user-friendly interface and customizable features, Aila empowers users to engage with AI seamlessly and efficiently. Whether you're a researcher, student, or professional, Aila can enhance your AI interactions and streamline your workflow.
ChatGPT-On-CS
This project is an intelligent dialogue customer service tool based on a large model, which supports access to platforms such as WeChat, Qianniu, Bilibili, Douyin Enterprise, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operation, Xiaohongshu, Zhihu, etc. You can choose GPT3.5/GPT4.0/ Lazy Treasure Box (more platforms will be supported in the future), which can process text, voice and pictures, and access external resources such as operating systems and the Internet through plug-ins, and support enterprise AI applications customized based on their own knowledge base.
obs-localvocal
LocalVocal is a live-streaming AI assistant plugin for OBS that allows you to transcribe audio speech into text and perform various language processing functions on the text using AI / LLMs (Large Language Models). It's privacy-first, with all data staying on your machine, and requires no GPU, cloud costs, network, or downtime.