FFAIVideo

A node.js project that generates short videos using popular AI LLM.

Stars: 55

Visit

FFAIVideo is a lightweight node.js project that utilizes popular AI LLM to intelligently generate short videos. It supports multiple AI LLM models such as OpenAI, Moonshot, Azure, g4f, Google Gemini, etc. Users can input text to automatically synthesize exciting video content with subtitles, background music, and customizable settings. The project integrates Microsoft Edge's online text-to-speech service for voice options and uses Pexels website for video resources. Installation of FFmpeg is essential for smooth operation. Inspired by MoneyPrinterTurbo, MoneyPrinter, and MsEdgeTTS, FFAIVideo is designed for front-end developers with minimal dependencies and simple usage.

README:

A node.js project that generates short videos using popular AI LLM.

FFAIVideo

A lightweight node.js project that utilizes the currently popular AI LLM in the industry to intelligently generate short videos. Without the need for complex configurations, simply input a short piece of text, and it can automatically synthesize an exciting video content.

Features

Fully developed on Node.js, enabling quick mastery by front-end developers.
Minimal dependencies, easy installation, cross-platform, and low machine requirements.
Simple usage, just input text to intelligently create compelling videos.
Supports Chinese and English scripts, with multiple voice options.
Includes subtitle generation, adjustable for font, position, color, and size.
Supports background music with adjustable volume.

Installation

npm install ffaivideo

Note: To run the preceding commands, Node.js and npm must be installed.

Example usage

const { generateVideo } = require('ffaivideo');

generateVideo(
  {
    provider: 'g4f',
    // Use the free g4f, or OpenAI, or Moonshot account
    // provider: 'openai',
    // openai: {
    //   apiKey: 'xxx',
    //   modelName: 'xxx',
    //   baseUrl: 'xxx',
    // },
    termsNum: 8,
    subtitleMaxWidth: 9,
    videoClipDuration: 12,
    voiceName: 'zh-CN-YunjianNeural',
    bgMusic: path.join(__dirname, './assets/songs/m1.mp3'),
    output: path.join(__dirname, './output'),
    pexels: {
      apiKey: 'xxx',
    },
    videoScript: `
    ...Enter your text here
  `,
  },
  progress => {
    console.log(progress);
  },
).then(videoPath => {
  console.log(videoPath);
});

Installation preparation

About the config of LLM

The current project already supports multiple AI LLM models such as OpenAI, Moonshot, Azure, g4f, Google Gemini, etc. to meet your different needs. If you want to introduce other AI LLM models, please fork this project and submit a Pull Request (PR) for us to evaluate and merge.

Before using this project, please make sure that you have applied for an API Key from the corresponding service provider. For example, if you plan to use GPT-4.0 or GPT-3.5, you need to make sure that you already have an API Key from OpenAI. In addition, you can also choose to use g4f, which is an open source library that provides free GPT usage services. Please note that although g4f is free, its service stability may fluctuate, and the usage experience may be good and bad from time to time. You can find its repository link on GitHub: https://github.com/xtekky/gpt4free.

In addition, as another option, you can apply for API services by visiting the Moonshot ai platform. After registration, you will immediately receive 15 of experience money, which is enough to support about 1,500 conversations. After successfully applying, you need to set the provider to moonshot and configure the corresponding apiKey to complete the project setup.

You need to configure apiKey, modelName and baseUrl. For azure ai, you also need to configure apiVersion.

openai: {
  apiKey: 'xxxx',
  modelName: 'gpt-4-turbo-preview',
  baseUrl: 'https://api.openai.com/v1',
},

About video material site

The video resources of this project use the Pexels website. Please visit https://www.pexels.com/api/new/ and follow the instructions to apply for a new API key so that you can use the rich materials provided by Pexels in your project.

About voice tts

FFAIVideo integrates Microsoft Edge's online text-to-speech service, powered by Microsoft Azure. Furthermore, it enables users to customize and set up their own app tokens for more flexible configuration and utilization of this service. As for the various voice options available for setup, you can find a detailed list in the following file within the GitHub repository: https://github.com/drawcall/FFAIVideo/blob/main/src/config/voice-config.ts. This way, you can select the most suitable voice according to your specific needs.

About installing ffmpeg

Since FFAIVideo relies on FFmpeg for its functionality, it is essential that you install a standard, well-maintained version of FFmpeg. This will ensure that FFAIVideo operates smoothly and without any compatibility issues.

How to Install and Use FFmpeg on CentOS https://linuxize.com/post/how-to-install-ffmpeg-on-centos-7/
How to Install FFmpeg on Debian https://linuxize.com/post/how-to-install-ffmpeg-on-debian-9/
How to compiling from Source on Linux https://www.tecmint.com/install-ffmpeg-in-linux/

API Configuration

Parameter name	Type	Default value	Description
provider	string	g4f	LLM Provider
moonshot	LLMConfig	-	Moonshot configuration
openai	LLMConfig	-	OpenAI configuration
azure	LLMConfig	-	Azure configuration
gemini	LLMConfig	-	Gemini configuration
g4f	LLMConfig	-	G4F configuration
pexels	MaterialSite	-	Pexels material site
videoScript	string	-	Script for generating videos
videoTerms	string \| string[]	-	Keywords for generating videos
videoAspect	VideoAspect	undefined	Video aspect ratio, can be undefined by default
videoClipDuration	number	5	Video clip duration, default is 5 seconds
termsNum	number	5	Number of keywords
output	string	-	Output path
cacheDir	string	-	Cache directory
voiceName	string	-	Voice name
voiceVolume	number	1.0	Voice volume, default is 1.0
bgMusic	string	-	Background music
bgMusicVolume	number	0.5	Background music volume, default is 0.2
fontsDir	string	-	Font directory
fontSize	number	24	Font size
fontName	string	-	Font name
textColor	string	"#FFFFFF"	Text color, default is "#FFFFFF"
strokeColor	string	"#000000"	Stroke color, default is "#000000"
strokeWidth	number	-	Stroke width
textBottom	number	20	Text bottom position
subtitleMaxWidth	number	-	Maximum subtitle width
debug	boolean	false	Debug mode
lastTime	number	5	Last time
removeCache	boolean	true	Whether to remove cache

Reference Project

This project is inspired by and builds upon the open-source contributions from several notable repositories, including MoneyPrinterTurbo, MoneyPrinter, and MsEdgeTTS. We express our sincere gratitude to the original authors for their dedication to the open-source community and their innovative spirit.

License

MIT LICENSE

For Tasks:

Click tags to check more tools for each tasks

create videos customize subtitles generate voiceovers apply background music integrate ai models

For Jobs:

video editor content creator front-end developer ai engineer digital marketer

Alternative AI tools for FFAIVideo

Similar Open Source Tools

FFAIVideo

github

: 55

llm-graph-builder

Knowledge Graph Builder App is a tool designed to convert PDF documents into a structured knowledge graph stored in Neo4j. It utilizes OpenAI's GPT/Diffbot LLM to extract nodes, relationships, and properties from PDF text content. Users can upload files from local machine or S3 bucket, choose LLM model, and create a knowledge graph. The app integrates with Neo4j for easy visualization and querying of extracted information.

github

: 3.2k

SemanticFinder

SemanticFinder is a frontend-only live semantic search tool that calculates embeddings and cosine similarity client-side using transformers.js and SOTA embedding models from Huggingface. It allows users to search through large texts like books with pre-indexed examples, customize search parameters, and offers data privacy by keeping input text in the browser. The tool can be used for basic search tasks, analyzing texts for recurring themes, and has potential integrations with various applications like wikis, chat apps, and personal history search. It also provides options for building browser extensions and future ideas for further enhancements and integrations.

github

: 204

boost

Laravel Boost accelerates AI-assisted development by providing essential context and structure for generating high-quality, Laravel-specific code. It includes an MCP server with specialized tools, AI guidelines, and a Documentation API. Boost is designed to streamline AI-assisted coding workflows by offering precise, context-aware results and extensive Laravel-specific information.

github

: 2.5k

GenAIExamples

This project provides a collective list of Generative AI (GenAI) and Retrieval-Augmented Generation (RAG) examples such as chatbot with question and answering (ChatQnA), code generation (CodeGen), document summary (DocSum), etc.

github

: 398

TableLLM

TableLLM is a large language model designed for efficient tabular data manipulation tasks in real office scenarios. It can generate code solutions or direct text answers for tasks like insert, delete, update, query, merge, and chart operations on tables embedded in spreadsheets or documents. The model has been fine-tuned based on CodeLlama-7B and 13B, offering two scales: TableLLM-7B and TableLLM-13B. Evaluation results show its performance on benchmarks like WikiSQL, Spider, and self-created table operation benchmark. Users can use TableLLM for code and text generation tasks on tabular data.

github

: 77

opik

Comet Opik is a repository containing two main services: a frontend and a backend. It provides a Python SDK for easy installation. Users can run the full application locally with minikube, following specific installation prerequisites. The repository structure includes directories for applications like Opik backend, with detailed instructions available in the README files. Users can manage the installation using simple k8s commands and interact with the application via URLs for checking the running application and API documentation. The repository aims to facilitate local development and testing of Opik using Kubernetes technology.

github

: 14.3k

AI-Toolbox

AI-Toolbox is a C++ library aimed at representing and solving common AI problems, with a focus on MDPs, POMDPs, and related algorithms. It provides an easy-to-use interface that is extensible to many problems while maintaining readable code. The toolbox includes tutorials for beginners in reinforcement learning and offers Python bindings for seamless integration. It features utilities for combinatorics, polytopes, linear programming, sampling, distributions, statistics, belief updating, data structures, logging, seeding, and more. Additionally, it supports bandit/normal games, single agent MDP/stochastic games, single agent POMDP, and factored/joint multi-agent scenarios.

github

: 657

last_layer

last_layer is a security library designed to protect LLM applications from prompt injection attacks, jailbreaks, and exploits. It acts as a robust filtering layer to scrutinize prompts before they are processed by LLMs, ensuring that only safe and appropriate content is allowed through. The tool offers ultra-fast scanning with low latency, privacy-focused operation without tracking or network calls, compatibility with serverless platforms, advanced threat detection mechanisms, and regular updates to adapt to evolving security challenges. It significantly reduces the risk of prompt-based attacks and exploits but cannot guarantee complete protection against all possible threats.

github

: 79

flowgen

FlowGen is a tool built for AutoGen, a great agent framework from Microsoft and a lot of contributors. It provides intuitive visual tools that streamline the construction and oversight of complex agent-based workflows, simplifying the process for creators and developers. Users can create Autoflows, chat with agents, and share flow templates. The tool is fully dockerized and supports deployment on Railway.app. Contributions to the project are welcome, and the platform uses semantic-release for versioning and releases.

github

: 123

cambrian

Cambrian-1 is a fully open project focused on exploring multimodal Large Language Models (LLMs) with a vision-centric approach. It offers competitive performance across various benchmarks with models at different parameter levels. The project includes training configurations, model weights, instruction tuning data, and evaluation details. Users can interact with Cambrian-1 through a Gradio web interface for inference. The project is inspired by LLaVA and incorporates contributions from Vicuna, LLaMA, and Yi. Cambrian-1 is licensed under Apache 2.0 and utilizes datasets and checkpoints subject to their respective original licenses.

github

: 1.4k

crewAI-quickstart

CrewAI quickstart is a small project providing starter templates for an easy start with CrewAI. It includes notebooks, Python scripts, GUI with Streamlit, and Local LLMs for various tasks like web search, CSV lookup, web scraping, PDF search, and more. Contributions are welcome to enhance the project.

github

: 209

$MathEval Screenshot$

MathEval

MathEval is a benchmark designed for evaluating the mathematical capabilities of large models. It includes over 20 evaluation datasets covering various mathematical domains with more than 30,000 math problems. The goal is to assess the performance of large models across different difficulty levels and mathematical subfields. MathEval serves as a reliable reference for comparing mathematical abilities among large models and offers guidance on enhancing their mathematical capabilities in the future.

github

: 56

floneum

Floneum is a graph editor that makes it easy to develop your own AI workflows. It uses large language models (LLMs) to run AI models locally, without any external dependencies or even a GPU. This makes it easy to use LLMs with your own data, without worrying about privacy. Floneum also has a plugin system that allows you to improve the performance of LLMs and make them work better for your specific use case. Plugins can be used in any language that supports web assembly, and they can control the output of LLMs with a process similar to JSONformer or guidance.

github

: 1.8k

my-best-resources

my-best-resources is a curated list of resources for web developers and designers, aimed at making their lives easier. It includes sections on design inspiration, ready-made components, stock images, artificial intelligence tools for various use cases, and other useful sources. The repository provides links and descriptions for each resource, offering a valuable collection of tools and assets for web development and design projects.

github

: 66

dl_model_infer

This project is a c++ version of the AI reasoning library that supports the reasoning of tensorrt models. It provides accelerated deployment cases of deep learning CV popular models and supports dynamic-batch image processing, inference, decode, and NMS. The project has been updated with various models and provides tutorials for model exports. It also includes a producer-consumer inference model for specific tasks. The project directory includes implementations for model inference applications, backend reasoning classes, post-processing, pre-processing, and target detection and tracking. Speed tests have been conducted on various models, and onnx downloads are available for different models.

github

: 87

For similar tasks

FFAIVideo

github

: 55

InvokeAI

InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.

github

: 25.9k

Open-Sora-Plan

Open-Sora-Plan is a project that aims to create a simple and scalable repo to reproduce Sora (OpenAI, but we prefer to call it "ClosedAI"). The project is still in its early stages, but the team is working hard to improve it and make it more accessible to the open-source community. The project is currently focused on training an unconditional model on a landscape dataset, but the team plans to expand the scope of the project in the future to include text2video experiments, training on video2text datasets, and controlling the model with more conditions.

github

: 11.8k

comflowyspace

Comflowyspace is an open-source AI image and video generation tool that aims to provide a more user-friendly and accessible experience than existing tools like SDWebUI and ComfyUI. It simplifies the installation, usage, and workflow management of AI image and video generation, making it easier for users to create and explore AI-generated content. Comflowyspace offers features such as one-click installation, workflow management, multi-tab functionality, workflow templates, and an improved user interface. It also provides tutorials and documentation to lower the learning curve for users. The tool is designed to make AI image and video generation more accessible and enjoyable for a wider range of users.

github

: 1.8k

Rewind-AI-Main

Rewind AI is a free and open-source AI-powered video editing tool that allows users to easily create and edit videos. It features a user-friendly interface, a wide range of editing tools, and support for a variety of video formats. Rewind AI is perfect for beginners and experienced video editors alike.

github

: 248

MoneyPrinterTurbo

MoneyPrinterTurbo is a tool that can automatically generate video content based on a provided theme or keyword. It can create video scripts, materials, subtitles, and background music, and then compile them into a high-definition short video. The tool features a web interface and an API interface, supporting AI-generated video scripts, customizable scripts, multiple HD video sizes, batch video generation, customizable video segment duration, multilingual video scripts, multiple voice synthesis options, subtitle generation with font customization, background music selection, access to high-definition and copyright-free video materials, and integration with various AI models like OpenAI, moonshot, Azure, and more. The tool aims to simplify the video creation process and offers future plans to enhance voice synthesis, add video transition effects, provide more video material sources, offer video length options, include free network proxies, enable real-time voice and music previews, support additional voice synthesis services, and facilitate automatic uploads to YouTube platform.

github

: 25.7k

Dough

Dough is a tool for crafting videos with AI, allowing users to guide video generations with precision using images and example videos. Users can create guidance frames, assemble shots, and animate them by defining parameters and selecting guidance videos. The tool aims to help users make beautiful and unique video creations, providing control over the generation process. Setup instructions are available for Linux and Windows platforms, with detailed steps for installation and running the app.

github

: 395

ragdoll-studio

Ragdoll Studio is a platform offering web apps and libraries for interacting with Ragdoll, enabling users to go beyond fine-tuning and create flawless creative deliverables, rich multimedia, and engaging experiences. It provides various modes such as Story Mode for creating and chatting with characters, Vector Mode for producing vector art, Raster Mode for producing raster art, Video Mode for producing videos, Audio Mode for producing audio, and 3D Mode for producing 3D objects. Users can export their content in various formats and share their creations on the community site. The platform consists of a Ragdoll API and a front-end React application for seamless usage.

github

: 156

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

github

: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

github

: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

github

: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

github

: 668

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

github

: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

github

: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

github

: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

github

: 2.2k