VideoLingo

Netflix-level subtitle cutting, translation, alignment, and even dubbing: a one-click, fully automated AI video subtitle team

Stars: 3993

VideoLingo is an all-in-one video translation, localization, and dubbing tool designed to generate Netflix-level subtitles. It aims to eliminate stiff machine translation and multi-line subtitles, and it can add high-quality dubbing, so that knowledge from around the world can be shared across language barriers. Through an intuitive Streamlit web interface, the entire process from video link to embedded bilingual subtitles, and optionally dubbing, can be completed in just two clicks. Key features include: downloading videos from YouTube links with yt-dlp; word-level, timestamped subtitle recognition with WhisperX; meaning-based subtitle segmentation using NLP and GPT; a GPT-summarized terminology knowledge base for context-aware translation; a three-step process (direct translation, reflection, free translation) that removes awkward machine-translation phrasing; checks on single-line subtitle length and translation quality against Netflix standards; high-quality aligned dubbing with GPT-SoVITS; and an integrated package for one-click startup and one-click output in Streamlit.
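The three-step translation described above (direct translation, reflection, free translation) can be pictured with a minimal Python sketch. This is not VideoLingo's actual code: the function name `three_step_translate`, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns the model's reply) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional


def three_step_translate(
    line: str,
    target_lang: str,
    llm: Callable[[str], str],  # hypothetical: prompt in, model reply out
    terms: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Run direct translation, reflection, and free translation in sequence."""
    # Render the terminology glossary as "source -> target" lines.
    glossary = "\n".join(f"{src} -> {dst}" for src, dst in (terms or {}).items())

    # Step 1: direct (literal) translation, guided by the glossary.
    direct = llm(
        f"Translate literally into {target_lang}.\n"
        f"Glossary:\n{glossary}\n"
        f"Text: {line}"
    )

    # Step 2: reflection, asking the model to critique its own draft.
    reflection = llm(
        f"List problems (stiff phrasing, wrong terms) in this {target_lang} "
        f"translation.\nSource: {line}\nTranslation: {direct}"
    )

    # Step 3: free translation, rewriting the draft using the critique.
    free = llm(
        f"Rewrite the draft in natural {target_lang}, fixing the issues.\n"
        f"Source: {line}\nDraft: {direct}\nIssues: {reflection}"
    )

    return {"direct": direct, "reflection": reflection, "free": free}
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM provider; a stub function can stand in for it when testing.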

README:

VideoLingo: Connecting the World, Frame by Frame

🌟 Overview

VideoLingo is an all-in-one video translation, localization, and dubbing tool aimed at generating Netflix-quality subtitles. It eliminates stiff machine translations and multi-line subtitles while adding high-quality dubbing, enabling global knowledge sharing across language barriers. With an intuitive Streamlit interface, you can transform a video link into a localized video with high-quality bilingual subtitles and dubbing in just a few clicks.

Key features:

  • 🎥 YouTube video download via yt-dlp

  • 🎙️ Word-level subtitle recognition with WhisperX

  • 📝 NLP and GPT-based subtitle segmentation

  • 📚 GPT-generated terminology for coherent translation

  • 🔄 3-step translation process (direct translation, reflection, free translation) rivaling professional quality

  • ✅ Netflix-standard single-line subtitles only

  • 🗣️ Dubbing alignment (e.g., GPT-SoVITS)

  • 🚀 One-click startup and output in Streamlit

  • 📝 Detailed logging with progress resumption

  • 🌐 Comprehensive multi-language support

Difference from similar projects: Single-line subtitles only, superior translation quality
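As a rough illustration of the single-line subtitle rule, the Python sketch below checks a line against a character limit and, if needed, splits the text into several single-line cues. The 42-character limit is an assumption based on common Netflix guidance for Latin-script languages, and the greedy word packing is only a stand-in: VideoLingo itself segments by sentence meaning using NLP and GPT. The function names are hypothetical.

```python
from typing import List

MAX_CHARS = 42  # assumed per-line limit; adjust per target language


def fits_single_line(text: str, max_chars: int = MAX_CHARS) -> bool:
    """Return True if the subtitle fits on one line."""
    return len(text) <= max_chars


def split_into_single_lines(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Greedily pack words into cues that each fit on a single line.

    A word longer than max_chars is kept whole on its own cue.
    """
    cues: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            cues.append(current)
            current = word
    if current:
        cues.append(current)
    return cues
```

Each returned string would become its own timed cue rather than a second line within the same cue, which is what keeps every subtitle on a single line.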

🎥 Demo

Russian Translation


https://github.com/user-attachments/assets/25264b5b-6931-4d39-948c-5a1e4ce42fa7

GPT-SoVITS


https://github.com/user-attachments/assets/47d965b2-b4ab-4a0b-9d08-b49a7bf3508c

OAITTS


https://github.com/user-attachments/assets/85c64f8c-06cf-4af9-b153-ee9d2897b768

Language Support:

Current input language support and examples:

Input Language | Support Level | Translation Demo
English | 🤩 | English to Chinese
Russian | 😊 | Russian to Chinese
French | 🤩 | French to Japanese
German | 🤩 | German to Chinese
Italian | 🤩 | Italian to Chinese
Spanish | 🤩 | Spanish to Chinese
Japanese | 😐 | Japanese to Chinese
Chinese* | 🤩 | Chinese to English

*Chinese requires separate configuration of the whisperX model and is only available with a local source code installation. See the installation documentation for the configuration process, and be sure to set the transcription language to zh in the web page sidebar.

Translation language support depends on the capabilities of the large language model used, while dubbing language depends on the chosen TTS method.

🚀 Quick Start

Online Experience

Try VideoLingo in Colab in just 5 minutes via the "Open In Colab" link.

Local Installation

VideoLingo offers two local installation methods: One-click Simple Package and Source Code Installation. Please refer to the installation documentation: English | Simplified Chinese

Docker Installation

VideoLingo provides a Dockerfile for Docker installation. Please refer to the installation documentation: English | Simplified Chinese

🏭 Batch Mode

Usage instructions: English | Simplified Chinese

⚠️ Current Limitations

  1. UVR5 voice separation is resource-intensive and slow. It is recommended only on devices with more than 16GB of RAM and 8GB of VRAM. Note: for videos with loud background music, skipping voice separation before Whisper transcription may cause word-level subtitles to run together, resulting in errors in the final alignment step.

  2. Dubbing quality may not be perfect because of differences in language structure and information density per morpheme between the source and target languages. For best results, choose a TTS voice whose speech rate suits the original video's pace and content. The best practice is to train a GPT-SoVITS voice on the original video's audio, then use "Mode 3: Use every reference audio" for dubbing. This maximizes consistency in voice, speech rate, and tone. See the demo for the effect.

  3. Transcription of multilingual videos retains only the main language. This is because whisperX uses a single-language model when force-aligning word-level subtitles, so speech in other languages is dropped.

  4. Multi-character separate dubbing is currently unavailable. While whisperX has VAD potential, specific development is needed, and this feature is not yet implemented.
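The hardware recommendation in limitation 1 could be turned into a simple pre-flight check, sketched below in Python. The function name and the idea of gating voice separation this way are assumptions for illustration, not part of VideoLingo. The sketch reads "more than 16GB of RAM and 8GB of VRAM" as strict lower bounds on both values, which is one possible interpretation.

```python
def should_separate_vocals(
    ram_gb: float,
    vram_gb: float,
    min_ram_gb: float = 16.0,   # threshold taken from the limitation text
    min_vram_gb: float = 8.0,   # threshold taken from the limitation text
) -> bool:
    """Return True only if the device exceeds both recommended thresholds.

    Assumption: both comparisons are strict ("more than").
    """
    return ram_gb > min_ram_gb and vram_gb > min_vram_gb
```

A caller would supply the measured memory figures itself; for example, `should_separate_vocals(32, 12)` passes the check, while a machine with exactly 16GB of RAM does not.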

🚗 Roadmap

  • [ ] VAD to distinguish speakers, multi-character dubbing
  • [ ] Customizable translation styles
  • [ ] User terminology glossary
  • [ ] Provide commercial services
  • [ ] Lip sync for dubbed videos

📄 License

This project is licensed under the Apache 2.0 License. When using this project, please follow these rules:

  1. When publishing works, it is recommended (not mandatory) to credit VideoLingo for subtitle generation.
  2. Follow the terms of the large language models and TTS used for proper attribution.
  3. If you copy the code, please include a full copy of the Apache 2.0 License.

We sincerely thank the following open-source projects for their contributions, which provided important support for the development of VideoLingo:

📬 Contact Us

⭐ Star History

If you find VideoLingo helpful, please give us a ⭐️!
