manga-translator-android

手机端的即时自动漫画翻译软件，由LLM驱动。Instant automatic manga translation app for mobile devices, powered by LLM.

Stars: 61

Visit

Manga Translator Android is a mobile application designed to translate text from manga images using optical character recognition (OCR) technology. Users can simply take a picture of the manga panel they want to translate, and the app will extract the text and provide a translation in real-time. This tool is especially useful for manga enthusiasts who want to read manga in languages they are not familiar with, as well as for language learners looking to practice reading in a fun and engaging way. With an intuitive user interface and fast processing speed, Manga Translator Android makes it easy to enjoy manga content in multiple languages on the go.

README:

Manga Translator 📖

面向安卓的漫画翻译 App：本地气泡检测与 OCR，结合 OpenAI 兼容接口完成翻译，并在原图上覆盖显示可拖动的翻译气泡。使用教程：https://github.com/jedzqer/manga-translator/blob/main/Tutorial/简中教程.md

原图	翻译结果	嵌入效果

主要功能 ✨

日译中，英译中
漫画库管理：新建文件夹、批量导入图片、EhViewer 导入
翻译流程：气泡检测 + OCR + LLM 翻译，支持标准模式与全文速译
阅读体验：翻译覆盖层、翻译气泡位置可拖动、阅读进度自动保存
译名表与缓存：按文件夹维护 glossary.json，自动累积固定译名
更新与日志：启动检查更新，翻译期间前台服务与日志查看

快速使用 🚀

在漫画库中新建文件夹并导入图片
确保图片文件名顺序与阅读顺序一致（例如 1.jpg, 2.jpg）
在设置页填写 API 地址、API Key、模型名称（OpenAI 兼容）
回到漫画库，选择文件夹并点击“翻译文件夹”
翻译完成后点击“开始阅读”，在阅读页可拖动气泡位置

全文速译建议：页数较多时分批上传翻译，或在设置中提高 API 超时。

常见问题 ❓

翻译失败或结果为空：确认 API 地址以 /v1 结尾，模型名与供应商一致，且网络可达
翻译顺序错乱：请先对图片按阅读顺序重命名
怎么获取AI：具体获取方法可以去搜索一下

交流

可以进QQ群提问交流：1080302768

Star History

** 喜欢的话可以点个Star哦 **

数据与文件说明 🗂️

漫画库存储：/Android/data/<package>/files/manga_library/
每张图片生成同名 *.json 翻译结果，OCR 缓存为 *.ocr.json
译名表：每个文件夹维护 glossary.json
阅读进度、全文速译开关等存储在 SharedPreferences

从源码构建 🧩

环境要求

JDK 17.0.17+
Kotlin 2.0.0+
Gradle 8.11.1+
Android SDK: platform 35, build-tools 35.0.0

构建命令

./gradlew :app:assembleDebug
./gradlew :app:assembleRelease

模型与资源

将以下模型文件放入 assets/：

comic-speech-bubble-detector.onnx（气泡检测）
encoder_model.onnx、decoder_model.onnx（OCR）
en_PP-OCRv5_rec_mobile_infer.onnx（英文 OCR）
ysgyolo_1.2_OS1.0.onnx（文本补检 + 文字蒙版）
Multilingual_PP-OCRv3_det_infer.onnx（英文行检测）
migan_512.onnx、migan_512.onnx.data（嵌字抹除）

模型下载链接：

气泡检测模型：https://huggingface.co/ogkalu/comic-speech-bubble-detector-yolov8m
OCR 模型：https://huggingface.co/l0wgear/manga-ocr-2025-onnx

提示词与 OCR 配置位于 assets/，名称需与代码保持一致。

发布版本号同步

需同时修改：

app/src/main/java/com/manga/translate/VersionInfo.kt
app/build.gradle.kts
update.json

🙏 致谢

PaddleOCR - 提供 OCR 模型支持
kha-white/manga-ocr - MangaOCR 模型支持
所有用户的支持

For Tasks:

Click tags to check more tools for each tasks

translate text read manga learn languages edit manga develop mobile apps

For Jobs:

translator language teacher manga editor mobile app developer language learner

Alternative AI tools for manga-translator-android

Similar Open Source Tools

manga-translator-android

github

: 61

Verbiverse

Verbiverse is a tool that uses a large language model to assist in reading PDFs and watching videos, aimed at improving language proficiency. It provides a more convenient and efficient way to use large models through predefined prompts, designed for those looking to enhance their language skills. The tool analyzes unfamiliar words and sentences in foreign language PDFs or video subtitles, providing better contextual understanding compared to traditional dictionary translations or ambiguous meanings. It offers features such as automatic loading of subtitles, word analysis by clicking or double-clicking, and a word database for collecting words. Users can run the tool on Windows x86_64 or ubuntu_22.04 x86_64 platforms by downloading the precompiled packages or by cloning the source code and setting up a virtual environment with Python. It is recommended to use a local model or smaller PDF files for testing due to potential token consumption issues with large files.

github

: 96

MaterialSearch

MaterialSearch is a tool for searching local images and videos using natural language. It provides functionalities such as text search for images, image search for images, text search for videos (providing matching video clips), image search for videos (searching for the segment in a video through a screenshot), image-text similarity calculation, and Pexels video search. The tool can be deployed through the source code or Docker image, and it supports GPU acceleration. Users can configure the tool through environment variables or a .env file. The tool is still under development, and configurations may change frequently. Users can report issues or suggest improvements through issues or pull requests.

github

: 1.4k

magic-resume

Magic Resume is a modern online resume editor that makes creating professional resumes simple and fun. Built on Next.js and Framer Motion, it supports real-time preview and custom themes. Features include Next.js 14+ based construction, smooth animation effects (Framer Motion), custom theme support, responsive design, dark mode, export to PDF, real-time preview, auto-save, and local storage. The technology stack includes Next.js 14+, TypeScript, Framer Motion, Tailwind CSS, Shadcn/ui, and Lucide Icons.

github

: 2.9k

Awesome-Lists

Awesome-Lists is a curated list of awesome lists across various domains of computer science and beyond, including programming languages, web development, data science, and more. It provides a comprehensive index of articles, books, courses, open source projects, and other resources. The lists are organized by topic and subtopic, making it easy to find the information you need. Awesome-Lists is a valuable resource for anyone looking to learn more about a particular topic or to stay up-to-date on the latest developments in the field.

github

: 597

Windrecorder

Windrecorder is an open-source tool that helps you retrieve memory cues by recording everything on your screen. It can search based on OCR text or image descriptions and provides a summary of your activities. All of its capabilities run entirely locally, without the need for an internet connection or uploading any data, giving you complete ownership of your data.

github

: 893

aidea

AIdea is an app that integrates mainstream large language models and drawing models, developed using Flutter. The code is completely open-source and supports various functions such as GPT-3.5, GPT-4 from OpenAI, Claude instant, Claude 2.1 from Anthropic, Gemini Pro and visual language models from Google, as well as various Chinese and open-source models. It also supports features like text-to-image, super-resolution, coloring black and white images, artistic fonts, artistic QR codes, and more.

github

: 6.7k

Awesome-Lists-and-CheatSheets

Awesome-Lists is a curated index of selected resources spanning various fields including programming languages and theories, web and frontend development, server-side development and infrastructure, cloud computing and big data, data science and artificial intelligence, product design, etc. It includes articles, books, courses, examples, open-source projects, and more. The repository categorizes resources according to the knowledge system of different domains, aiming to provide valuable and concise material indexes for readers. Users can explore and learn from a wide range of high-quality resources in a systematic way.

github

: 620

yomitoku

YomiToku is a Japanese-focused AI document image analysis engine that provides full-text OCR and layout analysis capabilities for images. It recognizes, extracts, and converts text information and figures in images. It includes 4 AI models trained on Japanese datasets for tasks such as detecting text positions, recognizing text strings, analyzing layouts, and recognizing table structures. The models are specialized for Japanese document images, supporting recognition of over 7000 Japanese characters and analyzing layout structures specific to Japanese documents. It offers features like layout analysis, table structure analysis, and reading order estimation to extract information from document images without disrupting their semantic structure. YomiToku supports various output formats such as HTML, markdown, JSON, and CSV, and can also extract figures, tables, and images from documents. It operates efficiently in GPU environments, enabling fast and effective analysis of document transcriptions without requiring high-end GPUs.

github

: 568

ai-paint-today-BE

AI Paint Today is an API server repository that allows users to record their emotions and daily experiences, and based on that, AI generates a beautiful picture diary of their day. The project includes features such as generating picture diaries from written entries, utilizing DALL-E 2 model for image generation, and deploying on AWS and Cloudflare. The project also follows specific conventions and collaboration strategies for development.

github

: 60

smriti-ai

Smriti AI is an intelligent learning assistant that helps users organize, understand, and retain study materials. It transforms passive content into active learning tools by capturing resources, converting them into summaries and quizzes, providing spaced revision with reminders, tracking progress, and offering a multimodal interface. Suitable for students, self-learners, professionals, educators, and coaching institutes.

github

: 52

GraphGen

GraphGen is a framework for synthetic data generation guided by knowledge graphs. It enhances supervised fine-tuning for large language models (LLMs) by generating synthetic data based on a fine-grained knowledge graph. The tool identifies knowledge gaps in LLMs, prioritizes generating QA pairs targeting high-value knowledge, incorporates multi-hop neighborhood sampling, and employs style-controlled generation to diversify QA data. Users can use LLaMA-Factory and xtuner for fine-tuning LLMs after data generation.

github

: 898

DeepClaude

DeepClaude is an open-source project inspired by the DeepSeek R1 model, aiming to provide the best results in various tasks by combining different models. It supports OpenAI-compatible input and output formats, integrates with DeepSeek and Claude APIs, and offers special support for other OpenAI-compatible models. Users can run the project locally or deploy it on a server to access a powerful language model service. The project also provides guidance on obtaining necessary APIs and running the project, including using Docker for deployment.

github

: 2.3k

celeste-python

Celeste AI is a type-safe, modality/provider-agnostic tool that offers unified interface for various providers like OpenAI, Anthropic, Gemini, Mistral, and more. It supports multiple modalities including text, image, audio, video, and embeddings, with full Pydantic validation and IDE autocomplete. Users can switch providers instantly, ensuring zero lock-in and a lightweight architecture. The tool provides primitives, not frameworks, for clean I/O operations.

github

: 208

FeedCraft

FeedCraft is a powerful tool to process your rss feeds as a middleware. Use it to translate your feed, extract fulltext, emulate browser to render js-heavy page, use llm such as google gemini to generate brief for your rss article, use natural language to filter your rss feed, and more! It is an open-source tool that can be self-deployed and used with any RSS reader. It supports AI-powered processing using Open AI compatible LLMs, custom prompt, saving rules to apply to different RSS sources, portable mode for on-the-go usage, and dock mode for advanced customization of RSS sources and processing parameters.

github

: 66

xiaozhi-esp32

The xiaozhi-esp32 repository is the first hardware project by Xia Ge, focusing on creating an AI chatbot using ESP32, SenseVoice, and Qwen72B. The project aims to help beginners in AI hardware development understand how to apply language models to hardware devices. It supports various functionalities such as Wi-Fi configuration, offline voice wake-up, multilingual speech recognition, voiceprint recognition, TTS using large models, and more. The project encourages participation for learning and improvement, providing resources for hardware and firmware development.

github

: 10.2k

For similar tasks

react-native-airship

React Native Airship is a module designed to integrate Airship's iOS and Android SDKs into React Native applications. It provides developers with the necessary tools to incorporate Airship's push notification services seamlessly. The module offers a simple and efficient way to leverage Airship's features within React Native projects, enhancing user engagement and retention through targeted notifications.

github

: 86

openagents

OpenAgents is a platform for AI agents using open protocols. The current flagship product (v4) is an agentic chat app live at openagents.com. This repository holds the new cross-platform version (v5), with an initial focus on Coder, a desktop app intended to replace Claude Code with standard chat UI & thread history and first-class MCP integration. The v5 tech stack includes React, React Native, TypeScript for frontend, Cloudflare stack for backend, better-auth for authentication, and Vercel AI SDK. The architecture considerations aim for cross-platform code reuse, open protocol interoperability, long-running agent processes, composability, proportional payment to contributors, and agent wallets for Bitcoin/Lightning & stablecoins via Spark wallet.

github

: 212

airsync-android

Android app for AirSync 2.0 built with Kotlin Jetpack Compose. Users can connect using a QR code scan and save the last device for easier re-connection. The app is developed with gratitude to the community, AI research for assistance, and various contributors. It aims to provide a seamless experience for users to manage notifications efficiently.

github

: 122

manga-translator-android

github

: 61

aidoku-zh-sources

Aidoku 中文图源 is a collection of Chinese manga sources for the Aidoku manga reader app. It includes links to over 30 different sources, including popular sites like 139漫画, 无敌漫画, and 巴卡漫画. The sources are organized into a single, easy-to-use list, making it easy to find and read your favorite manga.

github

: 245

gpt-subtrans

GPT-Subtrans is an open-source subtitle translator that utilizes large language models (LLMs) as translation services. It supports translation between any language pairs that the language model supports. Note that GPT-Subtrans requires an active internet connection, as subtitles are sent to the provider's servers for translation, and their privacy policy applies.

github

: 418

WeeaBlind

Weeablind is a program that uses modern AI speech synthesis, diarization, language identification, and voice cloning to dub multi-lingual media and anime. It aims to create a pleasant alternative for folks facing accessibility hurdles such as blindness, dyslexia, learning disabilities, or simply those that don't enjoy reading subtitles. The program relies on state-of-the-art technologies such as ffmpeg, pydub, Coqui TTS, speechbrain, and pyannote.audio to analyze and synthesize speech that stays in-line with the source video file. Users have the option of dubbing every subtitle in the video, setting the start and end times, dubbing only foreign-language content, or full-blown multi-speaker dubbing with speaking rate and volume matching.

github

: 168

Synthalingua

Synthalingua is an advanced, self-hosted tool that leverages artificial intelligence to translate audio from various languages into English in near real time. It offers multilingual outputs and utilizes GPU and CPU resources for optimized performance. Although currently in beta, it is actively developed with regular updates to enhance capabilities. The tool is not intended for professional use but for fun, language learning, and enjoying content at a reasonable pace. Users must ensure speakers speak clearly for accurate translations. It is not a replacement for human translators and users assume their own risk and liability when using the tool.

github

: 176

For similar jobs

gpt-subtrans

github

: 418

WeeaBlind

github

: 168

Chenyme-AAVT

Chenyme-AAVT is a user-friendly tool that provides automatic video and audio recognition and translation. It leverages the capabilities of Whisper, a powerful speech recognition model, to accurately identify speech in videos and audios. The recognized speech is then translated using ChatGPT or KIMI, ensuring high-quality translations. With Chenyme-AAVT, you can quickly generate字幕 files and merge them with the original video, making video translation a breeze. The tool supports various languages, allowing you to translate videos and audios into your desired language. Additionally, Chenyme-AAVT offers features such as VAD (Voice Activity Detection) to enhance recognition accuracy, GPU acceleration for faster processing, and support for multiple字幕 formats. Whether you're a content creator, translator, or anyone looking to make video translation more efficient, Chenyme-AAVT is an invaluable tool.

github

: 1.2k

chatgpt-subtitle-translator

This tool utilizes the OpenAI ChatGPT API to translate text, with a focus on line-based translation, particularly for SRT subtitles. It optimizes token usage by removing SRT overhead and grouping text into batches, allowing for arbitrary length translations without excessive token consumption while maintaining a one-to-one match between line input and output.

github

: 370

anki_packager

anki_packager is an intelligent tool for generating high-quality Anki flashcards for English vocabulary. It integrates multiple curated dictionaries, provides automated learning experiences, supports various features like Google TTS pronunciation and AI models for word summarization and story generation, offers convenient data import from other sources, ensures a good command-line interface, and can be run using Docker. Each flashcard includes detailed learning resources such as definitions, tenses, AI-generated roots for mnemonic aids, phrases, example sentences, word differentiations, and English explanations with AI-generated stories.

github

: 107

manga-translator-android

github

: 61

chatgpt-web

ChatGPT Web is a web application that provides access to the ChatGPT API. It offers two non-official methods to interact with ChatGPT: through the ChatGPTAPI (using the `gpt-3.5-turbo-0301` model) or through the ChatGPTUnofficialProxyAPI (using a web access token). The ChatGPTAPI method is more reliable but requires an OpenAI API key, while the ChatGPTUnofficialProxyAPI method is free but less reliable. The application includes features such as user registration and login, synchronization of conversation history, customization of API keys and sensitive words, and management of users and keys. It also provides a user interface for interacting with ChatGPT and supports multiple languages and themes.

github

: 1.4k

ChatterUI

ChatterUI is a mobile app that allows users to manage chat files and character cards, and to interact with Large Language Models (LLMs). It supports multiple backends, including local, koboldcpp, text-generation-webui, Generic Text Completions, AI Horde, Mancer, Open Router, and OpenAI. ChatterUI provides a mobile-friendly interface for interacting with LLMs, making it easy to use them for a variety of tasks, such as generating text, translating languages, writing code, and answering questions.

github

: 1.8k