PanelCleaner
An AI-powered tool to clean manga panels.
Stars: 289
Panel Cleaner is a tool that uses machine learning to find text in images and generate masks to cover it up with high accuracy. It is designed to clean text bubbles without leaving artifacts, avoiding painting over non-text parts, and inpainting bubbles that can't be masked out. The tool offers various customization options, detailed analytics on the cleaning process, supports batch processing, and can run OCR on pages. It supports CUDA acceleration, multiple themes, and can handle bubbles on any solid grayscale background color. Panel Cleaner is aimed at saving time for cleaners by automating monotonous work and providing precise cleaning of text bubbles.
README:
This tool uses machine learning to find text and then generates masks to cover it up with the highest accuracy possible. It is designed to clean easy bubbles; no in-painting or out-of-bubble text removal is done. This is intended to save a lot of monotonous work for people who have to clean a lot of panels, while making sure it doesn't paint over anything that it wasn't supposed to.
Visualized in the top right page:
- Various boxes are drawn where the AI found text.
- (Green) The AI also generates a precise mask where it detected text.
- (Purple) These masks are expanded to cover any nearby text that wasn't detected, as well as jpeg artifacts.
- (Blue) For masks that are a tight fit, the border around the edge of the mask is denoised for final clean-up, without affecting the rest of the image.
The two bottom pages are what the program can output: either just the transparent mask layer and/or the mask applied to the original image, cleaning it.
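The relationship between the two outputs can be sketched as alpha compositing: the mask layer is opaque where text was found and transparent elsewhere, and the cleaned page is that layer placed over the original. The following is a minimal illustration only, assuming a grayscale page stored as nested lists and a mask stored as (gray, alpha) pairs. The function name `apply_mask` and this data layout are hypothetical and do not reflect Panel Cleaner's internals.

```python
# Illustrative only: a page is a list of rows of grayscale values (0-255),
# a mask is a list of rows of (gray, alpha) pairs with alpha in 0-255.

def apply_mask(page, mask):
    """Composite a partially transparent mask layer over a grayscale page."""
    result = []
    for page_row, mask_row in zip(page, mask):
        out_row = []
        for pixel, (gray, alpha) in zip(page_row, mask_row):
            opacity = alpha / 255
            # Standard "over" blending of the mask pixel onto the page pixel.
            out_row.append(round(gray * opacity + pixel * (1 - opacity)))
        result.append(out_row)
    return result


# A white page with a black vertical stroke (the "text") in the middle column.
page = [[255, 0, 255],
        [255, 0, 255]]
# Opaque white over the middle column, fully transparent elsewhere.
mask = [[(255, 0), (255, 255), (255, 0)],
        [(255, 0), (255, 255), (255, 0)]]

cleaned = apply_mask(page, mask)
print(cleaned)
```

Exporting only the mask keeps the original page untouched, so the layer can be adjusted or discarded later in an image editor.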
Features
Limitations
Why Use This Program?
Installation
Usage
Profiles
OCR
Examples
Acknowledgements
License
Roadmap
FAQ (Frequently Asked Questions)
Translating
- Cleans text bubbles without leaving artifacts.
- Avoids painting over parts of the image that aren't text.
- Inpaints bubbles (with LaMa machine learning) that can't simply be masked out.
- Ignores bubbles containing only symbols or numbers, as those don't need translation.
- Offers a GUI for easy use, with dark, light, and system themes.
- No internet connection required after installing the model data.
- Offers a plethora of options to customize the cleaning process and the ability to save multiple presets as profiles. See the default profile for a list of all options.
- Provides detailed analytics on the cleaning process, to see how your settings affect the results.
- Supports CUDA acceleration, if installed as a Python package and your hardware supports it.
- Supports batch processing of images and directories.
- Can handle bubbles on any solid grayscale background color.
- Can also cut out the text from the rest of the image, e.g. to paste it over a colored rendition.
- Can also run OCR on the pages and output the text to a file.
- Lets you review cleaning and OCR output, including editing the OCR output interactively before saving it.
- Interface available in: English, German, Bulgarian, Spanish (see Translating for more languages).
- It only supports Japanese and English text for cleaning (success may vary with other languages), Japanese only for OCR.
- Supported file types: .jpeg, .jpg, .png, .bmp, .tiff, .tif, .jp2, .dib, .webp, .ppm
- Supported file types (export only): .psd
- The program relies on AI for the initial text detection, which by nature is imperfect. Sometimes it will miss little bits of text or think part of the bubble belongs to the text, which will prevent that bubble from being cleaned. From testing, this typically affects between 2–8% of bubbles, depending on your settings.
- Due to the conservative approach taken in the selection of masks, if the program can't clean the bubble to a satisfying degree, it will skip that bubble outright. This does, however, also prevent false positives.
- For masks, only grayscale is currently supported. This means it can cover up text in white, black, or gray bubbles, but not colored ones.
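When preparing a batch, it can help to check the inputs against the supported file types listed above before handing them to the program. The following is a small sketch of such a pre-filter, not part of Panel Cleaner itself. The helper names are hypothetical, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions copied from the list of supported input file types.
# .psd is deliberately excluded because it is export only.
SUPPORTED_EXTENSIONS = {
    ".jpeg", ".jpg", ".png", ".bmp", ".tiff",
    ".tif", ".jp2", ".dib", ".webp", ".ppm",
}


def is_supported_image(path):
    """Return True if the file extension is one of the supported input types."""
    # Assumption: extensions are compared case-insensitively.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_supported_images(directory):
    """Collect supported images directly inside a directory, sorted by name."""
    return sorted(
        entry for entry in Path(directory).iterdir()
        if entry.is_file() and is_supported_image(entry)
    )


print(is_supported_image("page01.png"))
print(is_supported_image("page01.psd"))
```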
This program is designed to precisely and fully clean text bubbles, without leaving any artifacts. Its aim is to save a cleaner's time by taking care of monotonous work. The AI used to detect text and generate the initial mask was not created as part of this project; this project merely uses it as a starting point and improves upon the output.
| Original | AI Output | Panel Cleaner |
|---|---|---|
| ![]() | ![]() | ![]() |
As you can see, with a bit of extra cleanup applied to the AI output, some leftover text and jpeg compression artifacts are removed, and the bubble is fully cleaned.
When fully cleaning a bubble isn't possible, Panel Cleaner will instead skip it so as not to waste your time with a poorly cleaned bit of text. The exact cleaning behavior is highly configurable; see Profiles for more details.
You have the choice between installing a pre-built binary (exe or elf) from the releases section (recommended for most users), or installing it to your local Python interpreter using pip.
Note: All versions will need to download model data on first launch (approx. 500MB). This model data will not need to be downloaded again if Panel Cleaner updates.
Important: The pre-built binaries do not support CUDA acceleration. To use CUDA, you must install the program with pip and ensure you install the appropriate PyTorch version for your system.
The program requires Python 3.10 or newer.
Install the program with both the command line interface and graphical interface using pip from PyPI:
pip install pcleaner
Or if you only wish to use the command line interface:
pip install pcleaner-cli
Note: pcleaner and pcleaner-cli can be installed side by side, but the CLI-only package would be redundant.
Note: The program has been tested to work on Linux, MacOS, and Windows, with varying levels of setup required. See the FAQ for help.
The package in the Arch User Repository (AUR) installs the program in a pipx environment, which allows PyTorch to download the appropriate CUDA version for your system, making this the best method of installation.
You can find the package here: panelcleaner
This will provide the pcleaner and pcleaner-gui commands, along with a desktop file for the GUI.
Install it with your favorite AUR helper, e.g. with yay:
yay -S panelcleaner
The Flatpak version installs the prebuilt binary in a flatpak container, which does not support CUDA acceleration.
Build the image with buildx:
docker buildx build -t pcleaner:v1 .
Or with the legacy builder:
docker image build -t pcleaner:v1 .
Then initialize the docker image, specifying a root folder for the container to access.
In this example, the current directory (pwd) is used:
docker run -it --name pcleaner -v $(pwd):/app pcleaner:v1
This will also start an interactive shell in the container.
You can open another one later on with:
docker start pcleaner
docker exec -it pcleaner bash
The program can be run from the command line and, in the most common use, takes any number of images or directories as input. The program will create a new directory called cleaned in the same directory as the input files, and place the cleaned images and/or masks there. Often, it's more useful to only export the mask layer, and you can do so by adding the --save-only-mask option, or -m for short.
Examples:
pcleaner clean image1.png image2.png image3.png
pcleaner clean -m folder1 image1.png
Demonstration with 46 images, real time, with CUDA acceleration.

There are many more options, which can be seen by running
pcleaner --help
The GUI can be launched from the command line using the gui command:
pcleaner gui
or directly with
pcleaner-gui
If pcleaner cannot be found, ensure it is in your PATH variable, or try
python -m pcleaner
instead.
The program exposes every possible setting in configuration profiles, which are saved as simple text files and can also be accessed using the GUI. Each configuration option is explained inside the file itself, allowing you to optimize each parameter of the cleaning process for your specific needs.
Just generate a new profile with
pcleaner profile new my_profile_name_here
and it will open your new profile for you in a text editor.
Here is a tiny snippet from the default profile, for example:
# Number of pixels to grow the mask by each step.
# This bulks up the outline of the mask, so smaller values will be more accurate but slower.
mask_growth_step_pixels = 2
# Number of steps to grow the mask by.
# A higher number will make more and larger masks, ultimately limited by the reference box size.
mask_growth_steps = 11
Run the cleaner with your specified profile by adding --profile=my_profile_name_here or -p my_profile_name_here to the command.
If you are having trouble seeing how the settings affect the results, you can use the
--cache-masks option to save visualizations of intermediate steps to the cache directory.
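For scripted use, the options described so far can be assembled programmatically. The sketch below only builds the argument list from flags that appear in this document (`--save-only-mask`, `--profile=`, `--cache-masks`). The function name `build_clean_command` is hypothetical, and placing the options before the input paths is an assumption based on the usage examples.

```python
def build_clean_command(inputs, profile=None, only_mask=False, cache_masks=False):
    """Build the argument list for a `pcleaner clean` invocation."""
    command = ["pcleaner", "clean"]
    if only_mask:
        # Export only the mask layer instead of the cleaned image.
        command.append("--save-only-mask")
    if profile is not None:
        command.append(f"--profile={profile}")
    if cache_masks:
        # Save visualizations of intermediate steps to the cache directory.
        command.append("--cache-masks")
    # Any number of images or directories may follow.
    command.extend(str(item) for item in inputs)
    return command


command = build_clean_command(
    ["folder1", "image1.png"],
    profile="my_profile_name_here",
    only_mask=True,
)
print(command)
```

The resulting list can be passed to `subprocess.run(command, check=True)` once pcleaner is installed and available on your PATH.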
| Default Profile | Custom Profile |
|---|---|
| ![]() | ![]() |
| mask_growth_step_pixels = 2 | mask_growth_step_pixels = 4 |
| mask_growth_steps = 11 | mask_growth_steps = 4 |
Additionally, analytics are provided for each processing step in the terminal, so you can see how your settings affect the results as a whole.
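To make the two growth settings more concrete, here is a minimal sketch, assuming that each step grows the mask by a further mask_growth_step_pixels so that the growth is cumulative. The helper names `growth_schedule` and `grow_mask` are hypothetical, and the square-neighborhood dilation is a simple stand-in rather than Panel Cleaner's own mask generation.

```python
def growth_schedule(step_pixels, steps):
    """Cumulative growth in pixels after each step (assumed to be linear)."""
    return [step_pixels * i for i in range(1, steps + 1)]


def grow_mask(mask, pixels):
    """Grow a binary mask (rows of 0/1) by `pixels` in every direction."""
    height, width = len(mask), len(mask[0])
    grown = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Mark the square neighborhood around every set pixel.
                for ny in range(max(0, y - pixels), min(height, y + pixels + 1)):
                    for nx in range(max(0, x - pixels), min(width, x + pixels + 1)):
                        grown[ny][nx] = 1
    return grown


default_profile = growth_schedule(step_pixels=2, steps=11)
custom_profile = growth_schedule(step_pixels=4, steps=4)
print(default_profile[-1], custom_profile[-1])
```

Under this assumption, the default values reach up to 22 pixels of growth in 11 fine steps, while the custom values reach 16 pixels in 4 coarser steps.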
See the default profile for a list of all options.
Note: The default profile is optimized for images roughly 1100x1600 pixels in size. Adjust size parameters accordingly in a profile if you are using images with a significantly lower or higher resolution.
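One way to adjust a pixel-based setting for a different resolution is to scale it by the ratio of page heights. This is only a rough sketch: linear scaling by height is an assumption, the function name `scale_pixel_setting` is hypothetical, and the result should be treated as a starting point to verify against the analytics output.

```python
# Approximate page size (width, height) the default profile is optimized for.
DEFAULT_PAGE_SIZE = (1100, 1600)


def scale_pixel_setting(value, page_size, reference=DEFAULT_PAGE_SIZE):
    """Scale a pixel-based setting by the ratio of page height to reference height."""
    factor = page_size[1] / reference[1]
    # Never go below one pixel, since a zero-sized step would do nothing.
    return max(1, round(value * factor))


# Pages twice as tall as the reference: a step of 2 pixels becomes 4.
print(scale_pixel_setting(2, (2200, 3200)))
```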
Review your settings with a selection of view modes before exporting the cleaned images.

You can also use Panel Cleaner to perform Optical Character Recognition (OCR) on the pages,
and output the text to a file. This could be useful to assist in translation, or to extract
text for analytical purposes.
You can run OCR with:
pcleaner ocr myfolder --output-path=output.txt
This is also available in the GUI, as the OCR output option.
Panel Cleaner handles Japanese OCR with MangaOCR out of the box, and that is the preferred way to OCR Japanese text. If available, Panel Cleaner can also use Tesseract for OCR capabilities, specifically for processing English and Japanese text, the only two languages currently supported. Follow the instructions below to install Tesseract on your system.
- On Windows, download the installer from the official Tesseract GitHub repository. We recommend getting the latest version from UB Mannheim linked there (64 bit).
- Run the installer and follow the on-screen instructions for a system-wide installation.
- Add the Tesseract installation directory to your PATH environment variable. If you did the system-wide installation, this will mean adding the directory C:\Program Files\Tesseract-OCR to your PATH.
- Restart your computer.
Use Homebrew to install Tesseract:
brew install tesseract
For Debian-based distributions, use apt:
sudo apt install tesseract-ocr
For other distributions, refer to your package manager and the official Tesseract documentation.
For detailed installation instructions and additional information, please refer to the official Tesseract documentation.
Note: While Tesseract supports additional languages, Panel Cleaner will only utilize Tesseract for English and Japanese text recognition. English is installed by default. Follow the instructions in "Installing additional language packs" to install the Japanese language pack.
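After installing, you may want to confirm that Tesseract is reachable on your PATH and that the language data is present. The following sketch is independent of Panel Cleaner. It relies on the `tesseract --list-langs` command and the standard Tesseract language codes `eng` and `jpn`, and the helper names are hypothetical.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the path of the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


def installed_languages():
    """List language codes reported by `tesseract --list-langs`."""
    executable = find_tesseract()
    if executable is None:
        return []
    result = subprocess.run(
        [executable, "--list-langs"],
        capture_output=True, text=True, check=False,
    )
    # Some versions print the list to stderr instead of stdout.
    lines = (result.stdout or result.stderr).splitlines()
    # The first line is a header; the remaining lines are language codes.
    return [line.strip() for line in lines[1:] if line.strip()]


languages = installed_languages()
print("Tesseract found:", find_tesseract() is not None)
print("Japanese pack installed:", "jpn" in languages)
```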
Review and edit the OCR output interactively.

| Original | Cleaned |
|---|---|
| ![]() | ![]() |
| ![]() | ![]() |
| ![]() | ![]() |
| ![]() | ![]() |
| ![]() | ![]() |
| ![]() | ![]() |
| ![]() | ![]() |
- Comic Text Detector for finding text bubbles and generating the initial mask.
- Manga OCR for detecting which bubbles only contain symbols or numbers, and performing the dedicated OCR command.
- Simple Lama Inpainting for inpainting bubbles that can't be masked out, using the fine-tuned model by dreMaz.
This project is licensed under the GNU General Public License v3.0 – see the LICENSE file for details.
- Currently no new features are planned.