WatermarkRemover-AI

AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly PyQt6 interface.

Stars: 78

WatermarkRemover-AI is an advanced application that utilizes AI models for precise watermark detection and seamless removal. It leverages Florence-2 for watermark identification and LaMA for inpainting. The tool offers both a command-line interface (CLI) and a PyQt6-based graphical user interface (GUI), making it accessible to users of all levels. It supports dual modes for processing images, advanced watermark detection, seamless inpainting, customizable output settings, real-time progress tracking, dark mode support, and efficient GPU acceleration using CUDA.

README:

WatermarkRemover-AI

AI-Powered Watermark Removal Tool using Florence-2 and LaMA Models

Example of watermark removal with LaMa inpainting

Overview

WatermarkRemover-AI is a cutting-edge application that leverages AI models for precise watermark detection and seamless removal. It uses Florence-2 from Microsoft for watermark identification and LaMA for inpainting to fill in the removed regions naturally. The software offers both a command-line interface (CLI) and a PyQt6-based graphical user interface (GUI), making it accessible to both casual and advanced users.

Table of Contents

Features
Technical Overview
Installation
Usage
Upgrade Notes
Alpha Masking
Contributing
License

Features

Dual Modes: Process individual images or entire directories of images.
Advanced Watermark Detection: Utilizes Florence-2's open-vocabulary detection for accurate watermark identification.
Seamless Inpainting: Employs LaMA for high-quality, context-aware inpainting.
Customizable Output:
- Configure maximum bounding box size for watermark detection.
- Set transparency for watermark regions.
- Force specific output formats (PNG, WEBP, JPG).
Progress Tracking: Real-time progress updates in both GUI and CLI modes.
Dark Mode Support: GUI automatically adapts to system dark mode settings.
Efficient Resource Management: Optimized for GPU acceleration using CUDA (optional).

Technical Overview

Florence-2 for Watermark Detection

Florence-2 detects watermarks using open-vocabulary object detection.
Bounding boxes are filtered to ensure that only small regions (configurable by the user) are processed.
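
As a rough illustration of the size filter described above, the sketch below keeps only boxes whose area is at most a given percentage of the image area. The function name `filter_boxes`, the `(x1, y1, x2, y2)` box layout, and the choice to measure size as a share of the image area are assumptions made for this example; the project's own filtering code may differ.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def filter_boxes(
    boxes: List[Box],
    image_width: int,
    image_height: int,
    max_bbox_percent: float = 10.0,
) -> List[Box]:
    """Keep boxes covering at most max_bbox_percent of the image area."""
    image_area = image_width * image_height
    if image_area <= 0:
        raise ValueError("image dimensions must be positive")

    kept = []
    for x1, y1, x2, y2 in boxes:
        # Degenerate boxes (x2 < x1 or y2 < y1) count as zero area.
        box_area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        if box_area / image_area * 100.0 <= max_bbox_percent:
            kept.append((x1, y1, x2, y2))
    return kept


# A small corner logo is kept; a box covering most of the image is dropped.
candidates = [(0, 0, 100, 100), (0, 0, 900, 900)]
small_only = filter_boxes(candidates, 1000, 1000, max_bbox_percent=10.0)
```

A low threshold like this guards against a false detection that would otherwise send most of the picture to the inpainting step.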

LaMA for Inpainting

The LaMA model seamlessly fills in watermark regions with context-aware content.
Supports high-resolution inpainting by using cropping and resizing strategies.
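
The README does not spell out the cropping and resizing strategy, so the following is only one plausible reading: crop a padded window around each detected box, and shrink that window if its longest side exceeds what the inpainting model handles comfortably. The helper names, the 32-pixel padding, and the 512-pixel limit are all hypothetical values chosen for the example.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def padded_crop(box: Box, image_size: Tuple[int, int], padding: int = 32) -> Box:
    """Expand a box by `padding` pixels, clamped to the image bounds."""
    x1, y1, x2, y2 = box
    width, height = image_size
    return (
        max(0, x1 - padding),
        max(0, y1 - padding),
        min(width, x2 + padding),
        min(height, y2 + padding),
    )


def fit_scale(crop: Box, max_side: int = 512) -> float:
    """Return the factor that shrinks the crop so its longest side fits max_side."""
    crop_width = crop[2] - crop[0]
    crop_height = crop[3] - crop[1]
    longest = max(crop_width, crop_height)
    # Never upscale: small crops are processed at their native size.
    return 1.0 if longest <= max_side else max_side / longest


window = padded_crop((10, 10, 50, 50), image_size=(100, 100))
scale = fit_scale((0, 0, 1024, 512))
```

Under this reading, the inpainted crop would be resized back by the inverse factor and pasted into the original image, so pixels outside the window are never resampled.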

PyQt6 GUI

User-friendly interface for selecting input/output paths, configuring settings, and tracking progress.
Dark mode and customization options enhance the user experience.

Installation

Prerequisites

Conda/Miniconda installed.
CUDA (optional for GPU acceleration; the application runs well on CPUs too).

Steps

Clone the Repository:

```
git clone https://github.com/D-Ogi/WatermarkRemover-AI.git
cd WatermarkRemover-AI
```

Run the Setup Script:
```
bash setup.sh
```
The setup.sh script automatically sets up the environment, installs dependencies, and launches the GUI application. It also provides convenient options for CLI usage.
Fast-Track Options:
- To Use the CLI Immediately: After running setup.sh, you can use the CLI directly without activating the environment manually:
```
./setup.sh input_path output_path [options]
```
  Example:
```
./setup.sh ./input_images ./output_images --overwrite --transparent
```
- To Activate the Environment Without Starting the Application: Use:
```
conda activate py312aiwatermark
```

Usage

Preferred Way: Setup Script

Run the Setup Script:
```
bash setup.sh
```
- The GUI will launch automatically, and the environment will be ready for immediate CLI or GUI use.
- For CLI use, run:
```
./setup.sh input_path output_path [options]
```
  Example:
```
./setup.sh ./input_images ./output_images --overwrite --transparent
```

Manual Way

Activate the Environment:
```
conda activate py312aiwatermark
```

Launch GUI or CLI:

GUI:
```
python remwmgui.py
```

CLI:

```
python remwm.py input_path output_path [options]
```

Using the GUI

Launch the GUI: If not launched automatically, start it with:
```
python remwmgui.py
```
Configure Settings:
- Mode: Select "Process Single Image" or "Process Directory".
- Paths: Browse and set the input/output directories.
- Options:
  - Enable overwriting of existing files. This applies to directory processing only; single-image processing always overwrites.
  - Enable transparency for watermark regions.
  - Adjust the maximum bounding box size for watermark detection.
- Output Format: Choose between PNG, WEBP, JPG, or retain the original format.
Start Processing:
- Click "Start" to begin processing.
- Monitor progress and logs in the GUI.
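
The output-format choice described above (force PNG, WEBP, or JPG, or retain the original format) can be pictured as a small path-mapping step. This is a minimal sketch using only `pathlib`; the helper `resolve_output_path` and the extension table are hypothetical and not taken from the project's source.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from the format names in the README to file suffixes.
EXTENSIONS = {"PNG": ".png", "WEBP": ".webp", "JPG": ".jpg"}


def resolve_output_path(
    input_file: str,
    output_dir: str,
    force_format: Optional[str] = None,
) -> Path:
    """Build the output path, keeping the original suffix unless a format is forced."""
    source = Path(input_file)
    if force_format is None:
        suffix = source.suffix  # retain the original format
    else:
        suffix = EXTENSIONS[force_format.upper()]  # KeyError on unknown formats
    return Path(output_dir) / (source.stem + suffix)


kept_format = resolve_output_path("in/photo.jpeg", "out")
forced_format = resolve_output_path("in/photo.jpeg", "out", force_format="webp")
```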

Using the CLI

Basic Command:
```
python remwm.py input_path output_path
```
Options:
- --overwrite: Overwrite existing files.
- --transparent: Make watermark regions transparent instead of removing them.
- --max-bbox-percent: Set the maximum bounding box size for watermark detection (default: 10%).
- --force-format: Force output format (PNG, WEBP, or JPG).

Example:

```
python remwm.py ./input_images ./output_images --overwrite --max-bbox-percent=15 --force-format=PNG
```
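
To make the shape of these options concrete, here is an illustrative `argparse` parser that accepts the same arguments as the example command. It is a sketch for readers who want to script around the CLI, not a copy of the parser in `remwm.py`, which may be built with a different library and may validate values differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the options listed in the README."""
    parser = argparse.ArgumentParser(
        prog="remwm.py",
        description="Remove watermarks from an image or a directory of images.",
    )
    parser.add_argument("input_path", help="Image file or directory to process.")
    parser.add_argument("output_path", help="Where to write the results.")
    parser.add_argument("--overwrite", action="store_true",
                        help="Overwrite existing files.")
    parser.add_argument("--transparent", action="store_true",
                        help="Make watermark regions transparent.")
    parser.add_argument("--max-bbox-percent", type=float, default=10.0,
                        help="Maximum bounding box size in percent.")
    parser.add_argument("--force-format", choices=["PNG", "WEBP", "JPG"],
                        default=None, help="Force the output format.")
    return parser


# Parse the example command from the README (arguments passed explicitly).
args = build_parser().parse_args([
    "./input_images", "./output_images",
    "--overwrite", "--max-bbox-percent=15", "--force-format=PNG",
])
```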

Upgrade Notes

If you have previously used an older version of the repository or set up an incorrect Conda environment, follow these steps to upgrade:

Update the Repository:
```
git pull
```

Remove the Old Environment:

```
conda deactivate
conda env remove -n py312
```

Run the Setup Script:
```
bash setup.sh
```

This will recreate the correct environment (py312aiwatermark) and ensure all dependencies are up-to-date.

Alpha Masking

We implemented alpha masking to allow selective manipulation of watermark regions without altering other parts of the image.

Why Alpha Masking?

Precision: Enables box-targeted watermark removal by isolating specific regions.
Flexibility: By controlling opacity in alpha layers, we can achieve a range of effects, from complete removal to transparency.
Minimal Impact: This method ensures that areas outside the watermark remain untouched, preserving image quality.
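
The idea can be shown without any imaging library. The sketch below represents an image as rows of `(r, g, b, a)` tuples and sets the alpha channel to zero inside each box, leaving every other pixel untouched. The function name and data layout are assumptions for illustration; a real implementation would more likely operate on Pillow images or NumPy arrays.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (r, g, b, a)
Box = Tuple[int, int, int, int]    # (x1, y1, x2, y2), x2 and y2 exclusive


def make_regions_transparent(
    pixels: List[List[Pixel]], boxes: List[Box]
) -> List[List[Pixel]]:
    """Return a copy of the image with alpha set to 0 inside each box."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    result = [list(row) for row in pixels]  # copy so the input stays unchanged

    for x1, y1, x2, y2 in boxes:
        # Clamp the box to the image so out-of-range boxes are harmless.
        for y in range(max(0, y1), min(height, y2)):
            for x in range(max(0, x1), min(width, x2)):
                r, g, b, _ = result[y][x]
                result[y][x] = (r, g, b, 0)
    return result


# A 3x3 opaque red image with a watermark box over the lower-right 2x2 block.
image = [[(255, 0, 0, 255) for _ in range(3)] for _ in range(3)]
masked = make_regions_transparent(image, [(1, 1, 3, 3)])
```

Because only the alpha value changes, the colour data under the mask is preserved, which is what allows the same mask to drive either transparency or full inpainting.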

Contributing

Contributions are welcome! To contribute:

Fork the repository.
Create a new branch for your feature.
Submit a pull request detailing your changes.

License

This project is licensed under the MIT License. See the LICENSE file for details.

For Tasks:

remove watermarks, edit images, enhance photos, automate editing, detect and fill regions

For Jobs:

image editor, graphic designer, photographer, AI engineer, software developer

Alternative AI tools for WatermarkRemover-AI

Similar Open Source Tools

WatermarkRemover-AI

GitHub stars: 78

rkllama

RKLLama is a server and client tool for running and interacting with LLMs optimized for Rockchip RK3588(S) and RK3576 platforms. It runs models on the NPU and offers partial Ollama API compatibility, model pulls from Hugging Face, a documented REST API, dynamic loading and unloading of models, streaming inference requests, simplified model naming, CPU model auto-detection, and an optional debug mode. The tool supports Python 3.8 to 3.12 and has been tested on Orange Pi 5 Pro and Orange Pi 5 Plus with specific OS versions.

GitHub stars: 88

Hacx-GPT

Hacx GPT is a cutting-edge AI tool developed by BlackTechX, inspired by WormGPT, designed to push the boundaries of natural language processing. It is an advanced broken AI model that facilitates seamless and powerful interactions, allowing users to ask questions and perform various tasks. The tool has been rigorously tested on platforms like Kali Linux, Termux, and Ubuntu, offering powerful AI conversations and the ability to do anything the user wants. Users can easily install and run Hacx GPT on their preferred platform to explore its vast capabilities.

GitHub stars: 102

swift-ocr-llm-powered-pdf-to-markdown

Swift OCR is a powerful tool for extracting text from PDF files using OpenAI's GPT-4 Turbo with Vision model. It offers flexible input options, advanced OCR processing, performance optimizations, structured output, robust error handling, and scalable architecture. The tool ensures accurate text extraction, resilience against failures, and efficient handling of multiple requests.

GitHub stars: 219

trendFinder

Trend Finder is a tool designed to help users stay updated on trending topics on social media by collecting and analyzing posts from key influencers. It sends Slack notifications when new trends or product launches are detected, saving time, keeping users informed, and enabling quick responses to emerging opportunities. The tool features AI-powered trend analysis, social media and website monitoring, instant Slack notifications, and scheduled monitoring using cron jobs. Built with Node.js and Express.js, Trend Finder integrates with Together AI, Twitter/X API, Firecrawl, and Slack Webhooks for notifications.

GitHub stars: 2.2k

RealtimeSTT_LLM_TTS

RealtimeSTT is an easy-to-use, low-latency speech-to-text library for realtime applications. It listens to the microphone and transcribes voice into text, making it ideal for voice assistants and applications requiring fast and precise speech-to-text conversion. The library utilizes Voice Activity Detection, Realtime Transcription, and Wake Word Activation features. It supports GPU-accelerated transcription using PyTorch with CUDA support. RealtimeSTT offers various customization options for different parameters to enhance user experience and performance. The library is designed to provide a seamless experience for developers integrating speech-to-text functionality into their applications.

GitHub stars: 276

miner-release

Heurist Miner is a tool that allows users to contribute their GPU for AI inference tasks on the Heurist network. It supports dual mining capabilities for image generation models and Large Language Models, offers flexible setup on Windows or Linux with multiple GPUs, ensures secure rewards through a dual-wallet system, and is fully open source. Users can earn rewards by hosting AI models and supporting applications in the Heurist ecosystem.

GitHub stars: 73

DeepSeekAI

DeepSeekAI is a browser extension plugin that allows users to interact with AI by selecting text on web pages and invoking the DeepSeek large model to provide AI responses. The extension enhances browsing experience by enabling users to get summaries or answers for selected text directly on the webpage. It features context text selection, API key integration, draggable and resizable window, AI streaming replies, Markdown rendering, one-click copy, re-answer option, code copy functionality, language switching, and multi-turn dialogue support. Users can install the extension from Chrome Web Store or Edge Add-ons, or manually clone the repository, install dependencies, and build the extension. Configuration involves entering the DeepSeek API key in the extension popup window to start using the AI-driven responses.

GitHub stars: 203

CrewAI-Studio

CrewAI Studio is an application with a user-friendly interface for interacting with CrewAI, offering support for multiple platforms and various backend providers. It allows users to run crews in the background, export single-page apps, and use custom tools for APIs and file writing. The roadmap includes features like better import/export, human input, chat functionality, automatic crew creation, and multiuser environment support.

GitHub stars: 682

minefield

BitBom Minefield is a tool that uses roaring bit maps to graph Software Bill of Materials (SBOMs) with a focus on speed, air-gapped operation, scalability, and customizability. It is optimized for rapid data processing, operates securely in isolated environments, supports millions of nodes effortlessly, and allows users to extend the project without relying on upstream changes. The tool enables users to manage and explore software dependencies within isolated environments by offline processing and analyzing SBOMs.

GitHub stars: 705

probe

Probe is an AI-friendly, fully local, semantic code search tool designed to power the next generation of AI coding assistants. It combines the speed of ripgrep with the code-aware parsing of tree-sitter to deliver precise results with complete code blocks, making it perfect for large codebases and AI-driven development workflows. Probe is fully local, keeping code on the user's machine without relying on external APIs. It supports multiple languages, offers various search options, and can be used in CLI mode, MCP server mode, AI chat mode, and web interface. The tool is designed to be flexible, fast, and accurate, providing developers and AI models with full context and relevant code blocks for efficient code exploration and understanding.

GitHub stars: 110

MM-RLHF

MM-RLHF is a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. It includes a high-quality MLLM alignment dataset, a Critique-Based MLLM reward model, a novel alignment algorithm MM-DPO, and benchmarks for reward models and multimodal safety. The dataset covers image understanding, video understanding, and safety-related tasks with model-generated responses and human-annotated scores. The reward model generates critiques of candidate texts before assigning scores for enhanced interpretability. MM-DPO is an alignment algorithm that achieves performance gains with simple adjustments to the DPO framework. The project enables consistent performance improvements across 10 dimensions and 27 benchmarks for open-source MLLMs.

GitHub stars: 116

Visionatrix

Visionatrix is a project aimed at providing easy use of ComfyUI workflows. It offers simplified setup and update processes, a minimalistic UI for daily workflow use, stable workflows with versioning and update support, scalability for multiple instances and task workers, multiple user support with integration of different user backends, LLM power for integration with Ollama/Gemini, and seamless integration as a service with backend endpoints and webhook support. The project is approaching version 1.0 release and welcomes new ideas for further implementation.

GitHub stars: 122

word-GPT-Plus

Word GPT Plus seamlessly integrates AI models into Microsoft Word, allowing users to generate, translate, summarize, and polish text directly within their documents. The tool supports multiple AI models, offers built-in templates for various text-related tasks, and provides customization options for user preferences. Users can install the tool through a hosted service, Docker deployment, or self-hosting, and can easily fill in API keys to access different AI services. Word GPT Plus enhances writing workflows by providing AI-powered assistance without leaving the Word environment.

GitHub stars: 768

aiogram-django-template

Aiogram & Django API Template is a robust and secure Django template with advanced features like Docker integration, Celery for asynchronous tasks, Sentry for error tracking, Django Rest Framework for building APIs, and more. It provides scalability options, up-to-date dependencies, and integration with AWS S3 for storage. The template includes configuration guides for secrets, ports, performance tuning, application settings, CORS and CSRF settings, and database configuration. Security, scalability, and monitoring are emphasized for efficient Django API development.

GitHub stars: 142

Vodalus-Expert-LLM-Forge

Vodalus Expert LLM Forge is a tool designed for crafting datasets and efficiently fine-tuning models using free open-source tools. It includes components for data generation, LLM interaction, RAG engine integration, model training, fine-tuning, and quantization. The tool is suitable for users at all levels and is accompanied by comprehensive documentation. Users can generate synthetic data, interact with LLMs, train models, and optimize performance for local execution. The tool provides detailed guides and instructions for setup, usage, and customization.

GitHub stars: 131

For similar tasks

ShortGPT

ShortGPT is a framework for automating content creation, covering video creation, footage sourcing, voiceover synthesis, and editing tasks. It offers an automated editing framework, scripts and prompts, voiceover support in multiple languages, caption generation, asset sourcing, and persistency of editing variables. The tool is designed for YouTube automation and TikTok Creativity Program automation, and offers customization options for efficient and creative content creation.

GitHub stars: 5.5k

WatermarkRemover-AI

GitHub stars: 78

InvokeAI

InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.

GitHub stars: 24.8k

StableSwarmUI

StableSwarmUI is a modular Stable Diffusion web user interface that emphasizes making power tools easily accessible, high performance, and extensible. It is designed to be a one-stop-shop for all things Stable Diffusion, providing a wide range of features and capabilities to enhance the user experience.

GitHub stars: 2.7k

civitai

Civitai is a platform where people can share their stable diffusion models (textual inversions, hypernetworks, aesthetic gradients, VAEs, and any other crazy stuff people do to customize their AI generations), collaborate with others to improve them, and learn from each other's work. The platform allows users to create an account, upload their models, and browse models that have been shared by others. Users can also leave comments and feedback on each other's models to facilitate collaboration and knowledge sharing.

GitHub stars: 6.5k

ap-plugin

AP-PLUGIN is an AI drawing plugin for the Yunzai series robot framework, allowing you to have a convenient AI drawing experience in the input box. It uses the open source Stable Diffusion web UI as the backend, deploys it for free, and generates a variety of images with richer functions.

GitHub stars: 103

ComfyUI-IF_AI_tools

ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation workflow by leveraging the power of language models.

GitHub stars: 610

midjourney-proxy

Midjourney-proxy is a proxy for the Discord channel of MidJourney, enabling API-based calls for AI drawing. It supports Imagine instructions, adding image base64 as a placeholder, Blend and Describe commands, real-time progress tracking, Chinese prompt translation, prompt sensitive word pre-detection, user-token connection to WSS, multi-account configuration, and more. For more advanced features, consider using midjourney-proxy-plus, which includes Shorten, focus shifting, image zooming, local redrawing, nearly all associated button actions, Remix mode, seed value retrieval, account pool persistence, dynamic maintenance, /info and /settings retrieval, account settings configuration, Niji bot robot, InsightFace face replacement robot, and an embedded management dashboard.

GitHub stars: 4.9k

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

GitHub stars: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

GitHub stars: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

GitHub stars: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

GitHub stars: 620

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

GitHub stars: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use case. Here are some use cases for BricksLLM:
- Set LLM usage limits for users on different pricing tiers
- Track LLM usage on a per user and per organization basis
- Block or redact requests containing PII
- Improve LLM reliability with failovers, retries and caching
- Distribute API keys with rate limits and cost limits for internal development/production use cases
- Distribute API keys with rate limits and cost limits for students

GitHub stars: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

GitHub stars: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

GitHub stars: 2.2k
