Best AI tools for< cleaning >
20 - AI tool Sites
Roundtable
Roundtable is an AI-assisted data cleaning tool that helps users clean survey data efficiently. It offers an easy-to-integrate API for cleaning open-ended survey responses, saving up to 70% of time. The tool is trusted by leaders and innovators in various industries for improving data quality efforts and providing reliable human-generated insights.
Audo Studio
Audo Studio is an online audio cleaning tool that uses artificial intelligence to remove background noise, reduce echoes, and adjust volume levels. It is designed to be easy to use, with a simple one-click interface. Audo Studio is suitable for a variety of users, including podcasters, YouTubers, and anyone else who needs to clean up their audio recordings.
CloudMinds
CloudMinds is a world-leading creator, producer, and operator of cloud robot systems and services. Founded in 2015, CloudMinds has created a unique Cloud Robot Architecture based on the vision of “Cloud AI Connecting To The Future”, and launched the HARIX Cloud AI Robot Operating System and end-to-end commercial services in 2017. With the mission of “Operating Smart Robots for People”, the company aims to lead the cutting-edge technologies development for building a humanoid robot for enterprise and families. Our cloud robots will help people to do dull, dirty, dangerous or demeaning (4D) work, and thus making people’s lives more enjoyable.
Meow Apps
Meow Apps is a collection of powerful WordPress plugins designed to supercharge websites with AI capabilities, optimization features, and more. Created by Jordy Meow, a software engineer and photographer based in Tokyo, the plugins aim to enhance productivity and user experience on WordPress platforms. With a focus on optimization, imagery, and AI integration, Meow Apps offers a range of tools to elevate content, automate social posts, clean databases, manage media files, and add AI features like chatbots and content generation. The plugins are known for their friendly user interface, extensive features, and support for databases of all sizes. Meow Apps strives for perfection by providing high-quality tools that can transform the WordPress experience for users.
FireCut
FireCut is an AI-powered video editing tool that seamlessly integrates with Adobe Premiere Pro. It automates time-consuming tasks such as cutting silences, adding captions, creating chapters, and more, allowing video creators to save significant time and effort. FireCut's intuitive interface and advanced AI algorithms make it accessible to users of all skill levels, empowering them to produce professional-quality videos with ease.
Dobb-E
Dobb-E is an open-source, general framework for learning household robotic manipulation. It is an affordable yet versatile general-purpose system for learning robotic manipulation within household settings. Dobb-E can learn a new task with only five minutes of a user showing it how to do it, thanks to a demonstration collection tool ("The Stick") built out of cheap parts and iPhones. Dobb-E has been tested in 10 homes, with a total of 109 tasks in different environments, and finally achieved a success rate of 81%.
DataCamp
DataCamp is an online learning platform that offers courses in data science, AI, and machine learning. The platform provides interactive exercises, short videos, and hands-on projects to help learners develop the skills they need to succeed in the field. DataCamp also offers a variety of resources for businesses, including team training, custom content development, and data science consulting.
Mito
Mito is a low-code data app infrastructure that allows users to edit spreadsheets and automatically generate Python code. It is designed to help analysts automate their repetitive Excel work and take automation into their own hands. Mito is a Jupyter extension and Streamlit component, so users don't need to set up any new infrastructure. It is easy to get started with Mito, simply install it using pip and start using it in Jupyter or Streamlit.
MonkeyLearn
MonkeyLearn is a text analytics tool that simplifies the process of cleaning, labeling, and visualizing customer feedback in one place. It is powered by cutting-edge Artificial Intelligence and is designed to cater to the needs of data-driven companies. With features like instant data visualizations, detailed insights, pre-built and custom machine learning models, and business templates, MonkeyLearn offers a comprehensive text analysis and data visualization studio for users to gain valuable insights from their data effortlessly.
MailEcho
MailEcho is an AI-powered email inbox filtering and cleaning service that helps users keep their inboxes free of promotional and sales emails. It uses AI to monitor your email inbox and automatically archives all promotional and sales emails. This keeps your inbox clean and ensures you never miss an important email.
Lettria
Lettria is a no-code AI platform for text that helps users turn unstructured text data into structured knowledge. It combines the best of Large Language Models (LLMs) and symbolic AI to overcome current limitations in knowledge extraction. Lettria offers a suite of APIs for text cleaning, text mining, text classification, and prompt engineering. It also provides a Knowledge Studio for building knowledge graphs and private GPT models. Lettria is trusted by large organizations such as AP-HP and Leroy Merlin to improve their data analysis and decision-making processes.
Firecrawl
Firecrawl is an advanced web crawling and data conversion tool designed to transform any website into clean, LLM-ready markdown. It automates the collection, cleaning, and formatting of web data, streamlining the preparation process for Large Language Model (LLM) applications. Firecrawl is best suited for business websites, documentation, and help centers, offering features like crawling all accessible subpages, handling dynamic content, converting data into well-formatted markdown, and more. It is built by LLM engineers for LLM engineers, providing clean data the way users want it.
Airscale
Airscale is a lead generation tool that helps businesses find, enrich, and export leads from various sources. It offers a range of features including lead scraping, data enrichment, AI-powered content generation, and data cleaning. Airscale integrates with popular CRMs and outbound tools, making it easy for businesses to manage their lead generation process.
maya.ai
Crayon Data's maya.ai platform is an AI-led revenue acceleration platform for enterprises. It helps businesses unlock the value of data to increase customer engagement and revenue. The platform offers a range of capabilities, including data cleaning and enrichment, personalized recommendations, and plug-and-play APIs. Maya.ai has been used by leading global enterprises to achieve significant results, including increased revenue, improved customer engagement, and reduced time to market.
dataset.macgence
dataset.macgence is an AI-powered data analysis tool that helps users extract valuable insights from their datasets. It offers a user-friendly interface for uploading, cleaning, and analyzing data, making it suitable for both beginners and experienced data analysts. With advanced algorithms and visualization capabilities, dataset.macgence enables users to uncover patterns, trends, and correlations in their data, leading to informed decision-making. Whether you're a business professional, researcher, or student, dataset.macgence can streamline your data analysis process and enhance your data-driven strategies.
Clay
Clay is an AI-powered data enrichment and outreach automation tool designed to help go-to-market teams scale personalized outbound campaigns. It combines 75+ data enrichment tools, AI capabilities, and automation features to streamline lead generation, data cleaning, and personalized messaging. With access to 50+ data providers, Clay offers comprehensive coverage of information and enables users to connect, enrich, and sync their CRM data effortlessly. The platform also features AI web scraping, personalized email building, automated inbound and outbound processes, and data formatting functionalities.
Kanaries
Kanaries is an augmented analytics platform that uses AI to automate the process of data exploration and visualization. It offers a variety of features to help users quickly and easily find insights in their data, including: * **RATH:** An AI-powered engine that can automatically generate insights and recommendations based on your data. * **Graphic Walker:** A visual analytics tool that allows you to explore your data in a variety of ways, including charts, graphs, and maps. * **Data Painter:** A data cleaning and transformation tool that makes it easy to prepare your data for analysis. * **Causal Analysis:** A tool that helps you identify and understand the causal relationships between variables in your data. Kanaries is designed to be easy to use, even for users with no prior experience with data analysis. It is also highly scalable, so it can be used to analyze large datasets. Kanaries is a valuable tool for anyone who wants to quickly and easily find insights in their data. It can be used by businesses of all sizes, and it is particularly well-suited for organizations that are looking to improve their data-driven decision-making.
Displayr
Displayr is a comprehensive data workspace designed for teams, offering a range of capabilities including survey analysis, data visualization, dashboarding, automatic updating, PowerPoint reporting, finding data stories, and data cleaning. The platform aims to streamline workflow efficiency, promote self-sufficiency through DIY analytics, enable data storytelling with compelling narratives, and ensure quality control to minimize errors. Displayr caters to statisticians, market researchers, report creators, and professionals working with data, providing a user-friendly interface for creating interactive and insightful data stories.
ChartPixel
ChartPixel is an AI-powered data analysis tool that helps users visualize and understand their data in seconds. With ChartPixel, you can upload your data file or questionnaire and let the AI auto-generate a gallery of charts with insightfully written AI-assisted annotations and statistics. This makes it easy to explore the quirks and patterns in your data, even if you're a data novice. ChartPixel also offers a variety of features to help you save time and effort, such as data cleaning, formatting charts, extracting trends and insights, and presenting and collaborating. With ChartPixel, you can get data-driven insights in seconds, without the need for steep learning curve.
Altered Studio
Altered Studio is a Voice Content Creation platform that provides exclusive access to our unique Speech-To-Speech Voice Morphing and integrates various Voice AI technologies into a single user friendly application for media production.
20 - Open Source Tools
PanelCleaner
Panel Cleaner is a tool that uses machine learning to find text in images and generate masks to cover it up with high accuracy. It is designed to clean text bubbles without leaving artifacts, avoiding painting over non-text parts, and inpainting bubbles that can't be masked out. The tool offers various customization options, detailed analytics on the cleaning process, supports batch processing, and can run OCR on pages. It supports CUDA acceleration, multiple themes, and can handle bubbles on any solid grayscale background color. Panel Cleaner is aimed at saving time for cleaners by automating monotonous work and providing precise cleaning of text bubbles.
Chinese-Tiny-LLM
Chinese-Tiny-LLM is a repository containing procedures for cleaning Chinese web corpora and pre-training code. It introduces CT-LLM, a 2B parameter language model focused on the Chinese language. The model primarily uses Chinese data from a 1,200 billion token corpus, showing excellent performance in Chinese language tasks. The repository includes tools for filtering, deduplication, and pre-training, aiming to encourage further research and innovation in language model development.
uBlockOrigin-HUGE-AI-Blocklist
A huge blocklist of sites containing AI generated content (~950 sites) for cleaning image search engines with uBlock Origin or uBlacklist. Includes hosts file for pi-hole/adguard. Provides instructions for importing blocklists and additional lists for specific content. Allows users to create allowlists and customize filtering based on keywords. Offers tips and tricks for advanced filtering and comparison between uBlock Origin and uBlacklist implementations.
sycamore
Sycamore is a conversational search and analytics platform for complex unstructured data, such as documents, presentations, transcripts, embedded tables, and internal knowledge repositories. It retrieves and synthesizes high-quality answers through bringing AI to data preparation, indexing, and retrieval. Sycamore makes it easy to prepare unstructured data for search and analytics, providing a toolkit for data cleaning, information extraction, enrichment, summarization, and generation of vector embeddings that encapsulate the semantics of data. Sycamore uses your choice of generative AI models to make these operations simple and effective, and it enables quick experimentation and iteration. Additionally, Sycamore uses OpenSearch for indexing, enabling hybrid (vector + keyword) search, retrieval-augmented generation (RAG) pipelining, filtering, analytical functions, conversational memory, and other features to improve information retrieval.
glm-free-api
GLM AI Free 服务 provides high-speed streaming output, multi-turn dialogue support, intelligent agent dialogue support, AI drawing support, online search support, long document interpretation support, image parsing support. It offers zero-configuration deployment, multi-token support, and automatic session trace cleaning. It is fully compatible with the ChatGPT interface. The repository also includes six other free APIs for various services like Moonshot AI, StepChat, Qwen, Metaso, Spark, and Emohaa. The tool supports tasks such as chat completions, AI drawing, document interpretation, image parsing, and refresh token survival check.
spark-free-api
Spark AI Free 服务 provides high-speed streaming output, multi-turn dialogue support, AI drawing support, long document interpretation, and image parsing. It offers zero-configuration deployment, multi-token support, and automatic session trace cleaning. It is fully compatible with the ChatGPT interface. The repository includes multiple free-api projects for various AI services. Users can access the API for tasks such as chat completions, AI drawing, document interpretation, image analysis, and ssoSessionId live checking. The project also provides guidelines for deployment using Docker, Docker-compose, Render, Vercel, and native deployment methods. It recommends using custom clients for faster and simpler access to the free-api series projects.
qwen-free-api
Qwen AI Free service supports high-speed streaming output, multi-turn dialogue, watermark-free AI drawing, long document interpretation, image parsing, zero-configuration deployment, multi-token support, automatic session trace cleaning. It is fully compatible with the ChatGPT interface. The repository provides various free APIs for different AI services. Users can access the service through different deployment methods like Docker, Docker-compose, Render, Vercel, and native deployment. It offers interfaces for chat completions, AI drawing, document interpretation, image parsing, and token checking. Users need to provide 'login_tongyi_ticket' for authorization. The project emphasizes research, learning, and personal use only, discouraging commercial use to avoid service pressure on the official platform.
step-free-api
The StepChat Free service provides high-speed streaming output, multi-turn dialogue support, online search support, long document interpretation, and image parsing. It offers zero-configuration deployment, multi-token support, and automatic session trace cleaning. It is fully compatible with the ChatGPT interface. Additionally, it provides seven other free APIs for various services. The repository includes a disclaimer about using reverse APIs and encourages users to avoid commercial use to prevent service pressure on the official platform. It offers online testing links, showcases different demos, and provides deployment guides for Docker, Docker-compose, Render, Vercel, and native deployments. The repository also includes information on using multiple accounts, optimizing Nginx reverse proxy, and checking the liveliness of refresh tokens.
AI-Lossless-Zoomer
AI-Lossless-Zoomer is a tool that utilizes the Real-ESRGAN model provided by Tencent ARC Lab to enhance images, particularly portraits and anime pictures, with fast processing. It supports multi-thread processing, batch image processing, customizable options, output formats, output paths, AI engine selection, and batch cleaning tasks. The tool is designed for Windows 7 or later with .NET Framework 4.6+. Users can choose between the installable version (.exe) and the portable version (.zip) that includes the latest AI engine. The tool is efficient for enlarging images while maintaining quality.
gollama
Gollama is a tool designed for managing Ollama models through a Text User Interface (TUI). Users can list, inspect, delete, copy, and push Ollama models, as well as link them to LM Studio. The application offers interactive model selection, sorting by various criteria, and actions using hotkeys. It provides features like sorting and filtering capabilities, displaying model metadata, model linking, copying, pushing, and more. Gollama aims to be user-friendly and useful for managing models, especially for cleaning up old models.
LakeSoul
LakeSoul is a cloud-native Lakehouse framework that supports scalable metadata management, ACID transactions, efficient and flexible upsert operation, schema evolution, and unified streaming & batch processing. It supports multiple computing engines like Spark, Flink, Presto, and PyTorch, and computing modes such as batch, stream, MPP, and AI. LakeSoul scales metadata management and achieves ACID control by using PostgreSQL. It provides features like automatic compaction, table lifecycle maintenance, redundant data cleaning, and permission isolation for metadata.
data-juicer
Data-Juicer is a one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs. It is a systematic & reusable library of 80+ core OPs, 20+ reusable config recipes, and 20+ feature-rich dedicated toolkits, designed to function independently of specific LLM datasets and processing pipelines. Data-Juicer allows detailed data analyses with an automated report generation feature for a deeper understanding of your dataset. Coupled with multi-dimension automatic evaluation capabilities, it supports a timely feedback loop at multiple stages in the LLM development process. Data-Juicer offers tens of pre-built data processing recipes for pre-training, fine-tuning, en, zh, and more scenarios. It provides a speedy data processing pipeline requiring less memory and CPU usage, optimized for maximum productivity. Data-Juicer is flexible & extensible, accommodating most types of data formats and allowing flexible combinations of OPs. It is designed for simplicity, with comprehensive documentation, easy start guides and demo configs, and intuitive configuration with simple adding/removing OPs from existing configs.
llm-search
pyLLMSearch is an advanced RAG system that offers a convenient question-answering system with a simple YAML-based configuration. It enables interaction with multiple collections of local documents, with improvements in document parsing, hybrid search, chat history, deep linking, re-ranking, customizable embeddings, and more. The package is designed to work with custom Large Language Models (LLMs) from OpenAI or installed locally. It supports various document formats, incremental embedding updates, dense and sparse embeddings, multiple embedding models, 'Retrieve and Re-rank' strategy, HyDE (Hypothetical Document Embeddings), multi-querying, chat history, and interaction with embedded documents using different models. It also offers simple CLI and web interfaces, deep linking, offline response saving, and an experimental API.
Cradle
The Cradle project is a framework designed for General Computer Control (GCC), empowering foundation agents to excel in various computer tasks through strong reasoning abilities, self-improvement, and skill curation. It provides a standardized environment with minimal requirements, constantly evolving to support more games and software. The repository includes released versions, publications, and relevant assets.
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
DB-GPT
DB-GPT is an open source AI native data app development framework with AWEL(Agentic Workflow Expression Language) and agents. It aims to build infrastructure in the field of large models, through the development of multiple technical capabilities such as multi-model management (SMMF), Text2SQL effect optimization, RAG framework and optimization, Multi-Agents framework collaboration, AWEL (agent workflow orchestration), etc. Which makes large model applications with data simpler and more convenient.
haystack
Haystack is an end-to-end LLM framework that allows you to build applications powered by LLMs, Transformer models, vector search and more. Whether you want to perform retrieval-augmented generation (RAG), document search, question answering or answer generation, Haystack can orchestrate state-of-the-art embedding models and LLMs into pipelines to build end-to-end NLP applications and solve your use case.
RLHF-Reward-Modeling
This repository contains code for training reward models for Deep Reinforcement Learning-based Reward-modulated Hierarchical Fine-tuning (DRL-based RLHF), Iterative Selection Fine-tuning (Rejection sampling fine-tuning), and iterative Decision Policy Optimization (DPO). The reward models are trained using a Bradley-Terry model based on the Gemma and Mistral language models. The resulting reward models achieve state-of-the-art performance on the RewardBench leaderboard for reward models with base models of up to 13B parameters.
prompt-generator-comfyui
Custom AI prompt generator node for ComfyUI. With this node, you can use text generation models to generate prompts. Before using, text generation model has to be trained with prompt dataset.
fastllm
A collection of LLM services you can self host via docker or modal labs to support your applications development. The goal is to provide docker containers or modal labs deployments of common patterns when using LLMs and endpoints to integrate easily with existing codebases using the openai api. It supports GPT4all's embedding api, JSONFormer api for chat completion, Cross Encoders based on sentence transformers, and provides documentation using MkDocs.
17 - OpenAI Gpts
Cleaning Genius
👌 AI-Powered Eco-Friendly Stain Solver 👌 Your smart stain-removing companion for any surface. Say goodbye to tough stains with Clean Genius! 🌱✨
CleanGPT ADHD Cleaning Helper
making you have a fun time and be accountable for a clean space
Cleaning Advisor
A virtual assistant for cleaning and organizing, offering personalized advice and schedules.
CleanBiz Mentor
A mentor for janitorial entrepreneurs offering guidance for scaling cleaning businesses.
HomeSync AI
Your AI home organizer for streamlined cleaning schedules, inventory tracking, and decluttering support, tailored to your household dynamics.
Live Dwell
I teach Home Economics and help with Cooking, Cleaning, and Running a Household.
Squeaky Data Cleaner
Clean and structure your raw data with automatic file output for your Custom GPT knowledge.
La Suegra Limpiadora
Experta en la eliminación de manchas de ropa, sofás y otros tejidos. Te dejaré la ropa "perfesssstaaa"
Quotes from Dharma Master Cheng Yen of Taiwan
Quotes from Master Cheng Yen of Tzu Chi, Taiwan: Cleansing the mind, planting good seeds, awakening the inherent goodness within.
GeoGuide France
Un assistant sur les règlementations débroussaillage et zone naturelles en France