Best AI tools for< Customize Extraction >
20 - AI tool Sites

ChartAI
ChartAI is an AI tool that allows users to customize bar charts from screenshots. Users can upload their chart images, and ChartAI's visual recognition technology extracts the data to recreate the chart. Users can adjust the style, add or remove data, and change the chart type effortlessly. The tool simplifies the process of creating charts and provides quick updates based on user requests.

Pipeless Agents
Pipeless Agents is a platform that allows users to convert any video feed into an actionable data stream, enabling automation of tasks based on visual inputs. It serves as a serverless platform for Vision AI, offering the ability to create projects, connect video sources, and customize agents for specific needs. With a focus on simplicity and efficiency, Pipeless Agents empowers users to extract structured data from various video sources and automate processes with minimal coding requirements.

Otto
Otto is an AI-powered tool designed to streamline work processes by bringing reasoning to data. It allows users to define tables once and automate numerous tasks in minutes. With features like research capabilities, outbound message creation, and customizable columns, Otto enables users to work 10x faster by leveraging AI agents for parallel processing. The tool unlocks insights from various data sources, including websites, documents, and images, and offers an AI Assistant for contextual assistance. Otto aims to enhance productivity and efficiency by providing advanced data analysis and processing functionalities.

Jamorphosia
Jamorphosia is an AI-powered application that allows users to remove instruments from a song. With advanced technology and audio separation capabilities, users can easily extract vocals, isolate specific instruments, and create custom backing tracks. The tool transforms audio files into personalized songs, suitable for practice or performance. Jamorphosia enhances the music experience by providing a platform for musicians to engage with original tracks in a more immersive way.

Picture To Summary AI
Picture To Summary AI is an online tool that leverages cutting-edge AI technology to provide summaries from images or pictures. Users can upload images and receive concise and accurate summaries generated by AI, extract text from images, generate captions for social media posts, and customize prompts to tailor descriptions. The tool aims to simplify communication and understanding of image content through AI-driven analysis.

Picture To Summary AI
Picture To Summary AI is a powerful online tool that leverages cutting-edge AI technology to analyze images and generate insightful summaries or descriptions. Users can upload images and receive concise and accurate summaries, extract text from images, generate captions for social media posts, and customize prompts to tailor the output. The application aims to simplify communication and understanding by providing quick and efficient image analysis solutions.

Procys
Procys is a document processing platform powered by AI that offers automated document processing solutions. It provides features such as a self-learning engine, seamless integration with ERP systems, OCR API powered by AI, customized data extraction, and AI autosplit for automatic document splitting. Procys helps with tasks like invoice OCR, ID card OCR, receipt OCR, and account payable automation. The platform aims to streamline document workflows, eliminate manual processes, save time, reduce errors, and ensure compliance for businesses.

Lettria
Lettria is a no-code AI platform for text that helps users turn unstructured text data into structured knowledge. It combines the best of Large Language Models (LLMs) and symbolic AI to overcome current limitations in knowledge extraction. Lettria offers a suite of APIs for text cleaning, text mining, text classification, and prompt engineering. It also provides a Knowledge Studio for building knowledge graphs and private GPT models. Lettria is trusted by large organizations such as AP-HP and Leroy Merlin to improve their data analysis and decision-making processes.

InstantAPI.ai
InstantAPI.ai is an AI-powered web scraping tool that allows developers, data scientists, and SEO specialists to instantly turn any web page into a personalized API. With the ability to effortlessly scrape, customize, and integrate data, users can enhance their projects, drive insights, and optimize performance. The tool offers features such as scraping precise data, transforming information into various formats, generating new content, providing advanced analysis, and extracting valuable insights from data. Users can tailor the output to meet specific needs and unleash creativity by using AI for unique purposes. InstantAPI.ai simplifies the process of web scraping and data manipulation, offering a seamless experience for users seeking to leverage AI technology for their projects.

Reedr
Reedr is an AI-powered browser automation tool that simplifies scraping at scale. It offers features such as text recognition (OCR), custom headers, CAPTCHA solver, and proxying for efficient data extraction. With Reedr, users can automate tasks, generate reports, and monitor running tasks in real-time. The tool utilizes AI capabilities to convert visible text and images on web pages into formatted data, supporting various data processing needs. Additionally, Reedr provides customized real-time reporting with API endpoints for different reporting teams, enabling data export in formats like CSV, XLSX, JSON, and YAML. The tool prioritizes industry-leading compliance, adhering to data protection laws and privacy regulations like GDPR.

SkillOk
SkillOk is an AI-powered resume builder that helps users create tailored resumes for each job application. It automates the process of customizing resumes by extracting skills from job descriptions, fine-tuning content based on targeted questions, and generating customized intros. The tool ensures that resumes are relevant to job requirements, optimized for ATS software, and aligned with industry best practices. Users can also fully customize their resumes using the drag-n-drop builder and export them in various formats. SkillOk aims to increase users' chances of getting interviews and landing their dream jobs with confidence.

Painboard
Painboard is a customer feedback analysis tool designed to help businesses uncover and prioritize customer pain points. It automatically summarizes, groups, and sorts customer feedback, allowing users to focus on the most important aspects. With features like original quote display, customizable grouping, and Zapier support, Painboard streamlines the process of extracting valuable insights from customer reviews. By simplifying the task of analyzing large volumes of feedback, Painboard enables businesses to make informed decisions and improve customer satisfaction.

InstantAPI.ai
InstantAPI.ai is a powerful web scraping API and Chrome extension that allows users to extract data from any website with ease. The tool leverages AI technology to automate data extraction, adapt to site changes, and deliver customized JSON objects. With features like worldwide geotargeting, proxy management, JavaScript rendering, and CAPTCHA bypass, InstantAPI.ai ensures fast and reliable results. Users can describe the data they need and receive it in real-time, tailored to their exact requirements. The tool offers unlimited concurrency, human support, and a user-friendly interface, making web scraping simple and efficient.

FlexClip
FlexClip is a powerful yet easy-to-use online video editing tool. With its extensive templates and resources, you can easily create high-quality videos for personal or business purposes without any learning curve.

SnapSite
SnapSite is an AI-powered website service that allows users to customize their website effortlessly. With its flat-rate all-in-one solution, there's no need for design, development, or marketing expertise. Users can simply send their request in natural language and SnapSite will deliver a stunning, highly functional website tailored to their specific needs.

My Hacker News
My Hacker News is an AI-powered platform that offers a personalized daily dose of Hacker News through a customized newsletter. The platform utilizes AI algorithms, including Claude3.5 Sonnet and GPT-4o, to semantically index HN stories and comments daily, finding new stories matching users' interests and reranking them. Users receive a tailored newsletter directly in their inbox, saving time and keeping them informed. The platform allows users to shape their digest and offers a free digest email service without the need for sign up.

Triggs AI
Triggs AI is an advanced AI tool designed to provide users with a seamless experience in customizing scrollbars. With its intuitive interface and powerful algorithms, Triggs AI allows users to personalize scrollbar designs effortlessly. Whether you are a web developer, designer, or enthusiast, Triggs AI offers a range of customization options to enhance the visual appeal of your websites. Say goodbye to standard scrollbars and hello to a new level of creativity with Triggs AI.

OffRobe
OffRobe is a powerful NSFW AI image generator and editor that allows users to create and edit realistic and high-quality NSFW images with ease. The platform is designed with strong privacy and security measures to ensure that users' activities and data remain confidential. OffRobe's AI models offer a high level of customization, enabling users to bring their unique fantasies to life. With its user-friendly interface and advanced features, OffRobe is the perfect tool for anyone looking to explore the world of NSFW AI art.

Oda Moodboard
Oda Moodboard is an AI-powered tool that allows users to create mood boards for their home decor. With Oda Moodboard, users can choose their style and color preferences, and the AI will generate a mood board with matching images. Users can also upload their own photos to create a mood board. Oda Moodboard is a great way to get inspiration for your home decor, and it can help you to create a cohesive look for your home.

VERSA
VERSA is a text-based adventure game that allows users to choose their own adventure and customize their companion. Users can choose from a variety of settings, including sci-fi, wild-west, horror, drama, war, university, or fantasy. They can also choose a male, female, or non-binary companion to be their friend, romance, or enemy. VERSA is designed to push the limits of what's possible with a 1-gem model, while keeping it as entertaining as possible.
20 - Open Source AI Tools

ExtractThinker
ExtractThinker is a library designed for extracting data from files and documents using Language Model Models (LLMs). It offers ORM-style interaction between files and LLMs, supporting multiple document loaders such as Tesseract OCR, Azure Form Recognizer, AWS TextExtract, and Google Document AI. Users can customize extraction using contract definitions, process documents asynchronously, handle various document formats efficiently, and split and process documents. The project is inspired by the LangChain ecosystem and focuses on Intelligent Document Processing (IDP) using LLMs to achieve high accuracy in document extraction tasks.

panda-etl
PandaETL is an open-source, no-code ETL tool designed to extract and parse data from various document types including PDFs, emails, websites, audio files, and more. With an intuitive interface and powerful backend, PandaETL simplifies the process of data extraction and transformation, making it accessible to users without programming skills.

OneKE
OneKE is a flexible dockerized system for schema-guided knowledge extraction, capable of extracting information from the web and raw PDF books across multiple domains like science and news. It employs a collaborative multi-agent approach and includes a user-customizable knowledge base to enable tailored extraction. OneKE offers various IE tasks support, data sources support, LLMs support, extraction method support, and knowledge base configuration. Users can start with examples using YAML, Python, or Web UI, and perform tasks like Named Entity Recognition, Relation Extraction, Event Extraction, Triple Extraction, and Open Domain IE. The tool supports different source formats like Plain Text, HTML, PDF, Word, TXT, and JSON files. Users can choose from various extraction models like OpenAI, DeepSeek, LLaMA, Qwen, ChatGLM, MiniCPM, and OneKE for information extraction tasks. Extraction methods include Schema Agent, Extraction Agent, and Reflection Agent. The tool also provides support for schema repository and case repository management, along with solutions for network issues. Contributors to the project include Ningyu Zhang, Haofen Wang, Yujie Luo, Xiangyuan Ru, Kangwei Liu, Lin Yuan, Mengshu Sun, Lei Liang, Zhiqiang Zhang, Jun Zhou, Lanning Wei, Da Zheng, and Huajun Chen.

open-deep-research
Open Deep Research is an open-source tool designed to generate AI-powered reports from web search results efficiently. It combines Bing Search API for search results retrieval, JinaAI for content extraction, and customizable report generation. Users can customize settings, export reports in multiple formats, and benefit from rate limiting for stability. The tool aims to streamline research and report creation in a user-friendly platform.

deepdoctection
**deep** doctection is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models but enables you to build pipelines using highly acknowledged libraries for object detection, OCR and selected NLP tasks and provides an integrated framework for fine-tuning, evaluating and running models. For more specific text processing tasks use one of the many other great NLP libraries. **deep** doctection focuses on applications and is made for those who want to solve real world problems related to document extraction from PDFs or scans in various image formats. **deep** doctection provides model wrappers of supported libraries for various tasks to be integrated into pipelines. Its core function does not depend on any specific deep learning library. Selected models for the following tasks are currently supported: * Document layout analysis including table recognition in Tensorflow with **Tensorpack**, or PyTorch with **Detectron2**, * OCR with support of **Tesseract**, **DocTr** (Tensorflow and PyTorch implementations available) and a wrapper to an API for a commercial solution, * Text mining for native PDFs with **pdfplumber**, * Language detection with **fastText**, * Deskewing and rotating images with **jdeskew**. * Document and token classification with all LayoutLM models provided by the **Transformer library**. (Yes, you can use any LayoutLM-model with any of the provided OCR-or pdfplumber tools straight away!). * Table detection and table structure recognition with **table-transformer**. * There is a small dataset for token classification available and a lot of new tutorials to show, how to train and evaluate this dataset using LayoutLMv1, LayoutLMv2, LayoutXLM and LayoutLMv3. * Comprehensive configuration of **analyzer** like choosing different models, output parsing, OCR selection. Check this notebook or the docs for more infos. * Document layout analysis and table recognition now runs with **Torchscript** (CPU) as well and **Detectron2** is not required anymore for basic inference. * [**new**] More angle predictors for determining the rotation of a document based on **Tesseract** and **DocTr** (not contained in the built-in Analyzer). * [**new**] Token classification with **LiLT** via **transformers**. We have added a model wrapper for token classification with LiLT and added a some LiLT models to the model catalog that seem to look promising, especially if you want to train a model on non-english data. The training script for LayoutLM can be used for LiLT as well and we will be providing a notebook on how to train a model on a custom dataset soon. **deep** doctection provides on top of that methods for pre-processing inputs to models like cropping or resizing and to post-process results, like validating duplicate outputs, relating words to detected layout segments or ordering words into contiguous text. You will get an output in JSON format that you can customize even further by yourself. Have a look at the **introduction notebook** in the notebook repo for an easy start. Check the **release notes** for recent updates. **deep** doctection or its support libraries provide pre-trained models that are in most of the cases available at the **Hugging Face Model Hub** or that will be automatically downloaded once requested. For instance, you can find pre-trained object detection models from the Tensorpack or Detectron2 framework for coarse layout analysis, table cell detection and table recognition. Training is a substantial part to get pipelines ready on some specific domain, let it be document layout analysis, document classification or NER. **deep** doctection provides training scripts for models that are based on trainers developed from the library that hosts the model code. Moreover, **deep** doctection hosts code to some well established datasets like **Publaynet** that makes it easy to experiment. It also contains mappings from widely used data formats like COCO and it has a dataset framework (akin to **datasets** so that setting up training on a custom dataset becomes very easy. **This notebook** shows you how to do this. **deep** doctection comes equipped with a framework that allows you to evaluate predictions of a single or multiple models in a pipeline against some ground truth. Check again **here** how it is done. Having set up a pipeline it takes you a few lines of code to instantiate the pipeline and after a for loop all pages will be processed through the pipeline.

llm-document-ocr
LLM Document OCR is a Node.js tool that utilizes GPT4 and Claude3 for OCR and data extraction. It converts PDFs into PNGs, crops white-space, cleans up JSON strings, and supports various image formats. Users can customize prompts for data extraction. The tool is sponsored by Mercoa, offering API for BillPay and Invoicing.

instructor-php
Instructor for PHP is a library designed for structured data extraction in PHP, powered by Large Language Models (LLMs). It simplifies the process of extracting structured, validated data from unstructured text or chat sequences. Instructor enhances workflow by providing a response model, validation capabilities, and max retries for requests. It supports classes as response models and provides features like partial results, string input, extracting scalar and enum values, and specifying data models using PHP type hints or DocBlock comments. The library allows customization of validation and provides detailed event notifications during request processing. Instructor is compatible with PHP 8.2+ and leverages PHP reflection, Symfony components, and SaloonPHP for communication with LLM API providers.

Awesome-Segment-Anything
Awesome-Segment-Anything is a powerful tool for segmenting and extracting information from various types of data. It provides a user-friendly interface to easily define segmentation rules and apply them to text, images, and other data formats. The tool supports both supervised and unsupervised segmentation methods, allowing users to customize the segmentation process based on their specific needs. With its versatile functionality and intuitive design, Awesome-Segment-Anything is ideal for data analysts, researchers, content creators, and anyone looking to efficiently extract valuable insights from complex datasets.

crawl4ai
Crawl4AI is a powerful and free web crawling service that extracts valuable data from websites and provides LLM-friendly output formats. It supports crawling multiple URLs simultaneously, replaces media tags with ALT, and is completely free to use and open-source. Users can integrate Crawl4AI into Python projects as a library or run it as a standalone local server. The tool allows users to crawl and extract data from specified URLs using different providers and models, with options to include raw HTML content, force fresh crawls, and extract meaningful text blocks. Configuration settings can be adjusted in the `crawler/config.py` file to customize providers, API keys, chunk processing, and word thresholds. Contributions to Crawl4AI are welcome from the open-source community to enhance its value for AI enthusiasts and developers.

OpenAI-CLIP-Feature
This repository provides code for extracting image and text features using OpenAI CLIP models, supporting both global and local grid visual features. It aims to facilitate multi visual-and-language downstream tasks by allowing users to customize input and output grid resolution easily. The extracted features have shown comparable or superior results in image captioning tasks without hyperparameter tuning. The repo supports various CLIP models and provides detailed information on supported settings and results on MSCOCO image captioning. Users can get started by setting up experiments with the extracted features using X-modaler.

blinkid-ios
BlinkID iOS is a mobile SDK that enables developers to easily integrate ID scanning and data extraction capabilities into their iOS applications. The SDK supports scanning and processing various types of identity documents, such as passports, driver's licenses, and ID cards. It provides accurate and fast data extraction, including personal information and document details. With BlinkID iOS, developers can enhance their apps with secure and reliable ID verification functionality, improving user experience and streamlining identity verification processes.

SemanticFinder
SemanticFinder is a frontend-only live semantic search tool that calculates embeddings and cosine similarity client-side using transformers.js and SOTA embedding models from Huggingface. It allows users to search through large texts like books with pre-indexed examples, customize search parameters, and offers data privacy by keeping input text in the browser. The tool can be used for basic search tasks, analyzing texts for recurring themes, and has potential integrations with various applications like wikis, chat apps, and personal history search. It also provides options for building browser extensions and future ideas for further enhancements and integrations.

VMind
VMind is an open-source solution for intelligent visualization, providing an intelligent chart component based on LLM by VisActor. It allows users to create chart narrative works with natural language interaction, edit charts through dialogue, and export narratives as videos or GIFs. The tool is easy to use, scalable, supports various chart types, and offers one-click export functionality. Users can customize chart styles, specify themes, and aggregate data using LLM models. VMind aims to enhance efficiency in creating data visualization works through dialogue-based editing and natural language interaction.

doc2plan
doc2plan is a browser-based application that helps users create personalized learning plans by extracting content from documents. It features a Creator for manual or AI-assisted plan construction and a Viewer for interactive plan navigation. Users can extract chapters, key topics, generate quizzes, and track progress. The application includes AI-driven content extraction, quiz generation, progress tracking, plan import/export, assistant management, customizable settings, viewer chat with text-to-speech and speech-to-text support, and integration with various Retrieval-Augmented Generation (RAG) models. It aims to simplify the creation of comprehensive learning modules tailored to individual needs.

CyberScraper-2077
CyberScraper 2077 is an advanced web scraping tool powered by AI, designed to extract data from websites with precision and style. It offers a user-friendly interface, supports multiple data export formats, operates in stealth mode to avoid detection, and promises lightning-fast scraping. The tool respects ethical scraping practices, including robots.txt and site policies. With upcoming features like proxy support and page navigation, CyberScraper 2077 is a futuristic solution for data extraction in the digital realm.

PulsarRPA
PulsarRPA is a high-performance, distributed, open-source Robotic Process Automation (RPA) framework designed to handle large-scale RPA tasks with ease. It provides a comprehensive solution for browser automation, web content understanding, and data extraction. PulsarRPA addresses challenges of browser automation and accurate web data extraction from complex and evolving websites. It incorporates innovative technologies like browser rendering, RPA, intelligent scraping, advanced DOM parsing, and distributed architecture to ensure efficient, accurate, and scalable web data extraction. The tool is open-source, customizable, and supports cutting-edge information extraction technology, making it a preferred solution for large-scale web data extraction.

crawlee-python
Crawlee-python is a web scraping and browser automation library that covers crawling and scraping end-to-end, helping users build reliable scrapers fast. It allows users to crawl the web for links, scrape data, and store it in machine-readable formats without worrying about technical details. With rich configuration options, users can customize almost any aspect of Crawlee to suit their project's needs.

ai-voice-cloning
This repository provides a tool for AI voice cloning, allowing users to generate synthetic speech that closely resembles a target speaker's voice. The tool is designed to be user-friendly and accessible, with a graphical user interface that guides users through the process of training a voice model and generating synthetic speech. The tool also includes a variety of features that allow users to customize the generated speech, such as the pitch, volume, and speaking rate. Overall, this tool is a valuable resource for anyone interested in creating realistic and engaging synthetic speech.

AutoGroq
AutoGroq is a revolutionary tool that dynamically generates tailored teams of AI agents based on project requirements, eliminating manual configuration. It enables users to effortlessly tackle questions, problems, and projects by creating expert agents, workflows, and skillsets with ease and efficiency. With features like natural conversation flow, code snippet extraction, and support for multiple language models, AutoGroq offers a seamless and intuitive AI assistant experience for developers and users.

readme-ai
README-AI is a developer tool that auto-generates README.md files using a combination of data extraction and generative AI. It streamlines documentation creation and maintenance, enhancing developer productivity. This project aims to enable all skill levels, across all domains, to better understand, use, and contribute to open-source software. It offers flexible README generation, supports multiple large language models (LLMs), provides customizable output options, works with various programming languages and project types, and includes an offline mode for generating boilerplate README files without external API calls.
20 - OpenAI Gpts

Tattoo Ideas GPT
Helps design and customize tattoos, recommends artists, and provides aftercare advice.

Quick QR Art - QR Code AI Art Generator
Create, Customize, and Track Stunning QR Codes Art with Our Free QR Code AI Art Generator. Seamlessly integrate these artistic codes into your marketing materials, packaging, and digital platforms.

Instant Command GPT
Executes tasks via short commands instantly, using a single seesion to customize commands.

GAPP STORE
Welcome to GAPP Store: Chat, create, customize—your all-in-one AI app universe

Sneaker Genius
Expert in sneaker customization, buying, collecting, and offering detailed advice on painting techniques and design inspiration

Preference Card Estimator
Generates detailed orthopedic surgery cards using uploaded formats.

Vikas' Scripting Helper
Guides in creating, customizing Airtable scripts with user-friendly explanations.

QR Code Creator & Customizer
Create a QR code in 30 seconds + add a cool design effect or overlay it on top of any image. Free, no watermarks, no email required, and we don't store your messages/images.

Corporate Trainer
Develops training programs, customizing content to fit corporate culture and objectives.