Best AI tools for< Split Documents >
20 - AI tool Sites

Procys
Procys is a document processing platform powered by AI that offers automated document processing solutions. It provides features such as a self-learning engine, seamless integration with ERP systems, OCR API powered by AI, customized data extraction, and AI autosplit for automatic document splitting. Procys helps with tasks like invoice OCR, ID card OCR, receipt OCR, and account payable automation. The platform aims to streamline document workflows, eliminate manual processes, save time, reduce errors, and ensure compliance for businesses.

Doclingo
Doclingo is an AI-powered document translation tool that supports translating documents in various formats such as PDF, Word, Excel, PowerPoint, SRT subtitles, ePub ebooks, AR&ZIP packages, and more. It utilizes large language models to provide accurate and professional translations, preserving the original layout of the documents. Users can enjoy a limited-time free trial upon registration, with the option to subscribe for more features. Doclingo aims to offer high-quality translation services through continuous algorithm improvements.

PDF Translator & Editor
PDF Translator & Editor is a powerful AI-driven tool designed for translating and editing PDF documents. It supports the translation of native PDF, scanned PDF, Word, Excel, PowerPoint, and image files to 136 languages while preserving the original format and layout. The tool also offers versatile PDF conversion capabilities, allowing users to convert PDFs to images and vice versa, edit PDF text, scan to PDF, and split PDF files. With AI-driven translation powered by Google and Microsoft's Neural Machine Translation models, the tool provides automatic language detection and multilingual document translation. Trusted by users worldwide, PDF Translator & Editor offers unlimited access without file size or page limits.

Docsumo
Docsumo is an advanced Document AI platform designed for scalability and efficiency. It offers a wide range of capabilities such as pre-processing documents, extracting data, reviewing and analyzing documents. The platform provides features like document classification, touchless processing, ready-to-use AI models, auto-split functionality, and smart table extraction. Docsumo is a leader in intelligent document processing and is trusted by various industries for its accurate data extraction capabilities. The platform enables enterprises to digitize their document processing workflows, reduce manual efforts, and maximize data accuracy through its AI-powered solutions.

pdfAssistant
pdfAssistant is a powerful AI chatbot designed to assist users with various PDF processing tasks. It offers a user-friendly chat-based interface that allows users to convert, watermark, merge, split, and perform other PDF-related operations using natural language commands. The application is powered by industry-leading PDF and AI technology, providing fast and accurate results. With pdfAssistant, users can work smarter and more efficiently by simplifying complex PDF software processes.

**万兴科技**
**万兴科技** is an AI-powered tool that helps users create and edit PDF documents. It offers a wide range of features, including the ability to convert PDFs to other formats, edit text and images, and add annotations. **万兴科技** is a valuable tool for anyone who needs to work with PDFs on a regular basis.

LightPDF
LightPDF is an AI-powered, free online PDF editor, converter, and reader. It offers a wide range of PDF tools, including the ability to convert PDFs to and from other formats, edit PDFs, add watermarks, split and merge PDFs, rotate PDFs, annotate PDFs, optimize PDFs, compress PDFs, perform OCR on PDFs, and protect PDFs. LightPDF also offers a variety of AI-powered features, such as an AI chatbot that can answer questions about documents and an AI-powered OCR engine that can convert scanned PDFs and images to text.

HiPDF
HiPDF is a free online PDF solution that offers a wide range of tools for editing, converting, compressing, and organizing PDFs. It also includes AI-powered tools such as Chat with PDF and AI Detector. With HiPDF, you can easily edit PDFs in your browser, convert PDFs to and from other formats, compress PDFs to reduce their size, and merge, split, and extract images from PDFs. You can also protect your PDFs with passwords and redact sensitive information. HiPDF is a convenient and easy-to-use tool that can help you with all your PDF needs.

MJSplitter
MJSplitter is a free online tool that allows users to split their Midjourney Grid images into single images. Users can either paste or upload their image grid, and the tool will automatically split the images and save them as JPEGs. The tool is not affiliated with Midjourney, and images are deleted from the server after 24 hours.

SplitSong
SplitSong.com is an AI-powered tool that allows users to split songs into individual instrument tracks. Created by @markdoppler_, the website offers a user-friendly platform where users can upload their songs or extract from YouTube. By leveraging Artificial Intelligence technology, SplitSong.com enables users to download separate tracks for drums, instrumental, bass, and voice, providing a unique and innovative way to interact with music.

Vocalremover.org
Vocalremover.org is a website that offers a tool to remove vocals from music tracks. Users can upload their audio files and the tool will process them to create a version without vocals. The site aims to provide a simple and efficient solution for musicians, DJs, and music enthusiasts who want to create karaoke tracks or remixes without vocals.

LALAL.AI
LALAL.AI is a next-generation AI-powered vocal remover and music source separation service that offers fast, easy, and precise stem extraction. It allows users to remove vocals, instrumental tracks, drums, bass, guitar, and more from audio and video files without compromising quality. The platform leverages advanced AI technology to provide high-quality stem splitting based on transformer-based audio separation approach, ensuring seamless integration and cross-platform support for individuals and businesses.

Gaudio Studio
Gaudio Studio is an AI music separation tool designed for creators to unleash their creativity with ease. It allows users to extract background music, separate instruments, and remove vocals from any music content. Powered by GSEP (Gaudio source SEParation), a high-quality and easy-to-use AI stem separation model, Gaudio Studio offers a seamless experience for audio separation. Users can upload their songs in various formats, access the tool from desktop or mobile devices, and enjoy Studio Plans for advanced processing. Additionally, Gaudio Studio can be integrated with cloud APIs and On-device SDKs for business applications, offering a versatile solution for music professionals and enthusiasts.

Music Demixer
Music Demixer is an AI-powered online tool that offers advanced stem separation and automatic music transcription capabilities. Users can effortlessly isolate vocals, drums, bass, melody, guitar, and piano in their music tracks, generate precise MIDI files, scores, and sheet music. The tool is perfect for musicians, DJs, producers, and creators looking for a simple and superior solution for music editing and transcription. With a focus on privacy, Music Demixer operates entirely in the browser without cloud storage. It leverages cutting-edge AI models and technologies from the Sony Music Demixing Challenges to provide high-quality results.

Jamorphosia
Jamorphosia is an AI-powered application that allows users to remove instruments from a song. With advanced technology and audio separation capabilities, users can easily extract vocals, isolate specific instruments, and create custom backing tracks. The tool transforms audio files into personalized songs, suitable for practice or performance. Jamorphosia enhances the music experience by providing a platform for musicians to engage with original tracks in a more immersive way.

AdCopy
AdCopy is an AI-powered advertising platform that helps businesses create high-quality ads and optimize their ad campaigns. The platform uses AI to generate ad copy, create ad creatives, and provide insights into ad performance. AdCopy is designed to help businesses save time and money on their advertising campaigns, while also improving their results.

SplitMyExpenses
SplitMyExpenses is an AI-powered application designed to simplify shared expenses with friends. It allows users to create groups, split bills effortlessly, track debts, and settle up using integrated payment apps. The app offers modern design, AI receipt itemization, friend data powered by payment apps, and beautiful spending charts to help users manage their expenses efficiently. With over 150 supported currencies and secure handling of data, SplitMyExpenses revolutionizes the age-old problem of bill splitting, providing users with a stress-free experience.

SplitParty
SplitParty is an AI-powered application that simplifies the process of splitting complicated bills with friends. Users can easily split a bill by taking a photo of the receipt, allowing the AI to identify items, quantities, and prices. The application enables users to add friends, select items ordered, and split the bill effortlessly. SplitParty Plus offers additional features such as creating groups, saving bill history, and more. Developed by @dqnamo, SplitParty is a bootstrapped indie product designed to streamline bill-splitting experiences.

SnaptoBook
SanptoBook is a personal accounting software designed to help individuals manage their finances efficiently. It offers features such as invoice and receipt management, reimbursement facilitation, tax filing assistance, bill splitting, and project tracking. The application aims to simplify financial tasks and improve overall financial organization for users. With AI-powered efficiency, SnaptoBook provides state-of-the-art receipt recognition technology and secure cloud storage for all receipts.

Splitter.ai
Splitter.ai is an AI-driven audio processing platform developed by a Swedish research company. It offers advanced audio processing technologies, including stem separation/extraction, reverb removal, and direct YouTube splitting. The platform is designed to assist music producers, DJs, artists, forensics engineers, audio engineers, karaoke enthusiasts, police, scientists, and more in enhancing their audio processing tasks. Splitter.ai aims to provide high-quality services through AI-driven solutions to meet the diverse needs of its users.
20 - Open Source AI Tools

ExtractThinker
ExtractThinker is a library designed for extracting data from files and documents using Language Model Models (LLMs). It offers ORM-style interaction between files and LLMs, supporting multiple document loaders such as Tesseract OCR, Azure Form Recognizer, AWS TextExtract, and Google Document AI. Users can customize extraction using contract definitions, process documents asynchronously, handle various document formats efficiently, and split and process documents. The project is inspired by the LangChain ecosystem and focuses on Intelligent Document Processing (IDP) using LLMs to achieve high accuracy in document extraction tasks.

ai-chunking
AI Chunking is a powerful Python library for semantic document chunking and enrichment using AI. It provides intelligent document chunking capabilities with various strategies to split text while preserving semantic meaning, particularly useful for processing markdown documentation. The library offers multiple chunking strategies such as Recursive Text Splitting, Section-based Semantic Chunking, and Base Chunking. Users can configure chunk sizes, overlap, and support various text formats. The tool is easy to extend with custom chunking strategies, making it versatile for different document processing needs.

chipper
Chipper provides a web interface, CLI, and architecture for pipelines, document chunking, web scraping, and query workflows. It is built with Haystack, Ollama, Hugging Face, Docker, Tailwind, and ElasticSearch, running locally or as a Dockerized service. Originally created to assist in creative writing, it now offers features like local Ollama and Hugging Face API, ElasticSearch embeddings, document splitting, web scraping, audio transcription, user-friendly CLI, and Docker deployment. The project aims to be educational, beginner-friendly, and a playground for AI exploration and innovation.

pixeltable
Pixeltable is a Python library designed for ML Engineers and Data Scientists to focus on exploration, modeling, and app development without the need to handle data plumbing. It provides a declarative interface for working with text, images, embeddings, and video, enabling users to store, transform, index, and iterate on data within a single table interface. Pixeltable is persistent, acting as a database unlike in-memory Python libraries such as Pandas. It offers features like data storage and versioning, combined data and model lineage, indexing, orchestration of multimodal workloads, incremental updates, and automatic production-ready code generation. The tool emphasizes transparency, reproducibility, cost-saving through incremental data changes, and seamless integration with existing Python code and libraries.

llama_ros
This repository provides a set of ROS 2 packages to integrate llama.cpp into ROS 2. By using the llama_ros packages, you can easily incorporate the powerful optimization capabilities of llama.cpp into your ROS 2 projects by running GGUF-based LLMs and VLMs.

llms
The 'llms' repository is a comprehensive guide on Large Language Models (LLMs), covering topics such as language modeling, applications of LLMs, statistical language modeling, neural language models, conditional language models, evaluation methods, transformer-based language models, practical LLMs like GPT and BERT, prompt engineering, fine-tuning LLMs, retrieval augmented generation, AI agents, and LLMs for computer vision. The repository provides detailed explanations, examples, and tools for working with LLMs.

document-ai-samples
The Google Cloud Document AI Samples repository contains code samples and Community Samples demonstrating how to analyze, classify, and search documents using Google Cloud Document AI. It includes various projects showcasing different functionalities such as integrating with Google Drive, processing documents using Python, content moderation with Dialogflow CX, fraud detection, language extraction, paper summarization, tax processing pipeline, and more. The repository also provides access to test document files stored in a publicly-accessible Google Cloud Storage Bucket. Additionally, there are codelabs available for optical character recognition (OCR), form parsing, specialized processors, and managing Document AI processors. Community samples, like the PDF Annotator Sample, are also included. Contributions are welcome, and users can seek help or report issues through the repository's issues page. Please note that this repository is not an officially supported Google product and is intended for demonstrative purposes only.

rlama
RLAMA is a powerful AI-driven question-answering tool that seamlessly integrates with local Ollama models. It enables users to create, manage, and interact with Retrieval-Augmented Generation (RAG) systems tailored to their documentation needs. RLAMA follows a clean architecture pattern with clear separation of concerns, focusing on lightweight and portable RAG capabilities with minimal dependencies. The tool processes documents, generates embeddings, stores RAG systems locally, and provides contextually-informed responses to user queries. Supported document formats include text, code, and various document types, with troubleshooting steps available for common issues like Ollama accessibility, text extraction problems, and relevance of answers.

dynamiq
Dynamiq is an orchestration framework designed to streamline the development of AI-powered applications, specializing in orchestrating retrieval-augmented generation (RAG) and large language model (LLM) agents. It provides an all-in-one Gen AI framework for agentic AI and LLM applications, offering tools for multi-agent orchestration, document indexing, and retrieval flows. With Dynamiq, users can easily build and deploy AI solutions for various tasks.

awesome-llm
Awesome LLM is a curated list of resources related to Large Language Models (LLMs), including models, projects, datasets, benchmarks, materials, papers, posts, GitHub repositories, HuggingFace repositories, and reading materials. It provides detailed information on various LLMs, their parameter sizes, announcement dates, and contributors. The repository covers a wide range of LLM-related topics and serves as a valuable resource for researchers, developers, and enthusiasts interested in the field of natural language processing and artificial intelligence.

LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing
LLM-PowerHouse is a comprehensive and curated guide designed to empower developers, researchers, and enthusiasts to harness the true capabilities of Large Language Models (LLMs) and build intelligent applications that push the boundaries of natural language understanding. This GitHub repository provides in-depth articles, codebase mastery, LLM PlayLab, and resources for cost analysis and network visualization. It covers various aspects of LLMs, including NLP, models, training, evaluation metrics, open LLMs, and more. The repository also includes a collection of code examples and tutorials to help users build and deploy LLM-based applications.

Controllable-RAG-Agent
This repository contains a sophisticated deterministic graph-based solution for answering complex questions using a controllable autonomous agent. The solution is designed to ensure that answers are solely based on the provided data, avoiding hallucinations. It involves various steps such as PDF loading, text preprocessing, summarization, database creation, encoding, and utilizing large language models. The algorithm follows a detailed workflow involving planning, retrieval, answering, replanning, content distillation, and performance evaluation. Heuristics and techniques implemented focus on content encoding, anonymizing questions, task breakdown, content distillation, chain of thought answering, verification, and model performance evaluation.

lloco
LLoCO is a technique that learns documents offline through context compression and in-domain parameter-efficient finetuning using LoRA, which enables LLMs to handle long context efficiently.

WDoc
WDoc is a powerful Retrieval-Augmented Generation (RAG) system designed to summarize, search, and query documents across various file types. It supports querying tens of thousands of documents simultaneously, offers tailored summaries to efficiently manage large amounts of information, and includes features like supporting multiple file types, various LLMs, local and private LLMs, advanced RAG capabilities, advanced summaries, trust verification, markdown formatted answers, sophisticated embeddings, extensive documentation, scriptability, type checking, lazy imports, caching, fast processing, shell autocompletion, notification callbacks, and more. WDoc is ideal for researchers, students, and professionals dealing with extensive information sources.

gfm-rag
The GFM-RAG is a graph foundation model-powered pipeline that combines graph neural networks to reason over knowledge graphs and retrieve relevant documents for question answering. It features a knowledge graph index, efficiency in multi-hop reasoning, generalizability to unseen datasets, transferability for fine-tuning, compatibility with agent-based frameworks, and interpretability of reasoning paths. The tool can be used for conducting retrieval and question answering tasks using pre-trained models or fine-tuning on custom datasets.

wdoc
wdoc is a powerful Retrieval-Augmented Generation (RAG) system designed to summarize, search, and query documents across various file types. It aims to handle large volumes of diverse document types, making it ideal for researchers, students, and professionals dealing with extensive information sources. wdoc uses LangChain to process and analyze documents, supporting tens of thousands of documents simultaneously. The system includes features like high recall and specificity, support for various Language Model Models (LLMs), advanced RAG capabilities, advanced document summaries, and support for multiple tasks. It offers markdown-formatted answers and summaries, customizable embeddings, extensive documentation, scriptability, and runtime type checking. wdoc is suitable for power users seeking document querying capabilities and AI-powered document summaries.

how-to-optim-algorithm-in-cuda
This repository documents how to optimize common algorithms based on CUDA. It includes subdirectories with code implementations for specific optimizations. The optimizations cover topics such as compiling PyTorch from source, NVIDIA's reduce optimization, OneFlow's elementwise template, fast atomic add for half data types, upsample nearest2d optimization in OneFlow, optimized indexing in PyTorch, OneFlow's softmax kernel, linear attention optimization, and more. The repository also includes learning resources related to deep learning frameworks, compilers, and optimization techniques.

sec-parser
The `sec-parser` project simplifies extracting meaningful information from SEC EDGAR HTML documents by organizing them into semantic elements and a tree structure. It helps in parsing SEC filings for financial and regulatory analysis, analytics and data science, AI and machine learning, causal AI, and large language models. The tool is especially beneficial for AI, ML, and LLM applications by streamlining data pre-processing and feature extraction.

raglite
RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite. It offers configurable options for choosing LLM providers, database types, and rerankers. The toolkit is fast and permissive, utilizing lightweight dependencies and hardware acceleration. RAGLite provides features like PDF to Markdown conversion, multi-vector chunk embedding, optimal semantic chunking, hybrid search capabilities, adaptive retrieval, and improved output quality. It is extensible with a built-in Model Context Protocol server, customizable ChatGPT-like frontend, document conversion to Markdown, and evaluation tools. Users can configure RAGLite for various tasks like configuring, inserting documents, running RAG pipelines, computing query adapters, evaluating performance, running MCP servers, and serving frontends.

DB-GPT
DB-GPT is a personal database administrator that can solve database problems by reading documents, using various tools, and writing analysis reports. It is currently undergoing an upgrade. **Features:** * **Online Demo:** * Import documents into the knowledge base * Utilize the knowledge base for well-founded Q&A and diagnosis analysis of abnormal alarms * Send feedbacks to refine the intermediate diagnosis results * Edit the diagnosis result * Browse all historical diagnosis results, used metrics, and detailed diagnosis processes * **Language Support:** * English (default) * Chinese (add "language: zh" in config.yaml) * **New Frontend:** * Knowledgebase + Chat Q&A + Diagnosis + Report Replay * **Extreme Speed Version for localized llms:** * 4-bit quantized LLM (reducing inference time by 1/3) * vllm for fast inference (qwen) * Tiny LLM * **Multi-path extraction of document knowledge:** * Vector database (ChromaDB) * RESTful Search Engine (Elasticsearch) * **Expert prompt generation using document knowledge** * **Upgrade the LLM-based diagnosis mechanism:** * Task Dispatching -> Concurrent Diagnosis -> Cross Review -> Report Generation * Synchronous Concurrency Mechanism during LLM inference * **Support monitoring and optimization tools in multiple levels:** * Monitoring metrics (Prometheus) * Flame graph in code level * Diagnosis knowledge retrieval (dbmind) * Logical query transformations (Calcite) * Index optimization algorithms (for PostgreSQL) * Physical operator hints (for PostgreSQL) * Backup and Point-in-time Recovery (Pigsty) * **Continuously updated papers and experimental reports** This project is constantly evolving with new features. Don't forget to star ⭐ and watch 👀 to stay up to date.
7 - OpenAI Gpts

Split Screen Ad Engine
Simply Enter your Niche and we'll create your Split Screen Ads for you.

RFP Proposal Pro (IT / Software Sales assistant)
Step 1: Upload RFP Step 2: Prompt: I need a comprehensive summary of the RFP. Split the summary in multiple blocks / section. After giving me one section wait for my command to move to the next section. Step 3: Prompt: Move to the next section, please :)

Pace Assistant
Provides running splits for Strava Routes, accounting for distance and elevation changes