Best AI tools for< Scan Documents >
20 - AI tool Sites
Scanner Go
Scanner Go is a free PDF tool that offers easy-to-use scanning and conversion features. Users can quickly scan various types of documents, images, and books, and convert them to PDF format. The tool also provides OCR technology for extracting text from PDFs and images, as well as options for managing, editing, printing, and sharing documents. With cloud storage access, users can securely store and access their documents from any device. Scanner Go aims to simplify the digitization process and enhance productivity.
PDF Translator & Editor
PDF Translator & Editor is a powerful AI-driven tool that offers multilingual document translation with format and layout preservation. It supports translation of native PDF, scanned PDF, Word, Excel, PowerPoint, and image files to 136 languages. The tool is equipped with Google and Microsoft's Neural Machine Translation models, ensuring accurate and efficient translations. With versatile PDF conversion and editing capabilities, users can easily convert PDF files to images and vice versa, edit PDF text, scan documents to PDF, and split PDF files. PDF Translator & Editor has a global user base and is trusted by users from over 200 countries and regions. It provides unlimited access with no file size or page limits, and offers seamless integration with other apps through the share extension.
Codeway
Codeway is a leading mobile AI app developer that actively supports earthquake relief efforts in Turkey. With a focus on creating AI-powered apps, Codeway leverages cutting-edge AI technologies to deliver unparalleled user experiences. The company invests in R&D operations to ensure excellence in technology implementation, and is committed to understanding user needs for continuous app evolution. Codeway's products include mobile apps like Cleanup, Scanner+, Ask AI, Facedance, Wonder, Rumble Rivals, and PixelUp. The company excels in marketing, product management, and culture, attracting top talent and fostering a data-driven roadmap to success.
Woy AI Tools
Woy AI Tools is an online tool that offers free image to text conversion with over 99% accuracy and automatic recognition of more than 100 languages. Users can easily upload an image and receive the textual information contained within it. The tool supports multiple languages, prioritizes user privacy and data protection, has a simple and user-friendly interface, and is available for free usage. It utilizes advanced machine learning and OCR technology to continuously optimize recognition algorithms for clear and high-resolution images.
MixerBox
MixerBox is an AI-powered platform offering a variety of super-apps designed to simplify and enhance daily life. It includes features such as AI chatbot, GPTs, social map, unlimited shows and news, smart scanning, meditation, bubble shooter game, cashback rewards, and more. MixerBox aims to provide users with convenience, entertainment, and productivity tools through innovative AI technology.
ReadyRunner
ReadyRunner is a ChatGPT powered AI assistant application designed for desktop and web use. It offers three chat types - Assistant chat for standard AI interactions, ScratchPad for collaborative code/text editing, and Document Chat for document-related queries. The application provides features like Global Hotkey Access, System Prompt Library, Messages stream in from the top, Assistant Memory, Multi-line composer with history, and GPT-3 & GPT-4 Model Switcher.
Evernote
Evernote is a powerful note-taking application that helps users organize their notes, tasks, and schedules in one place. It offers features such as AI-powered search, collaboration tools, web clipping, document scanning, and personalization options. Users can access their information across all devices, even in offline mode. Evernote is suitable for executives, entrepreneurs, students, and creative individuals to capture and arrange their ideas efficiently.
Genaios
Genaios is an AI-powered web application and Chrome plugin that helps users detect and verify the authenticity of online information, particularly in distinguishing between real content and AI-generated texts. With the power of AI, Genaios enables users to fact-check documents, validate sources, and identify AI-generated texts in multiple languages. The application aims to combat fake news and information overload on the internet, providing a reliable solution for users to trust the media again.
WhatLetter
WhatLetter is an AI-powered document translation tool designed to help immigrant families and seniors navigate important paperwork without language barriers. Users can simply snap a photo of any document to get instant insights and translations in over 30 languages. The AI chatbot provides personalized explanations and answers, ensuring a global experience. WhatLetter prioritizes user privacy by not saving images on servers and retaining chat history solely for user reference. With upcoming features like WhatsApp and Telegram integration, WhatLetter aims to make understanding important paperwork easy and accessible.
Speak4Me
Speak4Me is a text-to-speech application that converts any text file, including PDFs and websites, into audible content. It enables users to listen to their documents or school materials anytime, anywhere. With features like scanning physical or digital text, reading web pages aloud, and a new ChatWithMe function, Speak4Me aims to enhance reading experiences and improve focus for individuals with reading issues. The application is trusted by over 15,000 people on the App Store and offers a free version for schools, making education more accessible for everyone.
SparkReceipt
SparkReceipt is an AI-powered receipt scanner, expense tracker, and document manager application that streamlines pre-accounting tasks by reducing manual data entry up to 95%. It allows users to scan receipts, invoices, and bank statements, track expenses and income with AI-powered scanning and automatic categorization. The application works in any language and supports 150 currencies. SparkReceipt offers features like automatic data extraction (OCR), forwarding e-receipts from email, managing finances across borders, separating business and personal expenses, real-time profit/loss monitoring, and lightning-fast expense tracking.
AnyToSpeech
AnyToSpeech is an AI text-to-speech and PDF to Audiobook solution that offers a clean and simple way to convert text, PDFs, documents, scans, and images to speech. It provides a variety of realistic voices in multiple languages for users to choose from. The platform also allows users to convert URLs to speech and offers a library to save and access their generated audio files at any time.
Procys
Procys is a document processing platform powered by AI solutions. It offers a self-learning engine for document processing, seamless integration with over 260 apps, OCR API powered by AI for optical character recognition, customized data extraction capabilities, and AI autosplit feature for automatic document splitting. Procys caters to various use cases such as invoice OCR, ID card OCR, receipt OCR, and account payable automation. It ensures data security as a top priority and simplifies the digitalization of documents in PDF, scan, or image format.
AI Manga Translator
AI Manga Translator is an online tool powered by AI technology that allows users to upload and translate manga instantly. It supports multiple languages and translation engines, ensuring precision manga translation without altering the original style. The tool is user-friendly, making it accessible to all users, whether they are manga fans or professionals needing document translations. AI Manga Translator offers various plans for different translation needs, with accurate and fast translations powered by AI technology.
Zing Coach
Zing Coach is a fitness coaching website that helps individuals of all fitness levels achieve their health goals. The platform offers personalized workout plans, nutrition guidance, and motivation to keep users on track. Whether you are a beginner looking to start your fitness journey or an advanced athlete seeking to improve performance, Zing Coach provides the tools and support needed to succeed.
Siwalu
Siwalu is an AI-based image recognition application that specializes in identifying animals. The app provides specific information about the characteristics and traits of pets, enabling pet owners to learn more about their pets quickly and accurately. By using advanced AI technology, Siwalu offers a reliable statement about the breed of pets within seconds, eliminating the need for time-consuming and costly DNA analysis. The app focuses on recognizing various species, including purebred and mixed breed dogs, cats, and horses, with a goal to increase knowledge about global biodiversity.
MagiScan
MagiScan is a 3D scanner app available for iOS and Android platforms. It utilizes AI technology to provide users with a simple and professional tool for creating high-quality 3D models. The app offers various export formats, affordable pricing, and seamless integration with NVIDIA Omniverse. MagiScan aims to digitize objects quickly to meet the increasing demand for 3D content. The application is user-friendly, catering to both professionals and amateurs, with a focus on continuous improvement based on user feedback.
Qlone
Qlone is a user-friendly 3D scanning app that allows users to easily create 3D models using their smartphone or tablet. The app offers seamless integration with leading 3D platforms for printing, sharing, and selling models. Users can create AR menus, scan various objects like food, people, and art, and engage in educational activities. Qlone is developed by EyeCue Vision Technologies LTD and is designed to provide a simple and efficient 3D scanning experience.
ScanMyGolfBall
ScanMyGolfBall is an AI-powered application designed to revolutionize the golfing experience. By simply scanning any golf ball, users can uncover detailed insights and receive personalized recommendations to enhance their gameplay. The app features advanced AI algorithms for ball analysis, personalized ball fitting, and hassle-free user experience. With a user-friendly interface and a focus on privacy and security, ScanMyGolfBall aims to provide golfers with the perfect ball for their unique playing style, ultimately elevating their performance on the course.
OpalAi
OpalAi is a revolutionary floor plan creator app that empowers users to create detailed floor plans and BIM models using only their iPhone or iPad. With its cutting-edge AI technology, OpalAi automates the entire process, eliminating the need for manual measurements, note-taking, and furniture removal. Simply scan your space, texture it within the app, and upload the project to receive a complete floor plan in just 10 minutes. OpalAi supports various output formats, including 3D CAD & BIM models, Revit, AutoCAD, Sketchup, Rhino, PDF, and 2020 Design models, with options for textured and colored models. The app's advanced features and capabilities make it an ideal tool for architects, contractors, real estate agents, interior designers, and homeowners alike.
20 - Open Source AI Tools
blinkid-ios
BlinkID iOS is a mobile SDK that enables developers to easily integrate ID scanning and data extraction capabilities into their iOS applications. The SDK supports scanning and processing various types of identity documents, such as passports, driver's licenses, and ID cards. It provides accurate and fast data extraction, including personal information and document details. With BlinkID iOS, developers can enhance their apps with secure and reliable ID verification functionality, improving user experience and streamlining identity verification processes.
sane-airscan
sane-airscan is a SANE backend that supports driverless scanning using Apple AirScan (eSCL) and Microsoft WSD protocols. It automatically chooses between the two protocols and has been tested with various devices from Brother, Canon, Dell, Kyocera, Lexmark, Epson, HP, OKI, Panasonic, Pantum, Ricoh, Samsung, and Xerox. The backend allows for automatic and manual device discovery and configuration, supports scanning from platen and ADF in color and grayscale modes, and works with both IPv4 and IPv6. It does not require installation and does not conflict with vendor-provided proprietary software.
AirSane
AirSane is a SANE frontend and scanner server that supports Apple's AirScan protocol. It automatically detects scanners and publishes them through mDNS. Acquired images can be transferred in JPEG, PNG, and PDF/raster format. The tool is intended to be used with AirScan/eSCL clients such as Apple's Image Capture, sane-airscan on Linux, and the eSCL client built into Windows 10 and 11. It provides a simple web interface and encodes images on-the-fly to keep memory/storage demands low, making it suitable for devices like Raspberry Pi. Authentication and secure communication are supported in conjunction with a proxy server like nginx. AirSane has been reverse-engineered from Apple's AirScanScanner client communication protocol and offers a range of installation and configuration options for different operating systems.
how-to-optim-algorithm-in-cuda
This repository documents how to optimize common algorithms based on CUDA. It includes subdirectories with code implementations for specific optimizations. The optimizations cover topics such as compiling PyTorch from source, NVIDIA's reduce optimization, OneFlow's elementwise template, fast atomic add for half data types, upsample nearest2d optimization in OneFlow, optimized indexing in PyTorch, OneFlow's softmax kernel, linear attention optimization, and more. The repository also includes learning resources related to deep learning frameworks, compilers, and optimization techniques.
receipt-scanner
The receipt-scanner repository is an AI-Powered Receipt and Invoice Scanner for Laravel that allows users to easily extract structured receipt data from images, PDFs, and emails within their Laravel application using OpenAI. It provides a light wrapper around OpenAI Chat and Completion endpoints, supports various input formats, and integrates with Textract for OCR functionality. Users can install the package via composer, publish configuration files, and use it to extract data from plain text, PDFs, images, Word documents, and web content. The scanned receipt data is parsed into a DTO structure with main classes like Receipt, Merchant, and LineItem.
cognita
Cognita is an open-source framework to organize your RAG codebase along with a frontend to play around with different RAG customizations. It provides a simple way to organize your codebase so that it becomes easy to test it locally while also being able to deploy it in a production ready environment. The key issues that arise while productionizing RAG system from a Jupyter Notebook are: 1. **Chunking and Embedding Job** : The chunking and embedding code usually needs to be abstracted out and deployed as a job. Sometimes the job will need to run on a schedule or be trigerred via an event to keep the data updated. 2. **Query Service** : The code that generates the answer from the query needs to be wrapped up in a api server like FastAPI and should be deployed as a service. This service should be able to handle multiple queries at the same time and also autoscale with higher traffic. 3. **LLM / Embedding Model Deployment** : Often times, if we are using open-source models, we load the model in the Jupyter notebook. This will need to be hosted as a separate service in production and model will need to be called as an API. 4. **Vector DB deployment** : Most testing happens on vector DBs in memory or on disk. However, in production, the DBs need to be deployed in a more scalable and reliable way. Cognita makes it really easy to customize and experiment everything about a RAG system and still be able to deploy it in a good way. It also ships with a UI that makes it easier to try out different RAG configurations and see the results in real time. You can use it locally or with/without using any Truefoundry components. However, using Truefoundry components makes it easier to test different models and deploy the system in a scalable way. Cognita allows you to host multiple RAG systems using one app. ### Advantages of using Cognita are: 1. A central reusable repository of parsers, loaders, embedders and retrievers. 2. Ability for non-technical users to play with UI - Upload documents and perform QnA using modules built by the development team. 3. Fully API driven - which allows integration with other systems. > If you use Cognita with Truefoundry AI Gateway, you can get logging, metrics and feedback mechanism for your user queries. ### Features: 1. Support for multiple document retrievers that use `Similarity Search`, `Query Decompostion`, `Document Reranking`, etc 2. Support for SOTA OpenSource embeddings and reranking from `mixedbread-ai` 3. Support for using LLMs using `Ollama` 4. Support for incremental indexing that ingests entire documents in batches (reduces compute burden), keeps track of already indexed documents and prevents re-indexing of those docs.
giskard
Giskard is an open-source Python library that automatically detects performance, bias & security issues in AI applications. The library covers LLM-based applications such as RAG agents, all the way to traditional ML models for tabular data.
data-juicer
Data-Juicer is a one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs. It is a systematic & reusable library of 80+ core OPs, 20+ reusable config recipes, and 20+ feature-rich dedicated toolkits, designed to function independently of specific LLM datasets and processing pipelines. Data-Juicer allows detailed data analyses with an automated report generation feature for a deeper understanding of your dataset. Coupled with multi-dimension automatic evaluation capabilities, it supports a timely feedback loop at multiple stages in the LLM development process. Data-Juicer offers tens of pre-built data processing recipes for pre-training, fine-tuning, en, zh, and more scenarios. It provides a speedy data processing pipeline requiring less memory and CPU usage, optimized for maximum productivity. Data-Juicer is flexible & extensible, accommodating most types of data formats and allowing flexible combinations of OPs. It is designed for simplicity, with comprehensive documentation, easy start guides and demo configs, and intuitive configuration with simple adding/removing OPs from existing configs.
Agently
Agently is a development framework that helps developers build AI agent native application really fast. You can use and build AI agent in your code in an extremely simple way. You can create an AI agent instance then interact with it like calling a function in very few codes like this below. Click the run button below and witness the magic. It's just that simple: python # Import and Init Settings import Agently agent = Agently.create_agent() agent\ .set_settings("current_model", "OpenAI")\ .set_settings("model.OpenAI.auth", {"api_key": ""}) # Interact with the agent instance like calling a function result = agent\ .input("Give me 3 words")\ .output([("String", "one word")])\ .start() print(result) ['apple', 'banana', 'carrot'] And you may notice that when we print the value of `result`, the value is a `list` just like the format of parameter we put into the `.output()`. In Agently framework we've done a lot of work like this to make it easier for application developers to integrate Agent instances into their business code. This will allow application developers to focus on how to build their business logic instead of figure out how to cater to language models or how to keep models satisfied.
InternVL
InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM. It is a vision-language foundation model that can perform various tasks, including: **Visual Perception** - Linear-Probe Image Classification - Semantic Segmentation - Zero-Shot Image Classification - Multilingual Zero-Shot Image Classification - Zero-Shot Video Classification **Cross-Modal Retrieval** - English Zero-Shot Image-Text Retrieval - Chinese Zero-Shot Image-Text Retrieval - Multilingual Zero-Shot Image-Text Retrieval on XTD **Multimodal Dialogue** - Zero-Shot Image Captioning - Multimodal Benchmarks with Frozen LLM - Multimodal Benchmarks with Trainable LLM - Tiny LVLM InternVL has been shown to achieve state-of-the-art results on a variety of benchmarks. For example, on the MMMU image classification benchmark, InternVL achieves a top-1 accuracy of 51.6%, which is higher than GPT-4V and Gemini Pro. On the DocVQA question answering benchmark, InternVL achieves a score of 82.2%, which is also higher than GPT-4V and Gemini Pro. InternVL is open-sourced and available on Hugging Face. It can be used for a variety of applications, including image classification, object detection, semantic segmentation, image captioning, and question answering.
NeMo-Guardrails
NeMo Guardrails is an open-source toolkit for easily adding _programmable guardrails_ to LLM-based conversational applications. Guardrails (or "rails" for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more.
LazyLLM
LazyLLM is a low-code development tool for building complex AI applications with multiple agents. It assists developers in building AI applications at a low cost and continuously optimizing their performance. The tool provides a convenient workflow for application development and offers standard processes and tools for various stages of application development. Users can quickly prototype applications with LazyLLM, analyze bad cases with scenario task data, and iteratively optimize key components to enhance the overall application performance. LazyLLM aims to simplify the AI application development process and provide flexibility for both beginners and experts to create high-quality applications.
AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.
awesome-mobile-robotics
The 'awesome-mobile-robotics' repository is a curated list of important content related to Mobile Robotics and AI. It includes resources such as courses, books, datasets, software and libraries, podcasts, conferences, journals, companies and jobs, laboratories and research groups, and miscellaneous resources. The repository covers a wide range of topics in the field of Mobile Robotics and AI, providing valuable information for enthusiasts, researchers, and professionals in the domain.
MOOSE
MOOSE 2.0 is a leaner, meaner, and stronger tool for 3D medical image segmentation. It is built on the principles of data-centric AI and offers a wide range of segmentation models for both clinical and preclinical settings. MOOSE 2.0 is also versatile, allowing users to use it as a command-line tool for batch processing or as a library package for individual processing in Python projects. With its improved speed, accuracy, and flexibility, MOOSE 2.0 is the go-to tool for segmentation tasks.
Awesome-LLM-Long-Context-Modeling
This repository includes papers and blogs about Efficient Transformers, Length Extrapolation, Long Term Memory, Retrieval Augmented Generation(RAG), and Evaluation for Long Context Modeling.
Awesome-Segment-Anything
Awesome-Segment-Anything is a powerful tool for segmenting and extracting information from various types of data. It provides a user-friendly interface to easily define segmentation rules and apply them to text, images, and other data formats. The tool supports both supervised and unsupervised segmentation methods, allowing users to customize the segmentation process based on their specific needs. With its versatile functionality and intuitive design, Awesome-Segment-Anything is ideal for data analysts, researchers, content creators, and anyone looking to efficiently extract valuable insights from complex datasets.
20 - OpenAI Gpts
DocuScan and Scribe
Scans and transcribes images into documents, offers downloadable copies in a document and offers to translate into different languages
TipCheck Calculator Pro
Effortlessly calculate your tip and total bill with TipCheck Calculator Pro. Simply scan your restaurant or bar receipt, and get instant suggested tip amounts with an accurate breakdown of your total payment. No more guesswork.
Swapzone
Swapzone is a non-custodial instant crypto exchange aggregator that helps users scan the network of registered exchanges globally and gives them a comprehensive list of those that support a particular trading or swap pair.
Manifestation Mentor GPT
Guides entrepreneurs through 'The Power of Manifestation' with AI-enhanced insights. Scan any page in the book to dive deep in the Manifestation Matrix.
IAC Code Guardian
Introducing IAC Code Guardian: Your Trusted IaC Security Expert in Scanning Opentofu, Terrform, AWS Cloudformation, Pulumi, K8s Yaml & Dockerfile
Free Antivirus Software 2024
Free Antivirus Software : Reviews and Best Free Offers for antivirus software to protect you
🛡️ CodeGuardian Pro+ 🛡️
Your AI-powered sentinel for code! Scans for vulnerabilities, offers security tips, and educates on best practices in cybersecurity. 🔍🔐
Ethical Hacking GPT
Guide to ethical hacking, specializing in NMAP | For Educational Purposes Only | CSV Upload Suggested |
ethicallyHackingspace (eHs)® (Full Spectrum)™
Full Spectrum Space Cybersecurity Professional ™ AI-copilot (BETA)
Business Card Digitizer
Simply take a photo of your business cards and upload it to the chat. I'll take it from there!