Best AI tools for Filter Data
20 - AI tool Sites
Tablesmith
Tablesmith is a free, privacy-first, and intuitive spreadsheet automation tool that allows users to build reusable data flows, effortlessly sort, filter, group, format, or split data across files/sheets based on cell values. It is designed to be easy to learn and use, with a focus on privacy and cross-platform compatibility. Tablesmith also offers an AI autofill feature that suggests and fills in information based on the user's prompt.
StockGPT
StockGPT is an AI-powered financial research assistant that provides knowledge of earnings releases, financial reports, and fundamental information for S&P 500 and Nasdaq companies. It offers features like AI search, customizable filters, up-to-date data, industry research, and more to help users analyze companies and markets effectively.
Unlost
Unlost is a memory recall tool designed to help users effortlessly remember and retrieve information using natural language. It acts as a personal memory palace, eliminating the need for extensive note-taking or complex systems. Unlost intelligently records and organizes data, respecting user privacy by capturing content locally and offline. The tool offers quick access, powerful filtering capabilities, and familiar keyboard shortcuts for a seamless user experience. With features like searching meeting transcripts, copying text from screenshots, and zero integration requirements, Unlost aims to simplify information retrieval and enhance productivity.
Zelma
Zelma is an AI-powered research assistant that enables users to find, graph, and understand U.S. school testing data using plain English queries. It allows users to search student test data by school district, demographics, grade, and more, and presents the results with graphs, tables, and descriptions. Zelma aims to make education data accessible and understandable for everyone.
AssetLink
AssetLink is a relationship intelligence platform that leverages contextualized data to connect the private wealth ecosystem. It helps financial advisors and asset managers make informed decisions by facilitating connections between the right buyers and sellers through AI and data-driven insights. The platform aims to streamline the process of matching advisors and asset managers, ultimately enhancing business growth and efficiency.
StartupHub AI
StartupHub AI is a comprehensive platform providing data and tech news related to the AI startup ecosystem. It offers information on startups, funding rounds, investors, events, and more. The platform serves as a hub for AI professionals, investors, and startups, with a focus on the Israeli AI startup scene. Users can access original content, statistics, infographics, and press releases to stay updated on the latest trends and developments in the AI industry.
ChoiceChaser
ChoiceChaser is an AI-powered lead generation tool that helps businesses find and connect with potential customers on social media, forums, and other online platforms. It uses natural language processing and machine learning to identify relevant posts and conversations, and then notifies users when there is a match. ChoiceChaser can help businesses save time and energy by automating the process of lead generation, and it can also help them reach a wider audience of potential customers.
YouTube Comment Finder And AI Analysis
The 'YouTube Comment Finder And AI Analysis' is a web-based tool for searching, filtering, sorting, managing, and analyzing comments on YouTube videos. Its AI-powered comment analysis surfaces sentiment, trending topics, key points, and concise summaries of comments. The tool also supports exporting comments and picking a random comment, making it useful for content creators, marketers, and individuals who need to work through large volumes of comments on YouTube videos.
Pongo
Pongo is an AI-powered tool that helps reduce hallucinations in Large Language Models (LLMs) by up to 80%. It utilizes multiple state-of-the-art semantic similarity models and a proprietary ranking algorithm to ensure accurate and relevant search results. Pongo integrates seamlessly with existing pipelines, whether using a vector database or Elasticsearch, and processes top search results to deliver refined and reliable information. Its distributed architecture ensures consistent latency, handling a wide range of requests without compromising speed. Pongo prioritizes data security, operating at runtime with zero data retention and no data leaving its secure AWS VPC.
Hoop.dev
Hoop.dev is an AI application that provides live AI data masking in Rails console sessions. It shields Rails console access, automates employee onboarding and offboarding, and protects customer data with a plug-and-play PII filter. The application enables compliant access without slowing teams down, automates HIPAA, SOC 1/2, PCI, GDPR, and other security controls, and reduces Rails console use by finding repeated operations and turning Ruby scripts into repeatable no-code UIs.
Sightengine
Sightengine offers content moderation and image analysis products, using APIs to automatically assess, filter, and moderate images, videos, and text. It provides features such as image moderation, video moderation, text moderation, AI image detection, and video anonymization. The application helps detect unwanted content, AI-generated images, and personal information in videos. It also offers tools to identify near-duplicates, spam, and abusive links, and to prevent phishing and circumvention attempts. The platform is described as fast, scalable, accurate, easy to integrate, and privacy compliant, making it suitable for industries such as marketplaces, dating apps, and news platforms.
Angel Match
Angel Match is a comprehensive investor database platform that connects startups with over 110,000 angel investors and venture capitalists. It offers features such as fundraising templates, investor outreach tools, and pitch deck database. Users can search, filter, and track investor engagements, saving time and expanding their network. The platform provides diverse investor profiles, up-to-date data, and industry-specific matching to help startups find the right investors for their business.
Hirebase
Hirebase is an AI-powered job search engine that provides ultra-fresh job market data directly from company pages. It uses AI to scan 100,000 jobs in real-time, ensuring that every job listed is actively hiring on the internet. Users can receive email alerts for new job listings based on their preferences for job title, keywords, location, experience level, date posted, salary range, and more. Hirebase aims to 'unsuckify' the job search process by leveraging AI technology to streamline and enhance the job hunting experience.
Ragie
Ragie is a fully managed RAG-as-a-Service platform designed for developers. It offers easy-to-use APIs and SDKs to help developers get started quickly, with advanced features like LLM re-ranking, summary index, entity extraction, flexible filtering, and hybrid semantic and keyword search. Ragie allows users to connect directly to popular data sources like Google Drive, Notion, Confluence, and more, ensuring accurate and reliable information delivery. The platform is led by Craft Ventures and offers seamless data connectivity through connectors. Ragie simplifies the process of data ingestion, chunking, indexing, and retrieval, making it a valuable tool for AI applications.
Kuration AI
Kuration AI is a B2B research AI agent that automates manual B2B research by curating, refining, and enriching lead databases. It offers features for sourcing, curating, and aggregating data points, along with templates and custom AI-powered enrichment. The application helps users gather the right data, speed up research, and target relevant companies. It provides a range of pricing plans, compliance with ISO 9001, and a mobile application. The AI agent is used by companies like UBS, Microsoft, and Airbnb, and utilizes technologies like MongoDB, Flutter, and Next.js.
Hella Jobs
Hella Jobs is a leading platform for AI, Machine Learning, and Data Science jobs. It connects job seekers with top employers in the field of AI/ML, allowing employers to post open jobs and hire top talent. Job seekers can create profiles, submit resumes, and find new job opportunities. The platform offers features such as job filtering by keywords and location, job category selection, salary range selection, and job type filtering. Hella Jobs aims to streamline the job search process for both employers and job seekers in the AI/ML industry.
Jobs-Scout
Jobs-Scout is an AI-powered job search engine that helps you find your dream job. With Jobs-Scout, you can search for jobs by keyword, location, and industry. You can also filter your search results by salary, experience, and education level. Jobs-Scout also provides personalized job recommendations based on your skills and interests.
Tomat.AI
Tomat.AI is an AI-powered tool designed to help users open and explore large CSV files effortlessly. With features like automated data profiling, merging multiple files, and building reports, Tomat.AI simplifies the process of analyzing and automating Excel and CSV files without the need for coding skills. The tool ensures data security by operating entirely on the user's local machine, offering a user-friendly interface for seamless data manipulation and analysis.
Pinecone
Pinecone is a vector database designed to build knowledgeable AI applications. It offers a serverless platform with high capacity and low cost, enabling users to perform low-latency vector search for various AI tasks. Pinecone is easy to start and scale, allowing users to create an account, upload vector embeddings, and retrieve relevant data quickly. The platform combines vector search with metadata filters and keyword boosting for better application performance. Pinecone is secure, reliable, and cloud-native, making it suitable for powering mission-critical AI applications.
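To illustrate what combining vector search with metadata filters means in general, here is a minimal pure-Python sketch. It does not use Pinecone's client or API; the `records` layout, the `filtered_search` function, and the exact-match `where` filter are all hypothetical choices made for this example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def filtered_search(records, query, where, top_k=2):
    """Keep records whose metadata matches every key in `where`,
    then rank the survivors by similarity to `query`."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in where.items())
    ]
    ranked = sorted(candidates, key=lambda r: cosine(r["vector"], query), reverse=True)
    return [r["id"] for r in ranked[:top_k]]

# Hypothetical toy data: two-dimensional embeddings with a language tag.
records = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"lang": "en"}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"lang": "de"}},
    {"id": "c", "vector": [0.0, 1.0], "metadata": {"lang": "en"}},
]

print(filtered_search(records, [1.0, 0.0], {"lang": "en"}))
```

A production vector database would use an approximate nearest-neighbor index rather than scoring every record, but the filter-then-rank idea is the same.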
Joby.ai
Joby.ai is an AI-powered job search engine that directly scans 500,000 jobs in real-time from company pages. It uses AI technology to find every company and job that is actively hiring on the internet. Users can search for jobs based on various criteria like job title, keywords, location, experience, date posted, salary range, and more. The platform also offers advanced search capabilities, exact keyword search, and the ability to exclude keywords for more precise results. Joby.ai aims to help users find hidden job opportunities that may not be available on traditional job search platforms like LinkedIn or Indeed, ensuring that all listings are current and actively hiring.
20 - Open Source AI Tools
DeepDanbooru
DeepDanbooru is an anime-style girl image tag estimation system written in Python. Users can estimate tags for images on a live demo site. The tool requires specific packages to be installed and expects a structured dataset for training projects. Users can create training projects, download tags, filter datasets, and start training to estimate tags for images. The tool uses a specific dataset structure and project structure to facilitate the training process.
db-ally
db-ally is a library for creating natural language interfaces to data sources. It allows developers to outline specific use cases for a large language model (LLM) to handle, detailing the desired data format and the possible operations to fetch this data. db-ally effectively shields the complexity of the underlying data source from the model, presenting only the essential information needed for solving the specific use cases. Instead of generating arbitrary SQL, the model is asked to generate responses in a simplified query language.
markdowner
Markdowner is a fast tool designed to convert any website into LLM-ready markdown data. It aims to improve the quality of responses in the AI app Supermemory by structuring and predicting data in markdown format. The tool offers features such as website conversion, LLM filtering, detailed markdown mode, auto crawler, text and JSON responses, and easy self-hosting. Markdowner utilizes Cloudflare's Browser Rendering and Durable Objects for browser instance creation and markdown conversion. Users can self-host the project with the Workers paid plan, following simple steps.
VMind
VMind is an open-source solution for intelligent visualization, providing an intelligent chart component based on LLM by VisActor. It allows users to create chart narrative works with natural language interaction, edit charts through dialogue, and export narratives as videos or GIFs. The tool is easy to use, scalable, supports various chart types, and offers one-click export functionality. Users can customize chart styles, specify themes, and aggregate data using LLM models. VMind aims to enhance efficiency in creating data visualization works through dialogue-based editing and natural language interaction.
reductstore
ReductStore is a high-performance time series database designed for storing and managing large amounts of unstructured blob data. It offers features such as real-time querying, batching data, and HTTP(S) API for edge computing, computer vision, and IoT applications. The database ensures data integrity, implements retention policies, and provides efficient data access, making it a cost-effective solution for applications requiring unstructured data storage and access at specific time intervals.
data-prep-kit
Data Prep Kit is a community project aimed at democratizing and speeding up unstructured data preparation for LLM app developers. It provides high-level APIs and modules for transforming data (code, language, speech, visual) to optimize LLM performance across different use cases. The toolkit supports Python, Ray, Spark, and Kubeflow Pipelines runtimes, offering scalability from laptop to datacenter-scale processing. Developers can contribute new custom modules and leverage the data processing library for building data pipelines. Automation features include workflow automation with Kubeflow Pipelines for transform execution.
magpie
This is the official repository for 'Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing'. Magpie is a tool designed to synthesize high-quality instruction data at scale by extracting it directly from aligned Large Language Models (LLMs). It aims to democratize AI by generating large-scale alignment data and enhancing the transparency of model alignment processes. Magpie has been tested on various model families and can be used to fine-tune models for improved performance on alignment benchmarks such as AlpacaEval, ArenaHard, and WildBench.
cambrian
Cambrian-1 is a fully open project focused on exploring multimodal Large Language Models (LLMs) with a vision-centric approach. It offers competitive performance across various benchmarks with models at different parameter levels. The project includes training configurations, model weights, instruction tuning data, and evaluation details. Users can interact with Cambrian-1 through a Gradio web interface for inference. The project is inspired by LLaVA and incorporates contributions from Vicuna, LLaMA, and Yi. Cambrian-1 is licensed under Apache 2.0 and utilizes datasets and checkpoints subject to their respective original licenses.
rpaframework
RPA Framework is an open-source collection of libraries and tools for Robotic Process Automation (RPA), designed to be used with Robot Framework and Python. It offers well-documented core libraries for Software Robot Developers, optimized for Robocorp Control Room and Developer Tools, and accepts external contributions. The project includes various libraries for tasks like archiving, browser automation, date/time manipulations, cloud services integration, encryption operations, database interactions, desktop automation, document processing, email operations, Excel manipulation, file system operations, FTP interactions, web API interactions, image manipulation, AI services, and more. The development of the repository is Python-based and requires Python version 3.8+, with tooling based on poetry and invoke for compiling, building, and running the package. The project is licensed under the Apache License 2.0.
taipy
Taipy is an open-source Python library for easy, end-to-end application development, featuring what-if analyses, smart pipeline execution, built-in scheduling, and deployment tools.
llm-course
The LLM course is divided into three parts: LLM Fundamentals covers essential knowledge about mathematics, Python, and neural networks; The LLM Scientist focuses on building the best possible LLMs using the latest techniques; and The LLM Engineer focuses on creating LLM-based applications and deploying them. An interactive version of the course is available through two LLM assistants that answer questions and test your knowledge in a personalized way: a HuggingChat Assistant (free version using Mixtral-8x7B) and a ChatGPT Assistant (requires a premium account). The repository also lists notebooks and articles related to large language models, including these tools: LLM AutoEval (automatically evaluate your LLMs using RunPod), LazyMergekit (easily merge models using MergeKit in one click), LazyAxolotl (fine-tune models in the cloud using Axolotl in one click), AutoQuant (quantize LLMs in GGUF, GPTQ, EXL2, AWQ, and HQQ formats in one click), Model Family Tree (visualize the family tree of merged models), and ZeroSpace (automatically create a Gradio chat interface using a free ZeroGPU).
LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing
LLM-PowerHouse is a comprehensive and curated guide designed to empower developers, researchers, and enthusiasts to harness the true capabilities of Large Language Models (LLMs) and build intelligent applications that push the boundaries of natural language understanding. This GitHub repository provides in-depth articles, codebase mastery, LLM PlayLab, and resources for cost analysis and network visualization. It covers various aspects of LLMs, including NLP, models, training, evaluation metrics, open LLMs, and more. The repository also includes a collection of code examples and tutorials to help users build and deploy LLM-based applications.
ruby-openai
Use the OpenAI API with Ruby! Stream text with GPT-4, transcribe and translate audio with Whisper, or create images with DALL·E... The README covers installation (Bundler, gem install), usage (quickstart, configuration, custom timeout or base URI, extra headers per client, logging, errors, Faraday middleware, Azure, Ollama), counting tokens, models, and examples for chat, streaming chat, vision, JSON mode, functions, edits, embeddings, batches, files, fine-tunes, assistants, threads and messages, runs (including runs involving function tools), image generation (DALL·E 2, DALL·E 3, image edit, image variations), moderations, and Whisper (translate, transcribe, speech), followed by sections on errors, development, release, contributing, license, and code of conduct.
llm-datasets
LLM Datasets is a repository containing high-quality datasets, tools, and concepts for LLM fine-tuning. It provides datasets with characteristics like accuracy, diversity, and complexity to train large language models for various tasks. The repository includes datasets for general-purpose, math & logic, code, conversation & role-play, and agent & function calling domains. It also offers guidance on creating high-quality datasets through data deduplication, data quality assessment, data exploration, and data generation techniques.
Chinese-Tiny-LLM
Chinese-Tiny-LLM is a repository containing procedures for cleaning Chinese web corpora and pre-training code. It introduces CT-LLM, a 2B parameter language model focused on the Chinese language. The model primarily uses Chinese data from a 1,200 billion token corpus, showing excellent performance in Chinese language tasks. The repository includes tools for filtering, deduplication, and pre-training, aiming to encourage further research and innovation in language model development.
FedLLM-Bench
FedLLM-Bench is a realistic benchmark for the Federated Learning of Large Language Models community. It includes datasets for federated instruction tuning and preference alignment tasks, exhibiting diversities in language, quality, quantity, instruction, sequence length, embedding, and preference. The repository provides training scripts and code for open-ended evaluation, aiming to facilitate research and development in federated learning of large language models.
llm-rag-workshop
The LLM RAG Workshop repository provides a workshop on using Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to generate and understand text in a human-like manner. It includes instructions on setting up the environment, indexing Zoomcamp FAQ documents, creating a Q&A system, and using OpenAI for generation based on retrieved information. The repository focuses on enhancing language model responses with retrieved information from external sources, such as document databases or search engines, to improve factual accuracy and relevance of generated text.
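The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is not the workshop's code: the `retrieve` and `build_prompt` functions, the keyword-overlap scoring, and the sample FAQ entries are hypothetical stand-ins, and the final call to a language model is left out.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of whitespace-separated words."""
    return set(text.lower().split())

def retrieve(question, docs, top_k=1):
    """Rank FAQ entries by how many words they share with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(
        docs,
        key=lambda d: len(q_tokens & tokenize(d["question"] + " " + d["answer"])),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, context_docs):
    """Assemble a prompt that asks the model to answer from retrieved context only."""
    context = "\n".join(f"Q: {d['question']}\nA: {d['answer']}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical FAQ entries standing in for the indexed documents.
faq = [
    {"question": "Can I join late", "answer": "Yes you can still join"},
    {"question": "What is the refund policy", "answer": "No refunds"},
]

question = "can i still join the course"
prompt = build_prompt(question, retrieve(question, faq))
print(prompt)
```

In a real pipeline the keyword overlap would typically be replaced by a search index or embedding similarity, and `prompt` would be sent to the generation model.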
EVE
EVE is the official PyTorch implementation of Unveiling Encoder-Free Vision-Language Models. The project explores removing vision encoders from Vision-Language Models (VLMs) and efficiently transferring LLMs to encoder-free VLMs. It also focuses on bridging the performance gap between encoder-free and encoder-based VLMs. EVE handles arbitrary image aspect ratios, achieves data efficiency by using publicly available data for pre-training, and achieves training efficiency through a transparent and practical strategy for developing a pure decoder-only architecture across modalities.
home-gallery
Home-Gallery.org is a self-hosted open-source web gallery for browsing personal photos and videos, with tagging, a mobile-friendly interface, and AI-powered image and face discovery. It aims to provide a fast user experience on mobile phones and help users browse and rediscover memories from their media archive. The tool allows users to serve their local data without relying on cloud services, view photos and videos from mobile phones, and manage images from multiple media source directories. Features include an endless photo stream, video transcoding, reverse image lookup, face detection, reverse geolocation lookups, tagging, and more. The tool runs on Node.js and supports platforms such as Linux, Mac, and Windows.
20 - OpenAI Gpts
Signal Processing Advisor
Provides expert guidance on signal processing in engineering projects.
Prompt Injection Detector
GPT used to classify prompts as valid inputs or injection attempts. JSON output.
Form Filler
Expert in populating Word .docx forms with data from other documents, prioritizing accuracy and formal communication.
ChromaSpectra Filter Creator
Merge a holographic shimmer with RGB splitting for a surreal, digital-art look.
Air Purifier Servicer Assistant
Hello I'm Air Purifier Servicer Assistant! What would you like help with today?
Photo Mentor
Upload photo! I will provide clear, concise photo analysis and improvement advice.
South Parkify
Transform any photo into a visually stunning South Park moment with just a few clicks.