Best AI tools for Filter Data
20 - AI tool Sites
Tablesmith
Tablesmith is a free, privacy-first, and intuitive spreadsheet automation tool that allows users to build reusable data flows, effortlessly sort, filter, group, format, or split data across files/sheets based on cell values. It is designed to be easy to learn and use, with a focus on privacy and cross-platform compatibility. Tablesmith also offers an AI autofill feature that suggests and fills in information based on the user's prompt.
StockGPT
StockGPT is an AI-powered financial research assistant that provides knowledge of earnings releases, financial reports, and fundamental information for S&P 500 and Nasdaq companies. It offers features like AI search, customizable filters, up-to-date data, industry research, and more to help users analyze companies and markets efficiently.
Unlost
Unlost is a memory recall tool that allows users to instantly retrieve information with zero effort. It functions as a memory palace, eliminating the need for extensive courses or constant note-taking. Unlost intelligently records and understands screen layouts, ensuring privacy by respecting user space and copyright laws. The tool operates locally and offline, with minimal data collection. Users can exclude specific content and enjoy quick access through discreet background operation. Unlost offers powerful filtering capabilities, familiar keyboard shortcuts, and supports searching meeting transcripts. It simplifies text copying from screenshots and aims to enhance memory delegation and exploration of one's capacity.
Zelma
Zelma is an AI-powered research assistant that enables users to find, graph, and understand U.S. school testing data using plain English. It allows users to search student test data by school district, demographics, grade, and more, and presents the data with graphs, tables, and descriptions. Zelma aims to make education data easily accessible and understandable for everyone.
AssetLink
AssetLink is a relationship intelligence platform that leverages contextualized data to connect the private wealth ecosystem. It helps financial advisors and asset managers make informed decisions by facilitating connections between the right buyers and sellers through AI and data-driven insights. The platform aims to streamline the process of matching advisors and asset managers, ultimately enhancing business growth and efficiency.
StartupHub AI
StartupHub AI is a comprehensive platform providing data and tech news related to the AI startup ecosystem. It offers information on startups, funding rounds, investors, events, and more. The platform serves as a hub for AI professionals, investors, and startups, with a focus on the Israeli AI startup scene. Users can access original content, statistics, infographics, and press releases to stay updated on the latest trends and developments in the AI industry.
ChoiceChaser
ChoiceChaser is an AI-powered lead generation tool that helps businesses find and connect with potential customers on social media, forums, and other online platforms. It uses natural language processing and machine learning to identify relevant posts and conversations, and then notifies users when there is a match. ChoiceChaser can help businesses save time and energy by automating the process of lead generation, and it can also help them reach a wider audience of potential customers.
YouTube Comment Finder And AI Analysis
The 'YouTube Comment Finder And AI Analysis' is a web-based tool for searching, filtering, sorting, managing, and analyzing comments on YouTube videos. Its AI-powered comment analysis surfaces sentiment, trending topics, key points, and concise summaries of comments. Features include comment search, filtering, sorting, exporting, and random comment picking, which makes the tool useful for content creators, marketers, and anyone working through large volumes of YouTube comments.
Pongo
Pongo is an AI-powered tool that helps reduce hallucinations in Large Language Models (LLMs) by up to 80%. It utilizes multiple state-of-the-art semantic similarity models and a proprietary ranking algorithm to ensure accurate and relevant search results. Pongo integrates seamlessly with existing pipelines, whether using a vector database or Elasticsearch, and processes top search results to deliver refined and reliable information. Its distributed architecture ensures consistent latency, handling a wide range of requests without compromising speed. Pongo prioritizes data security, operating at runtime with zero data retention and no data leaving its secure AWS VPC.
Sightengine
Sightengine offers content moderation and image analysis products, using APIs to automatically assess, filter, and moderate images, videos, and text. It provides features such as image moderation, video moderation, text moderation, AI image detection, and video anonymization. It helps detect unwanted content, AI-generated images, and personal information in videos. It also offers tools to identify near-duplicates, spam, and abusive links, and to prevent phishing and circumvention attempts. The platform is fast, scalable, accurate, easy to integrate, and privacy compliant, making it suitable for industries such as marketplaces, dating apps, and news platforms.
Angel Match
Angel Match is a comprehensive investor database platform that connects startups with over 110,000 angel investors and venture capitalists. It helps users expand their network, save time searching for funding, and find the perfect investor for their startup. The platform offers features such as investor search and filtering, diverse investor profiles, tracking investor engagements, and up-to-date data to streamline the fundraising process. Angel Match caters to startups across all industries and provides a hub for connecting with investors globally.
Hirebase
Hirebase is an AI-powered job search engine that provides ultra-fresh job market data directly from company pages. It uses AI to scan 100,000 jobs in real time, ensuring that every job listed is actively hiring on the internet. Users can receive email alerts for new job listings based on their preferences for job title, keywords, location, experience level, date posted, salary range, and more. Hirebase aims to 'unsuckify' the job search process by leveraging AI technology to streamline and enhance the job hunting experience.
Ragie
Ragie is a fully managed RAG-as-a-Service platform designed for developers. It offers easy-to-use APIs and SDKs to help developers get started quickly, with advanced features like LLM re-ranking, summary index, entity extraction, flexible filtering, and hybrid semantic and keyword search. Ragie allows users to connect directly to popular data sources like Google Drive, Notion, Confluence, and more, ensuring accurate and reliable information delivery. The platform is led by Craft Ventures and offers seamless data connectivity through connectors. Ragie simplifies the process of data ingestion, chunking, indexing, and retrieval, making it a valuable tool for AI applications.
Kuration AI
Kuration AI is a B2B research AI agent that automates manual B2B research by curating, refining, and enriching lead databases. It offers features for sourcing, curating, and aggregating data points, along with templates and custom AI-powered enrichment. It helps users gather the right data, speed up research, and target relevant companies. It offers a range of pricing plans, ISO 9001 compliance, and a mobile application. The AI agent is used by companies like UBS, Microsoft, and Airbnb, and is built with technologies like MongoDB, Flutter, and Next.js.
Hella Jobs
Hella Jobs is a leading platform for AI, Machine Learning, and Data Science jobs. It connects job seekers with top employers in the field of AI/ML, allowing employers to post open jobs and hire top talent. Job seekers can create profiles, submit resumes, and find new job opportunities. The platform offers features such as job filtering by keywords and location, job category selection, salary range selection, and job type filtering. Hella Jobs aims to streamline the job search process for both employers and job seekers in the AI/ML industry.
Jobs-Scout
Jobs-Scout is an AI-powered job search engine that helps you find your dream job. With Jobs-Scout, you can search for jobs by keyword, location, and industry. You can also filter your search results by salary, experience, and education level. Jobs-Scout also provides personalized job recommendations based on your skills and interests.
Tomat.AI
Tomat.AI is an AI-powered tool designed to help users open and explore large CSV files effortlessly. With features like automated data profiling, merging multiple files, and building reports, Tomat.AI simplifies the process of analyzing and automating Excel and CSV files without the need for coding skills. The tool ensures data security by operating entirely on the user's local machine, offering a user-friendly interface for seamless data manipulation and analysis.
Pinecone
Pinecone is a vector database designed to build knowledgeable AI applications. It offers a serverless platform with high capacity and low cost, enabling users to perform low-latency vector search for various AI tasks. Pinecone is easy to start and scale, allowing users to create an account, upload vector embeddings, and retrieve relevant data quickly. The platform combines vector search with metadata filters and keyword boosting for better application performance. Pinecone is secure, reliable, and cloud-native, making it suitable for powering mission-critical AI applications.
Joby.ai
Joby.ai is an AI-powered job search engine that directly scans 500,000 jobs in real time from company pages. It uses AI technology to find every company and job that is actively hiring on the internet. Users can search for jobs based on various criteria like job title, keywords, location, experience, date posted, salary range, and more. The platform also offers advanced search capabilities, exact keyword search, and the ability to exclude keywords for more precise results. Joby.ai aims to help users find hidden job opportunities that may not be available on traditional job search platforms like LinkedIn or Indeed, ensuring that all listings are current and actively hiring.
AI Toolhouse
AI Toolhouse is a comprehensive AI tools catalog and directory that allows users to explore various categories of AI tools and Generative AI advancements. Users can discover the newest additions, stay updated with daily data updates, and access cutting-edge resources in areas such as General Writing, Art, Code Assistant, SQL, Human Resources, E-Commerce, Productivity, Sales, Image Editing, and Developer Tools. The platform offers a wide range of verified filters to help users find the most suitable tools for their needs.
20 - Open Source AI Tools
letsql
LETSQL is a data processing library built on top of Ibis and DataFusion to write multi-engine data workflows. It is currently in development and does not have a stable release. Users can install LETSQL from PyPI and use it to connect to data sources, read data, filter, group, and aggregate data for analysis. Contributions to the project are welcome, and the library is actively maintained with support available for any issues. LETSQL heavily relies on Ibis and DataFusion for its functionality.
DeepDanbooru
DeepDanbooru is an anime-style girl image tag estimation system written in Python. It allows users to estimate images using a live demo site. The tool requires specific packages to be installed and provides a structured dataset for training projects. Users can create training projects, download tags, filter datasets, and start training to estimate tags for images. The tool uses a specific dataset structure and project structure to facilitate the training process.
db-ally
db-ally is a library for creating natural language interfaces to data sources. It allows developers to outline specific use cases for a large language model (LLM) to handle, detailing the desired data format and the possible operations to fetch this data. db-ally effectively shields the complexity of the underlying data source from the model, presenting only the essential information needed for solving the specific use cases. Instead of generating arbitrary SQL, the model is asked to generate responses in a simplified query language.
markdowner
Markdowner is a fast tool designed to convert any website into LLM-ready markdown data. It aims to improve the quality of responses in the AI app Supermemory by structuring and predicting data in markdown format. The tool offers features such as website conversion, LLM filtering, detailed markdown mode, auto crawler, text and JSON responses, and easy self-hosting. Markdowner utilizes Cloudflare's Browser rendering and Durable objects for browser instance creation and markdown conversion. Users can self-host the project with the Workers paid plan, following simple steps. Support the project by starring the repository.
VMind
VMind is an open-source solution for intelligent visualization, providing an intelligent chart component based on LLM by VisActor. It allows users to create chart narrative works with natural language interaction, edit charts through dialogue, and export narratives as videos or GIFs. The tool is easy to use, scalable, supports various chart types, and offers one-click export functionality. Users can customize chart styles, specify themes, and aggregate data using LLM models. VMind aims to enhance efficiency in creating data visualization works through dialogue-based editing and natural language interaction.
HuatuoGPT-o1
HuatuoGPT-o1 is a medical language model designed for advanced medical reasoning. It can identify mistakes, explore alternative strategies, and refine answers. The model leverages verifiable medical problems and a specialized medical verifier to guide complex reasoning trajectories and enhance reasoning through reinforcement learning. The repository provides access to models, data, and code for HuatuoGPT-o1, allowing users to deploy the model for medical reasoning tasks.
reductstore
ReductStore is a high-performance time series database designed for storing and managing large amounts of unstructured blob data. It offers features such as real-time querying, batching data, and HTTP(S) API for edge computing, computer vision, and IoT applications. The database ensures data integrity, implements retention policies, and provides efficient data access, making it a cost-effective solution for applications requiring unstructured data storage and access at specific time intervals.
data-prep-kit
Data Prep Kit is a community project aimed at democratizing and speeding up unstructured data preparation for LLM app developers. It provides high-level APIs and modules for transforming data (code, language, speech, visual) to optimize LLM performance across different use cases. The toolkit supports Python, Ray, Spark, and Kubeflow Pipelines runtimes, offering scalability from laptop to datacenter-scale processing. Developers can contribute new custom modules and leverage the data processing library for building data pipelines. Automation features include workflow automation with Kubeflow Pipelines for transform execution.
magpie
This is the official repository for 'Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing'. Magpie is a tool designed to synthesize high-quality instruction data at scale by extracting it directly from aligned Large Language Models (LLMs). It aims to democratize AI by generating large-scale alignment data and enhancing the transparency of model alignment processes. Magpie has been tested on various model families and can be used to fine-tune models for improved performance on alignment benchmarks such as AlpacaEval, ArenaHard, and WildBench.
litdata
LitData is a tool designed for blazingly fast, distributed streaming of training data from any cloud storage. It allows users to transform and optimize data in cloud storage environments efficiently and intuitively, supporting various data types like images, text, video, audio, geo-spatial, and multimodal data. LitData integrates smoothly with frameworks such as LitGPT and PyTorch, enabling seamless streaming of data to multiple machines. Key features include multi-GPU/multi-node support, easy data mixing, pause & resume functionality, support for profiling, memory footprint reduction, cache size configuration, and on-prem optimizations. The tool also provides benchmarks for measuring streaming speed and conversion efficiency, along with runnable templates for different data types. LitData enables infinite cloud data processing by utilizing the Lightning.ai platform to scale data processing with optimized machines.
cambrian
Cambrian-1 is a fully open project focused on exploring multimodal Large Language Models (LLMs) with a vision-centric approach. It offers competitive performance across various benchmarks with models at different parameter levels. The project includes training configurations, model weights, instruction tuning data, and evaluation details. Users can interact with Cambrian-1 through a Gradio web interface for inference. The project is inspired by LLaVA and incorporates contributions from Vicuna, LLaMA, and Yi. Cambrian-1 is licensed under Apache 2.0 and utilizes datasets and checkpoints subject to their respective original licenses.
rpaframework
RPA Framework is an open-source collection of libraries and tools for Robotic Process Automation (RPA), designed to be used with Robot Framework and Python. It offers well-documented core libraries for Software Robot Developers, optimized for Robocorp Control Room and Developer Tools, and accepts external contributions. The project includes various libraries for tasks like archiving, browser automation, date/time manipulations, cloud services integration, encryption operations, database interactions, desktop automation, document processing, email operations, Excel manipulation, file system operations, FTP interactions, web API interactions, image manipulation, AI services, and more. The development of the repository is Python-based and requires Python version 3.8+, with tooling based on poetry and invoke for compiling, building, and running the package. The project is licensed under the Apache License 2.0.
taipy
Taipy is an open-source Python library for easy, end-to-end application development, featuring what-if analyses, smart pipeline execution, built-in scheduling, and deployment tools.
llm-course
The LLM course is divided into three parts: (1) LLM Fundamentals covers essential knowledge about mathematics, Python, and neural networks; (2) The LLM Scientist focuses on building the best possible LLMs using the latest techniques; (3) The LLM Engineer focuses on creating LLM-based applications and deploying them. For an interactive version of the course, there are two LLM assistants that answer questions and test your knowledge in a personalized way: a HuggingChat Assistant (free version using Mixtral-8x7B) and a ChatGPT Assistant (requires a premium account). The repository also lists notebooks and articles related to large language models, including these tools: LLM AutoEval (automatically evaluate your LLMs using RunPod), LazyMergekit (easily merge models using MergeKit in one click), LazyAxolotl (fine-tune models in the cloud using Axolotl in one click), AutoQuant (quantize LLMs in GGUF, GPTQ, EXL2, AWQ, and HQQ formats in one click), Model Family Tree (visualize the family tree of merged models), and ZeroSpace (automatically create a Gradio chat interface using a free ZeroGPU).
WebRL
WebRL is a self-evolving online curriculum learning framework designed for training web agents in the WebArena environment. It provides model checkpoints, training instructions, and evaluation processes for training the actor and critic models. The tool enables users to generate new instructions and interact with WebArena to configure tasks for training and evaluation.
LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing
LLM-PowerHouse is a comprehensive and curated guide designed to empower developers, researchers, and enthusiasts to harness the true capabilities of Large Language Models (LLMs) and build intelligent applications that push the boundaries of natural language understanding. This GitHub repository provides in-depth articles, codebase mastery, LLM PlayLab, and resources for cost analysis and network visualization. It covers various aspects of LLMs, including NLP, models, training, evaluation metrics, open LLMs, and more. The repository also includes a collection of code examples and tutorials to help users build and deploy LLM-based applications.
ruby-openai
Use the OpenAI API with Ruby! Stream text with GPT-4, transcribe and translate audio with Whisper, or create images with DALL·E. The README covers installation (Bundler, gem install), usage (quickstart, configuration, custom timeout or base URI, extra headers per client, logging, errors, Faraday middleware, Azure, Ollama), counting tokens, models, and examples including chat, streaming chat, vision, JSON mode, functions, edits, embeddings, batches, files, finetunes, assistants, threads and messages, runs, runs involving function tools, image generation (DALL·E 2, DALL·E 3, image edit, image variations), moderations, Whisper (translate, transcribe), and speech.
llm-datasets
LLM Datasets is a repository containing high-quality datasets, tools, and concepts for LLM fine-tuning. It provides datasets with characteristics like accuracy, diversity, and complexity to train large language models for various tasks. The repository includes datasets for general-purpose, math & logic, code, conversation & role-play, and agent & function calling domains. It also offers guidance on creating high-quality datasets through data deduplication, data quality assessment, data exploration, and data generation techniques.
Chinese-Tiny-LLM
Chinese-Tiny-LLM is a repository containing procedures for cleaning Chinese web corpora and pre-training code. It introduces CT-LLM, a 2B parameter language model focused on the Chinese language. The model primarily uses Chinese data from a 1,200 billion token corpus, showing excellent performance in Chinese language tasks. The repository includes tools for filtering, deduplication, and pre-training, aiming to encourage further research and innovation in language model development.
20 - OpenAI GPTs
Signal Processing Advisor
Provides expert guidance on signal processing in engineering projects.
Prompt Injection Detector
GPT used to classify prompts as valid inputs or injection attempts. JSON output.
Form Filler
Expert in populating Word .docx forms with data from other documents, prioritizing accuracy and formal communication.
ChromaSpectra Filter Creator
Merge a holographic shimmer with RGB splitting for a surreal, digital-art look.
Air Purifier Servicer Assistant
Hello, I'm Air Purifier Servicer Assistant! What would you like help with today?
Photo Mentor
Upload a photo! I will provide clear, concise photo analysis and improvement advice.
South Parkify
Transform any photo into a visually stunning South Park moment with just a few clicks.