Best AI tools for: Find Open Data
20 - AI tool Sites

OpenBuckets
OpenBuckets is a web application designed to help users find and secure open buckets in cloud storage systems. It provides a user-friendly interface for scanning and identifying publicly accessible buckets, allowing users to take necessary actions to secure their data. With OpenBuckets, users can easily detect potential security risks and protect their sensitive information stored in cloud storage. The application is a valuable tool for individuals and organizations looking to enhance their data security measures in the cloud.

Open Knowledge Maps
Open Knowledge Maps is the world's largest AI-based search engine for scientific knowledge. It aims to revolutionize discovery by increasing the visibility of research findings for science and society. The platform is open and nonprofit, based on the principles of open science, with a mission to create an inclusive, sustainable, and equitable infrastructure for all users. Users can map research topics with AI, find documents, and identify concepts to enhance their literature search experience.

Aim
Aim is an open-source, self-hosted AI metadata tracking tool designed to handle hundreds of thousands of tracked metadata sequences. The two best-known applications of AI metadata are experiment tracking and prompt engineering. Aim provides a performant, polished UI for exploring and comparing training runs and prompt sessions.

arXiv
arXiv.org is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.

Talentscreener
Talentscreener is an AI-powered talent assessment platform that helps businesses find the best candidates for their open positions. The platform uses a variety of AI algorithms to assess candidates' skills, experience, and personality, and then provides businesses with a ranked list of the most qualified candidates. Talentscreener also offers a variety of other features, such as job posting, candidate management, and reporting.

Voxel51
Voxel51 is an AI tool that provides open-source computer vision tools for machine learning. It offers solutions for various industries such as agriculture, aviation, driving, healthcare, manufacturing, retail, robotics, and security. Voxel51's main product, FiftyOne, helps users explore, visualize, and curate visual data to improve model performance and accelerate the development of visual AI applications. The platform is trusted by thousands of users and companies, offering both open-source and enterprise-ready solutions to manage and refine data and models for visual AI.

Code Snippets AI
Code Snippets AI is an AI-powered code snippets library for teams. It helps developers master their codebase with contextually rich AI chats, integrated with a secure code snippets library. Developers can build new features, fix bugs, add comments, and understand their codebase with the help of Code Snippets AI. The tool is trusted by the best development teams and helps developers code smarter than ever. With Code Snippets AI, developers can leverage the power of a codebase-aware assistant that helps them write clean, performance-optimized code. They can also create documentation, refactor, debug, and generate code with full codebase context. This helps developers spend more time creating code and less time debugging errors.

Fiber AI
Fiber AI is an AI-powered platform that specializes in automated SDR & BDR prospecting and outbound sales workflows. The platform helps businesses improve email deliverability rates, avoid common pitfalls that lead to emails being marked as spam, and connect with prospects in a personalized manner. Fiber AI enables users to target companies with hyper-precision, identify ideal prospects within those companies, and find verified contact information. The platform uses data from various providers to streamline the sales process and enhance outreach strategies.

Sku Fetch
Sku Fetch is a powerful tool that helps you fetch, prepare, and list product information from hundreds of suppliers. It provides multiple free templates, helps you find keywords, and can even add UPCs to your products. With Sku Fetch, you can also analyze your competition, add reviews to your listings, and process multiple products with preset settings. Plus, it supports multiple listers, such as Wise Lister, Crazy Lister, eBay Selling Manager, Ink Frog, Shopify, and others.

SeekOut
SeekOut is an AI-powered platform designed to help organizations find the right candidates for open roles, develop their teams, and improve company culture. It offers features such as external talent sourcing, applicant review, pipeline insights, internal talent development, career compass, and talent intelligence. SeekOut is trusted by over 1,000 leading brands to recruit hard-to-find, diverse talent and manage talent acquisition and management in one platform. The platform integrates external data with HR systems to automatically build comprehensive profiles and provides data-driven insights to understand talent needs and prepare for the future.

AI Jobs
AI Jobs is a curated list of the best AI jobs for developers, designers and marketers. It provides a platform for companies to post their AI-related job openings and for job seekers to find their dream AI job. The website also includes a blog with articles on the latest AI trends and technologies.

Hella Jobs
Hella Jobs is a leading platform for AI, Machine Learning, and Data Science jobs. It connects job seekers with top employers in the field of AI/ML, allowing employers to post open jobs and hire top talent. Job seekers can create profiles, submit resumes, and find new job opportunities. The platform offers features such as job filtering by keywords and location, job category selection, salary range selection, and job type filtering. Hella Jobs aims to streamline the job search process for both employers and job seekers in the AI/ML industry.

AI Web Page Analyzer
AI Web Page Analyzer is a free and open-source tool that helps you analyze web pages for SEO. It can check content, keywords, structure, and metatags, and provide recommendations for improving your website's SEO. AI Web Page Analyzer also includes a number of other features, such as SEO optimization, keyword extraction, and content generation.

Wikidata
Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. It acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others. Wikidata also provides support to many other sites and services beyond just Wikimedia projects!

Summarize Paper .com
Summarize Paper .com is an open-source AI tool that provides concise, understandable, and insightful summaries of the latest research articles on arXiv. The tool uses AI to generate key points and layman's summaries of research papers, making it easy for users to stay up-to-date with the latest developments in their field. In addition to its summary service, Summarize Paper .com also offers an AI assistant that can answer questions about arXiv papers. The tool is designed to make it easy for researchers, students, journalists, and anyone else who wants to stay informed about the latest research to access and understand the latest findings.

Unfetch
Unfetch is an online IDE that enables users to generate, deploy, and run AI agents to automate various tasks. It combines coding capabilities with an online deployment platform, making it easy to create AI agents. Unfetch agents are designed specifically for AI tasks and are compatible with tools like the OpenAI GPT Store and LangChain. Users can build and deploy AI agents to solve a wide range of tasks efficiently.

Healthee
Healthee is an AI-powered employee benefits app that simplifies healthcare navigation for employees and stakeholders. It provides personalized answers to healthcare queries, streamlines open enrollment processes, and offers real-time insights and data-driven preventive care recommendations. With Healthee, employees can access vital health plan information anytime through a user-friendly mobile app.

DocsAI
DocsAI is an AI-powered document companion that helps you organize, search, and chat with your documents. It integrates with various sources, including websites, text files, PDFs, Docx, Notion, and Confluence. You can customize the companion's appearance to match your brand and suggest better answers to improve its accuracy. DocsAI also offers a chat widget that can be embedded on any website, allowing you to chat with your documents and get summaries, insights, and leads. It is mobile and tablet-friendly, and you can export chats and analyze data to identify trends and improve customer satisfaction. DocsAI is open source and offers custom prompts and multi-language support.

Bearly
Bearly is an AI-powered tool that enhances your workflow by providing advanced AI capabilities. It integrates seamlessly with your existing workflow, allowing you to read, write, and create content with ease. With Bearly, you can interact with documents, analyze and ask questions, transcribe audio and video, access real-time web information, and generate meeting minutes. Its open AI platform provides access to various AI models, ensuring you find the perfect fit for your needs. Bearly prioritizes security, with zero logging, chat and document encryption, and a secure infrastructure to safeguard your data.

Sprockets
Sprockets is an AI-powered hiring software designed to help businesses overcome today's unique hiring challenges. It automates manual tasks, reduces employee turnover, and helps businesses hire the best workers every time. Sprockets offers a range of features, including a virtual recruiter, sourcing, screening, applicant tracking, reporting, time to hire, background checks, and tax credits. It also integrates with a variety of other HR systems, making it easy to use alongside your existing tools. With Sprockets, businesses can improve their hiring process, save time and money, and find the best talent for their open positions.

20 - Open Source AI Tools

AutoWebGLM
AutoWebGLM is a project focused on developing a language model-driven automated web navigation agent. It extends the capabilities of the ChatGLM3-6B model to navigate the web more efficiently and address real-world browsing challenges. The project includes features such as an HTML simplification algorithm, hybrid human-AI training, reinforcement learning, rejection sampling, and a bilingual web navigation benchmark for testing AI web navigation agents.

hallucination-leaderboard
This leaderboard evaluates the hallucination rate of various Large Language Models (LLMs) when summarizing documents. It uses a model trained by Vectara to detect hallucinations in LLM outputs. The leaderboard includes models from OpenAI, Anthropic, Google, Microsoft, Amazon, and others. The evaluation is based on 831 documents that were summarized by all the models. The leaderboard shows the hallucination rate, factual consistency rate, answer rate, and average summary length for each model.
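The four reported columns can be read as simple aggregates over per-document results. The following is a minimal sketch of that arithmetic, not the leaderboard's actual code: the `SummaryResult` record, its field names, and the choice to compute rates over answered documents only and to measure length in words are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SummaryResult:
    """Hypothetical per-document record (not the leaderboard's schema)."""
    summary: Optional[str]   # None when the model declined to summarize
    is_consistent: bool      # detector verdict; ignored when summary is None


def leaderboard_metrics(results):
    """Aggregate per-document results into the four leaderboard-style columns.

    Assumption: consistency and hallucination rates are computed over
    answered documents only, and all rates are percentages.
    """
    total = len(results)
    answered = [r for r in results if r.summary is not None]
    if total == 0 or not answered:
        raise ValueError("need at least one answered document")

    consistent = sum(1 for r in answered if r.is_consistent)
    factual_consistency = 100.0 * consistent / len(answered)
    avg_length = sum(len(r.summary.split()) for r in answered) / len(answered)

    return {
        "answer_rate": 100.0 * len(answered) / total,
        "factual_consistency_rate": factual_consistency,
        "hallucination_rate": 100.0 - factual_consistency,
        "avg_summary_length_words": avg_length,
    }


if __name__ == "__main__":
    demo = [
        SummaryResult("The report covers rainfall trends.", True),
        SummaryResult("The report predicts a drought in 2030.", False),
        SummaryResult(None, False),  # model refused to answer
    ]
    for name, value in leaderboard_metrics(demo).items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the hallucination rate and the factual consistency rate always sum to 100, so either column can be derived from the other.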

Open_Data_QnA
Open Data QnA is a Python library that allows users to interact with their PostgreSQL or BigQuery databases in a conversational manner, without needing to write SQL queries. The library leverages Large Language Models (LLMs) to bridge the gap between human language and database queries, enabling users to ask questions in natural language and receive informative responses. It offers features such as conversational querying with multi-turn support, table grouping, multi-schema/dataset support, SQL generation, query refinement, natural language responses, visualizations, and extensibility. The library is built on a modular design and supports various components like Database Connectors, Vector Stores, and Agents for SQL generation, validation, debugging, descriptions, embeddings, responses, and visualizations.

upgini
Upgini is an intelligent data search engine with a Python library that helps users find and add relevant features to their ML pipeline from various public, community, and premium external data sources. It automates the optimization of connected data sources by generating an optimal set of machine learning features using large language models, GraphNNs, and recurrent neural networks. The tool aims to simplify feature search and enrichment for external data to make it a standard approach in machine learning pipelines. It democratizes access to data sources for the data science community.

cia
CIA is a powerful open-source tool designed for data analysis and visualization. It provides a user-friendly interface for processing large datasets and generating insightful reports. With CIA, users can easily explore data, perform statistical analysis, and create interactive visualizations to communicate findings effectively. Whether you are a data scientist, analyst, or researcher, CIA offers a comprehensive set of features to streamline your data analysis workflow and uncover valuable insights.

lantern
Lantern is an open-source PostgreSQL database extension designed to store vector data, generate embeddings, and handle vector search operations efficiently. It introduces a new index type called 'lantern_hnsw' for vector columns, which speeds up 'ORDER BY ... LIMIT' queries. Lantern utilizes the state-of-the-art HNSW implementation called usearch. Users can easily install Lantern using Docker, Homebrew, or precompiled binaries. The tool supports various distance functions, index construction parameters, and operator classes for efficient querying. Lantern offers features like embedding generation, interoperability with pgvector, parallel index creation, and external index graph generation. It aims to provide superior performance metrics compared to other similar tools and has a roadmap for future enhancements such as a cloud-hosted version, hardware-accelerated distance metrics, industry-specific application templates, and support for version control and A/B testing of embeddings.

LabelLLM
LabelLLM is an open-source data annotation platform designed to optimize the data annotation process for LLM development. It offers flexible configuration, multimodal data support, comprehensive task management, and AI-assisted annotation. Users can access a suite of annotation tools, enjoy a user-friendly experience, and enhance efficiency. The platform allows real-time monitoring of annotation progress and quality control, ensuring data integrity and timeliness.

llm-on-openshift
This repository provides resources, demos, and recipes for working with Large Language Models (LLMs) on OpenShift using OpenShift AI or Open Data Hub. It includes instructions for deploying inference servers for LLMs, such as vLLM, Hugging Face TGI, Caikit-TGIS-Serving, and Ollama. Additionally, it offers guidance on deploying serving runtimes, such as vLLM Serving Runtime and Hugging Face Text Generation Inference, in the Single-Model Serving stack of Open Data Hub or OpenShift AI. The repository also covers vector databases that can be used as a Vector Store for Retrieval Augmented Generation (RAG) applications, including Milvus, PostgreSQL+pgvector, and Redis. Furthermore, it provides examples of inference and application usage, such as Caikit, LangChain, Langflow, and UI examples.

zep-python
Zep is an open-source platform for building and deploying large language model (LLM) applications. It provides a suite of tools and services that make it easy to integrate LLMs into your applications, including chat history memory, embedding, vector search, and data enrichment. Zep is designed to be scalable, reliable, and easy to use, making it a great choice for developers who want to build LLM-powered applications quickly and easily.

RTL-Coder
RTL-Coder is a tool designed to outperform GPT-3.5 in RTL code generation by providing a fully open-source dataset and a lightweight solution. It targets Verilog code generation and offers an automated flow to generate a large labeled dataset with over 27,000 diverse Verilog design problems and answers. The tool addresses the data availability challenge in IC design-related tasks and can be used for various applications beyond LLMs. The tool includes four RTL code generation models available on the HuggingFace platform, each with specific features and performance characteristics. Additionally, RTL-Coder introduces a new LLM training scheme based on code quality feedback to further enhance model performance and reduce GPU memory consumption.

wordlift-plugin
WordLift is a plugin that helps online content creators organize posts and pages by adding facts, links, and media to build beautifully structured websites for both humans and search engines. It allows users to create, own, and publish their own knowledge graph, and publishes content as Linked Open Data following Tim Berners-Lee's Linked Data Principles. The plugin supports writers by providing trustworthy and contextual facts, enriching content with images, links, and interactive visualizations, keeping readers engaged with relevant content recommendations, and producing content compatible with schema.org markup for better indexing and display on search engines. It also offers features like creating a personal Wikipedia, publishing metadata to share and distribute content, and supporting content tagging for better SEO.

llms-tools
The 'llms-tools' repository is a comprehensive collection of AI tools, open-source projects, and research related to Large Language Models (LLMs) and Chatbots. It covers a wide range of topics such as AI in various domains, open-source models, chats & assistants, visual language models, evaluation tools, libraries, devices, income models, text-to-image, computer vision, audio & speech, code & math, games, robotics, typography, bio & med, military, climate, finance, and presentation. The repository provides valuable resources for researchers, developers, and enthusiasts interested in exploring the capabilities of LLMs and related technologies.

YuLan-Mini
YuLan-Mini is a lightweight language model with 2.4 billion parameters that achieves performance comparable to industry-leading models despite being pre-trained on only 1.08T tokens. It excels in mathematics and code domains. The repository provides pre-training resources, including data pipeline, optimization methods, and annealing approaches. Users can pre-train their own language models, perform learning rate annealing, fine-tune the model, research training dynamics, and synthesize data. The team behind YuLan-Mini is AI Box at Renmin University of China. The code is released under the MIT License with future updates on model weights usage policies. Users are advised on potential safety concerns and ethical use of the model.

nlp-llms-resources
The 'nlp-llms-resources' repository is a comprehensive resource list for Natural Language Processing (NLP) and Large Language Models (LLMs). It covers a wide range of topics including traditional NLP datasets, data acquisition, libraries for NLP, neural networks, sentiment analysis, optical character recognition, information extraction, semantics, topic modeling, multilingual NLP, domain-specific LLMs, vector databases, ethics, costing, books, courses, surveys, aggregators, newsletters, papers, conferences, and societies. The repository provides valuable information and resources for individuals interested in NLP and LLMs.

awesome-LLM-resourses
A comprehensive repository of resources for Chinese large language models (LLMs), including data processing tools, fine-tuning frameworks, inference libraries, evaluation platforms, RAG engines, agent frameworks, books, courses, tutorials, and tips. The repository covers a wide range of tools and resources for working with LLMs, from data labeling and processing to model fine-tuning, inference, evaluation, and application development. It also includes resources for learning about LLMs through books, courses, and tutorials, as well as insights and strategies from building with LLMs.

awesome-mobile-robotics
The 'awesome-mobile-robotics' repository is a curated list of important content related to Mobile Robotics and AI. It includes resources such as courses, books, datasets, software and libraries, podcasts, conferences, journals, companies and jobs, laboratories and research groups, and miscellaneous resources. The repository covers a wide range of topics in the field of Mobile Robotics and AI, providing valuable information for enthusiasts, researchers, and professionals in the domain.

Awesome-GenAI-Unlearning
This repository is a collection of papers on Generative AI Machine Unlearning, categorized based on modality and applications. It includes datasets, benchmarks, and surveys related to unlearning scenarios in generative AI. The repository aims to provide a comprehensive overview of research in the field of machine unlearning for generative models.

20 - OpenAI GPTs

Open Data Italia bot
Provides information on Italian open data legislation, in a professional yet accessible tone, so that it is easier to request and/or demand the publication of open data.

Open Source Alternative
Find an open-source alternative to any paid service you can think of.

OpenData Explorer
I'll help you access and understand open data published by central government, local authorities and public bodies. You can ask me in your native language.

Toronto Parks and Rec Bot
Helpful Parks and Rec Bot for Toronto, built with Toronto civic open data.

OpenStreetMap Query
Helps get map data from OpenStreetMap by generating Overpass Turbo queries. Ask me for mapping features like cafes, rivers, or highways.
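To give a sense of what such a generated query looks like, here is a minimal sketch that builds an Overpass QL string for one tag inside a bounding box. The helper name `build_overpass_query` and the example coordinates are hypothetical and are not taken from the GPT itself; the query syntax (settings line, `nwr` tag filter, bounding box in south, west, north, east order, `out center;`) follows standard Overpass QL.

```python
def build_overpass_query(key, value, bbox, timeout=25):
    """Return an Overpass QL query for features tagged key=value in a bbox.

    bbox is (south, west, north, east) in decimal degrees, the order
    Overpass QL expects. This helper is illustrative only.
    """
    south, west, north, east = bbox
    if south > north:
        raise ValueError("bbox must be (south, west, north, east)")
    return (
        f"[out:json][timeout:{timeout}];\n"
        # nwr matches nodes, ways, and relations carrying the tag
        f'nwr["{key}"="{value}"]({south},{west},{north},{east});\n'
        # 'out center' adds a single representative point for ways/relations
        "out center;"
    )


if __name__ == "__main__":
    # Approximate box around a city centre; coordinates are made up for the demo.
    print(build_overpass_query("amenity", "cafe", (43.64, -79.4, 43.67, -79.37)))
```

The resulting text can be pasted into the Overpass Turbo editor. Swapping the tag (for example `waterway=river` or `highway=motorway`) covers the other feature types mentioned above.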

AI OSINT
Your AI OSINT assistant. Our tool helps you find the data needle in the internet haystack.

Open Source Starter Guide
Open Source Guide for Everyone: First time contributors, maintainers, and the curious.

EE-GPT
A search engine and troubleshooter for electrical engineers to promote an open-source community. Submit your questions, corrections and feedback to [email protected]

THPSGPT
Curates music from extreme sports games like Tony Hawk's Pro Skater, MX vs ATV, and others. Please use this playlist to explore new kinds of music with an open mind. Song types include punk, classics, rap, and others.