Best AI tools for Data Librarian
20 - AI tool Sites

Library of Congress Labs
Library of Congress Labs is an AI tool that focuses on experimenting with artificial intelligence and machine learning at the Library of Congress. It encourages innovation with digital collections, research, and events. The platform aims to explore cultural heritage, connect communities, and center the histories and experiences of communities of color.

Elicit
Elicit is a research tool that uses artificial intelligence to help researchers analyze research papers more efficiently. It can summarize papers, extract data, and synthesize findings, saving researchers time and effort. Elicit is used by over 800,000 researchers worldwide and has been featured in publications such as Nature and Science. It is a powerful tool that can help researchers stay up-to-date on the latest research and make new discoveries.

AcademicID
AcademicID is an AI-powered platform that helps students and researchers discover and access academic resources. It provides a comprehensive database of academic papers, journals, and other resources, as well as tools to help users organize and manage their research. AcademicID also offers a variety of features to help users collaborate with others and share their research findings.

Elsevier
Elsevier is an information analytics business that supports researchers and healthcare professionals in advancing science and improving healthcare outcomes. They provide high-quality data and analytics to help researchers, librarians, and research leaders address challenges at every stage of the research journey. Elsevier offers researcher tools, research management solutions, and evaluation services to enhance productivity and research impact.

Wikidata
Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. It acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others. Wikidata also provides support to many other sites and services beyond just Wikimedia projects!

arXiv
arXiv.org is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.

Ex Libris Products & Services
The website is a comprehensive platform offering a suite of software solutions for library management, research, teaching, and learning in the higher education ecosystem. It leverages generative AI, linked open data, and conversational discovery to optimize operations, integration, personalized experiences, and analytic insights. The platform includes various products and services such as Alma, Primo, Leganto, Rapido, Rosetta, and campusM, catering to the unique needs of academic institutions, libraries, and technology powerhouses. The website features success stories, customer testimonials, webinars, learning resources, and community engagement initiatives.

Semantic Scholar
Semantic Scholar is a free, AI-powered research tool for scientific literature. It is based at the Allen Institute for AI and provides access to over 217 million papers from all fields of science. Semantic Scholar uses AI to help users discover and explore scientific literature, and to stay up-to-date on the latest research. The tool also includes a number of features to help users manage their research, such as the ability to save papers, create bibliographies, and share research with others.

SciSpace
SciSpace is an AI-powered tool that helps researchers understand research papers better. It can explain and elaborate most academic texts in simple words. It is a great tool for students, researchers, and anyone who wants to learn more about a particular topic. SciSpace has a user-friendly interface and is easy to use. Simply upload a research paper or enter a URL, and SciSpace will do the rest. It will highlight key concepts, provide definitions, and generate a summary of the paper. SciSpace can also be used to generate citations and find related papers.

Connected Papers
Connected Papers is a search engine for academic papers that uses artificial intelligence to help users find and explore relevant research. It allows users to search for papers by keyword, author, or title, and then explore the connections between them. Connected Papers also provides a variety of tools to help users organize and manage their research, including the ability to create custom collections of papers, add notes and annotations, and share their research with others.

ResearchRabbit
ResearchRabbit is a research tool that helps researchers discover and organize academic papers. It uses artificial intelligence to recommend papers that are relevant to a researcher's interests and to visualize networks of papers and co-authorships. ResearchRabbit also allows researchers to collaborate on collections of papers and to share their findings with others.

OpenRead
OpenRead is an AI-powered research tool that helps users discover, understand, and organize scientific literature. It offers a variety of features to make research more efficient and effective, including semantic search, AI summarization, and note-taking tools. OpenRead is designed to help researchers of all levels, from students to experienced professionals, save time and improve their research outcomes.

Booltool
Booltool is a free online tool that helps you to create and manage boolean searches. With Booltool, you can easily combine multiple search terms using the AND, OR, and NOT operators to create more precise and effective searches. Booltool also provides a variety of other features to help you refine your searches, such as the ability to exclude specific terms, search within a specific domain, and limit your search to a specific date range.
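The way AND, OR, and NOT combine into one search string can be sketched in a few lines of Python. This is only an illustration of Boolean query construction in general, not Booltool's own implementation; the `quote` and `build_query` helpers and their parameters are hypothetical names chosen for this example.

```python
def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(all_of=(), any_of=(), none_of=()) -> str:
    """Combine terms into a single Boolean search string (hypothetical helper).

    all_of  -> joined with AND
    any_of  -> joined with OR inside one parenthesised group
    none_of -> each appended with a NOT prefix
    """
    parts = [quote(t) for t in all_of]
    if any_of:
        parts.append("(" + " OR ".join(quote(t) for t in any_of) + ")")
    query = " AND ".join(parts)
    for term in none_of:
        query += f" NOT {quote(term)}"
    return query.strip()


print(build_query(
    all_of=["metadata", "digital preservation"],
    any_of=["library", "archive"],
    none_of=["museum"],
))
# prints: metadata AND "digital preservation" AND (library OR archive) NOT museum
```

Grouping the OR terms in parentheses keeps them from being split apart by the surrounding AND operators, which is the usual source of imprecise Boolean searches.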

Lumina
Lumina is a research tool that uses artificial intelligence to help researchers find and analyze information more quickly and easily. It can be used to search for articles, books, and other resources, and it can also be used to analyze data and create visualizations. Lumina is designed to make research more efficient and productive.

mypapers.ai
mypapers.ai is an AI tool designed to assist users in managing and analyzing academic papers efficiently. The tool offers features such as exploring papers and authors, toggling between papers and authors, and tracking the journey of research. Users can also access the code on GitHub to further enhance their research capabilities.

Scite
Scite is an award-winning platform for discovering and evaluating scientific articles via Smart Citations. Smart Citations allow users to see how a publication has been cited by providing the context of the citation and a classification describing whether it provides supporting or contrasting evidence for the cited claim.

ArxivPaperAI
ArxivPaperAI is an AI-powered research paper summarizer that helps you quickly and easily understand the key points of academic papers.

Open Knowledge Maps
Open Knowledge Maps is the world's largest AI-based search engine for scientific knowledge. It aims to revolutionize discovery by increasing the visibility of research findings for science and society. The platform is open and nonprofit, based on the principles of open science, with a mission to create an inclusive, sustainable, and equitable infrastructure for all users. Users can map research topics with AI, find documents, and identify concepts to enhance their literature search experience.

Dr.Oracle
Dr.Oracle is a personal AI research assistant that helps you find and understand the latest research in your field. With Dr.Oracle, you can search for research papers, track your favorite authors, and get personalized recommendations for new research. Dr.Oracle is the perfect tool for students, researchers, and anyone who wants to stay up-to-date on the latest research in their field.

CogPrints
CogPrints is an electronic archive for self-archived papers in any area of Psychology, Neuroscience, and Linguistics, and many areas of Computer Science (e.g., artificial intelligence, robotics, vision, learning, speech, neural networks), Philosophy (e.g., mind, language, knowledge, science, logic), Biology (e.g., ethology, behavioral ecology, sociobiology, behavior genetics, evolutionary theory), Medicine (e.g., Psychiatry, Neurology, human genetics, Imaging), Anthropology (e.g., primatology, cognitive ethnology, archeology, paleontology), as well as any other portions of the physical, social and mathematical sciences that are pertinent to the study of cognition.

20 - Open Source Tools

opendataeditor
The Open Data Editor (ODE) is a no-code application to explore, validate and publish data in a simple way. It is an open source project powered by the Frictionless Framework. The ODE is currently available for download and testing in beta.

mindsdb
MindsDB is a platform for customizing AI from enterprise data. You can create, serve, and fine-tune models in real-time from your database, vector store, and application data. MindsDB "enhances" SQL syntax with AI capabilities to make it accessible for developers worldwide. With MindsDB’s nearly 200 integrations, any developer can create AI customized for their purpose, faster and more securely. Their AI systems will constantly improve themselves — using companies’ own data, in real-time.

asreview
The ASReview project implements active learning for systematic reviews, utilizing AI-aided pipelines to assist in finding relevant texts for search tasks. It accelerates the screening of textual data with minimal human input, saving time and increasing output quality. The software offers three modes: Oracle for interactive screening, Exploration for teaching purposes, and Simulation for evaluating active learning models. ASReview LAB is designed to support decision-making in any discipline or industry by improving efficiency and transparency in screening large amounts of textual data.

Awesome-Code-LLM

feeds.fun
Feeds Fun is a self-hosted news reader tool that automatically assigns tags to news entries. Users can create rules to score news based on tags, filter and sort news as needed, and track read news. The tool offers multi/single-user support, feeds management, and various features for personalized news consumption. Users can access the tool's backend as the ffun package on PyPI and the frontend as the feeds-fun package on NPM. Feeds Fun requires setting up OpenAI or Gemini API keys for full tag generation capabilities. The tool uses tag processors to detect tags for news entries, with options for simple and complex processors. Feeds Fun primarily relies on LLM tag processors from OpenAI and Google for tag generation.
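The rule-based scoring described above can be illustrated with a small sketch. It assumes that a rule fires when all of its tags are present on an entry and that the scores of firing rules are summed; the `Rule` class and `score_entry` function are hypothetical and are not taken from the ffun package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A scoring rule: applies when every tag in `tags` is on the entry (assumed semantics)."""
    tags: frozenset
    score: int


def score_entry(entry_tags, rules):
    """Sum the scores of all rules whose tags are a subset of the entry's tags."""
    tags = set(entry_tags)
    return sum(rule.score for rule in rules if rule.tags <= tags)


rules = [
    Rule(frozenset({"libraries", "metadata"}), 5),
    Rule(frozenset({"cryptocurrency"}), -10),
]

entries = {
    "New metadata schema for libraries": ["libraries", "metadata", "standards"],
    "Coin prices surge": ["cryptocurrency", "finance"],
    "Weather update": ["weather"],
}

# Sort entries so the highest-scoring news comes first.
ranked = sorted(entries, key=lambda title: score_entry(entries[title], rules), reverse=True)
for title in ranked:
    print(score_entry(entries[title], rules), title)
```

Negative scores let a reader push unwanted topics to the bottom of the list without filtering them out entirely.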

OpenAI
This repository is a community-maintained Swift implementation of the OpenAI public API. OpenAI itself is an artificial intelligence research organization founded as a non-profit in San Francisco, California in 2015, with a stated mission to ensure safe and responsible use of AI for civic good, economic growth, and other public benefits. The repository provides functionality for text completions, chats, image generation, audio processing, edits, embeddings, models, moderations, utilities, and Combine extensions.

SirChatalot
A Telegram bot that proves you don't need a body to have a personality. It can use various text and image generation APIs to generate responses to user messages.
For text generation, the bot can use:
* OpenAI's ChatGPT API (or another compatible API). Vision capabilities can be used with GPT-4 models. Function calling is supported.
* Anthropic's Claude API. Vision capabilities can be used with Claude 3 models. Function calling is available through tool use.
* YandexGPT API
The bot can also generate images with:
* OpenAI's DALL-E
* Stability AI
* Yandex ART
The bot can also respond to voice messages: it converts the voice message to text and then generates a response. Speech recognition can be done using OpenAI's Whisper model; this feature requires ffmpeg to be installed. The bot also supports working with files (see the Files section for more details). If function calling is enabled, the bot can generate images and search the web (limited).

warc-gpt
WARC-GPT is an experimental retrieval augmented generation pipeline for web archive collections. It allows users to interact with WARC files, extract text, generate text embeddings, visualize embeddings, and interact with a web UI and API. The tool is highly customizable, supporting various LLMs, providers, and embedding models. Users can configure the application using environment variables, ingest WARC files, start the server, and interact with the web UI and API to search for content and generate text completions. WARC-GPT is designed for exploring and experimenting with web archives using AI.

vault-ai
OP Vault is a tool that leverages the OP Stack (OpenAI + Pinecone Vector Database) to allow users to upload custom knowledgebase files and ask questions about their contents. It provides a user-friendly Golang server and React frontend for querying human-readable content like books and documents, making it valuable for knowledge extraction and question-answering. Users can upload entire libraries, receive specific answers with file and section references, and explore the power of the OP Stack in a practical interface.

breadboard
Breadboard is a library for prototyping generative AI applications. It is inspired by the hardware maker community and their boundless creativity. Breadboard makes it easy to wire prototypes and share, remix, reuse, and compose them. The library emphasizes ease and flexibility of wiring, as well as modularity and composability.

data-scientist-roadmap2024
The Data Scientist Roadmap2024 provides a comprehensive guide to mastering essential tools for data science success. It includes programming languages, machine learning libraries, cloud platforms, and concepts categorized by difficulty. The roadmap covers a wide range of topics from programming languages to machine learning techniques, data visualization tools, and DevOps/MLOps tools. It also includes web development frameworks and specific concepts like supervised and unsupervised learning, NLP, deep learning, reinforcement learning, and statistics. Additionally, it delves into DevOps tools like Airflow and MLFlow, data visualization tools like Tableau and Matplotlib, and other topics such as ETL processes, optimization algorithms, and financial modeling.

awesome-generative-ai-data-scientist
A curated list of 50+ resources to help you become a Generative AI Data Scientist. This repository includes resources on building GenAI applications with Large Language Models (LLMs), and deploying LLMs and GenAI with Cloud-based solutions.

femtoGPT
femtoGPT is a pure Rust implementation of a minimal Generative Pretrained Transformer. It can be used for both inference and training of GPT-style language models using CPUs and GPUs. The tool is implemented from scratch, including tensor processing logic and training/inference code of a minimal GPT architecture. It is a great start for those fascinated by LLMs and wanting to understand how these models work at deep levels. The tool uses random generation libraries, data-serialization libraries, and a parallel computing library. It is relatively fast on CPU and correctness of gradients is checked using the gradient-check method.

litdata
LitData is a tool designed for blazingly fast, distributed streaming of training data from any cloud storage. It allows users to transform and optimize data in cloud storage environments efficiently and intuitively, supporting various data types like images, text, video, audio, geo-spatial, and multimodal data. LitData integrates smoothly with frameworks such as LitGPT and PyTorch, enabling seamless streaming of data to multiple machines. Key features include multi-GPU/multi-node support, easy data mixing, pause & resume functionality, support for profiling, memory footprint reduction, cache size configuration, and on-prem optimizations. The tool also provides benchmarks for measuring streaming speed and conversion efficiency, along with runnable templates for different data types. LitData enables infinite cloud data processing by utilizing the Lightning.ai platform to scale data processing with optimized machines.

sql-eval
This repository contains the code that Defog uses to evaluate generated SQL. It is based on the schema from Spider, but with a new set of hand-selected questions and queries grouped by query category. The testing procedure involves generating a SQL query, running both the 'gold' query and the generated query on their respective databases to obtain dataframes with the results, comparing the dataframes using an 'exact' and a 'subset' match, logging these alongside other metrics of interest, and aggregating the results for reporting. The repository provides comprehensive instructions for installing dependencies, starting a Postgres instance, importing data into Postgres, importing data into Snowflake, using private data, implementing a query generator, and running the test with different runners.
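The difference between an 'exact' and a 'subset' comparison can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Defog's code: results are modelled as lists of row dicts instead of dataframes, and the helper names `row_multiset`, `column_multisets`, `exact_match`, and `subset_match` are hypothetical.

```python
from collections import Counter


def row_multiset(rows):
    """Count each row as a tuple of its values, ignoring row order and column names."""
    return Counter(tuple(row.values()) for row in rows)


def column_multisets(rows):
    """Turn a list of row dicts into one multiset of values per column."""
    if not rows:
        return []
    return [Counter(row[col] for row in rows) for col in rows[0]]


def exact_match(gold_rows, generated_rows):
    """True when both results contain the same rows with columns in the same order."""
    return row_multiset(gold_rows) == row_multiset(generated_rows)


def subset_match(gold_rows, generated_rows):
    """True when every gold column's values also appear as some generated column.

    Simplification: columns are compared independently, so alignment of
    values across columns within a row is not checked here.
    """
    generated_cols = column_multisets(generated_rows)
    return all(col in generated_cols for col in column_multisets(gold_rows))


gold = [{"name": "Ada", "n": 3}, {"name": "Bo", "n": 1}]
generated = [{"n": 1, "author": "Bo", "id": 7}, {"n": 3, "author": "Ada", "id": 9}]

print(exact_match(gold, generated))   # False: extra column and different column order
print(subset_match(gold, generated))  # True: both gold columns are present
```

A subset check of this kind tolerates a generated query that returns extra columns or renames them, while the exact check does not.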

machine-learning-research
The 'machine-learning-research' repository is a comprehensive collection of resources related to mathematics, machine learning, deep learning, artificial intelligence, data science, and various scientific fields. It includes materials such as courses, tutorials, books, podcasts, communities, online courses, papers, and dissertations. The repository covers topics ranging from fundamental math skills to advanced machine learning concepts, with a focus on applications in healthcare, genetics, computational biology, precision health, and AI in science. It serves as a valuable resource for individuals interested in learning and researching in the fields of machine learning and related disciplines.

sycamore
Sycamore is a conversational search and analytics platform for complex unstructured data, such as documents, presentations, transcripts, embedded tables, and internal knowledge repositories. It retrieves and synthesizes high-quality answers through bringing AI to data preparation, indexing, and retrieval. Sycamore makes it easy to prepare unstructured data for search and analytics, providing a toolkit for data cleaning, information extraction, enrichment, summarization, and generation of vector embeddings that encapsulate the semantics of data. Sycamore uses your choice of generative AI models to make these operations simple and effective, and it enables quick experimentation and iteration. Additionally, Sycamore uses OpenSearch for indexing, enabling hybrid (vector + keyword) search, retrieval-augmented generation (RAG) pipelining, filtering, analytical functions, conversational memory, and other features to improve information retrieval.

llm-course
The LLM course is divided into three parts:
1. 🧩 **LLM Fundamentals** covers essential knowledge about mathematics, Python, and neural networks.
2. 🧑‍🔬 **The LLM Scientist** focuses on building the best possible LLMs using the latest techniques.
3. 👷 **The LLM Engineer** focuses on creating LLM-based applications and deploying them.
For an interactive version of this course, I created two **LLM assistants** that will answer questions and test your knowledge in a personalized way:
* 🤗 **HuggingChat Assistant**: Free version using Mixtral-8x7B.
* 🤖 **ChatGPT Assistant**: Requires a premium account.
## 📝 Notebooks
A list of notebooks and articles related to large language models.
### Tools
| Notebook | Description |
|----------|-------------|
| 🧐 LLM AutoEval | Automatically evaluate your LLMs using RunPod. |
| 🥱 LazyMergekit | Easily merge models using MergeKit in one click. |
| 🦎 LazyAxolotl | Fine-tune models in the cloud using Axolotl in one click. |
| ⚡ AutoQuant | Quantize LLMs in GGUF, GPTQ, EXL2, AWQ, and HQQ formats in one click. |
| 🌳 Model Family Tree | Visualize the family tree of merged models. |
| 🚀 ZeroSpace | Automatically create a Gradio chat interface using a free ZeroGPU. |

20 - OpenAI Gpts

Smart Sorter
A versatile, user-friendly Sorting Bot for diverse data types, prioritizing privacy and adaptability.

Theses without Subject Discipline info UK
Expert in analyzing UK theses data without subject info

Ordinals API
Knows the docs and can query official ordinal endpoints—Sat Numbers, Inscription IDs, and more.

Adulting with Social Anxiety
Navigating everyday life with friendly support for social anxiety.

Paper Interpreter (international)
Automatically structure and decode academic papers with ease - simply upload a PDF!

Eureka Research Assessment and Improvement
AI tool for self-evaluating and enhancing scientific research capabilities.