Best AI tools for< Explore Model Datasets >
20 - AI tool Sites

Inspect
Inspect is an open-source framework for large language model evaluations created by the UK AI Safety Institute. It provides built-in components for prompt engineering, tool usage, multi-turn dialog, and model graded evaluations. Users can explore various solvers, tools, scorers, datasets, and models to create advanced evaluations. Inspect supports extensions for new elicitation and scoring techniques through Python packages.

Voxel51
Voxel51 is an AI tool that provides open-source computer vision tools for machine learning. It offers solutions for various industries such as agriculture, aviation, driving, healthcare, manufacturing, retail, robotics, and security. Voxel51's main product, FiftyOne, helps users explore, visualize, and curate visual data to improve model performance and accelerate the development of visual AI applications. The platform is trusted by thousands of users and companies, offering both open-source and enterprise-ready solutions to manage and refine data and models for visual AI.

Macgence AI Training Data Services
Macgence is an AI training data services platform that offers high-quality off-the-shelf structured training data for organizations to build effective AI systems at scale. They provide services such as custom data sourcing, data annotation, data validation, content moderation, and localization. Macgence combines global linguistic, cultural, and technological expertise to create high-quality datasets for AI models, enabling faster time-to-market across the entire model value chain. With more than 5 years of experience, they support and scale AI initiatives of leading global innovators by designing custom data collection programs. Macgence specializes in handling AI training data for text, speech, image, and video data, offering cognitive annotation services to unlock the potential of unstructured textual data.

Hugging Face
Hugging Face is an AI community platform that serves as a collaboration hub for the machine learning community. It allows users to explore and contribute to models, datasets, and applications. The platform offers a wide range of features and tools to facilitate AI development and research.

Sarvam AI
Sarvam AI is an AI application focused on leading transformative research in AI to develop, deploy, and distribute Generative AI applications in India. The platform aims to build efficient large language models for India's diverse linguistic culture and enable new GenAI applications through bespoke enterprise models. Sarvam AI is also developing an enterprise-grade platform for developing and evaluating GenAI apps, while contributing to open-source models and datasets to accelerate AI innovation.

Defined.ai
Defined.ai is a leading provider of high-quality and ethical data for AI applications. Founded in 2015, Defined.ai has a global presence with offices in the US, Europe, and Asia. The company's mission is to make AI more accessible and ethical by providing a marketplace for buying and selling AI data, tools, and models. Defined.ai also offers professional services to help deliver success in complex machine learning projects.

Satlas
Satlas is an AI-powered platform that provides geospatial data generated by AI models. The platform offers insights into changes in marine infrastructure, renewable energy infrastructure, and tree cover on a monthly basis. Users can explore maps showcasing developments such as wind farms, solar farms, deforestation, and more. Satlas employs advanced AI architectures and training algorithms in computer vision to enhance low-resolution satellite imagery and produce high-resolution images globally. The platform's geospatial datasets are freely available for offline analysis, along with AI models and training labels. Developed by the Allen Institute for AI, Satlas aims to advance computer vision technology for better understanding and monitoring of Earth's changes.

Muse AI Art Generator
Muse AI is an advanced AI art generator that utilizes neural networks trained on massive image datasets to create unique digital artwork based on text prompts. Users can easily turn their ideas into stunning visuals by entering detailed descriptions and selecting a style. Muse AI offers a stable user experience and provides full control over the aesthetic, allowing for the generation of unlimited original AI art in various styles. The application excels in converting text to images and offers a variety of models for diverse creative needs.

Factori
Factori is a data intelligence platform designed for an AI-first world, offering a wide range of products and datasets to empower businesses with advanced data insights. It enables users to uncover movement trends, reach customers effectively, drive data-driven insights, personalize experiences, and target the right consumers with unique segments. Factori is trusted by global businesses for all their data requirements, providing high-quality data to improve predictive and causal models.

Garden of AI
Garden of AI is a comprehensive AI-powered platform that provides a wide range of tools and resources to help users explore, learn, and apply AI in their daily lives and work. With a vast collection of AI models, tutorials, datasets, and community forums, Garden of AI empowers users to stay up-to-date with the latest AI advancements and leverage its capabilities to solve real-world problems.

Aicado.ai
Aicado.ai is an AI tool that provides comprehensive comparisons and insights into various AI models such as GPT-4, Llama, Gemini, and more. Users can access detailed reviews and features to find the best AI model for their specific needs. The platform offers a premium dashboard for in-depth analysis and comparison of different AI models.

Stable Diffusion XL
Stable Diffusion XL (SDXL) is the latest AI image generation model that can generate realistic faces, legible text within the images, and better image composition, all while using shorter and simpler prompts. It is an improved version of the previous Stable Diffusion models, with better photorealistic outputs, more detailed imagery, and improved face generation. SDXL is available via DreamStudio and other image generation apps like NightCafe Studio and ClipDrop. It can be used for a variety of tasks, including image generation, image-to-image prompting, inpainting, and outpainting.

Flux LoRA Model Library
Flux LoRA Model Library is an AI tool that provides a platform for finding and using Flux LoRA models suitable for various projects. Users can browse a catalog of popular Flux LoRA models and learn about FLUX models and LoRA (Low-Rank Adaptation) technology. The platform offers resources for fine-tuning models and ensuring responsible use of generated images.

MindpoolAI
MindpoolAI is a tool that allows users to access multiple leading AI models with a single query. This means that users can get the answers they are looking for, spark ideas, and fuel their work, creativity, and curiosity. MindpoolAI is easy to use and does not require any technical expertise. Users simply need to enter their prompt and select the AI models they want to compare. MindpoolAI will then send the query to the selected models and present the results in an easy-to-understand format.

Youtwo.ai
Youtwo.ai is an AI chatbot platform that connects users with Virtual and Real AI adult models for roleplay, sexting, and intimate conversations. It offers personalized interactions 24/7, aiming to revolutionize the future of adult entertainment. The platform provides a safe and discreet environment for users to engage in virtual experiences with AI models.

FLUX dev
FLUX dev is a revolutionary open-weight AI image generation model developed by Black Forest Labs. It empowers researchers, developers, and creative professionals with state-of-the-art technology for text-to-image synthesis. FLUX dev offers unparalleled features such as open-weight architecture, direct distillation from FLUX [pro], exceptional prompt adherence, optimized efficiency, and enhanced typography capabilities. The application stands out in AI image generation through its cutting-edge technology, user-centric design, and advanced capabilities in text rendering, complex compositions, and anatomical accuracy.

Moshi AI
Moshi AI by Kyutai is an advanced native speech AI model that enables natural, expressive conversations. It can be installed locally and run offline, making it suitable for integration into smart home appliances and other local applications. The model, named Helium, has 7 billion parameters and is trained on text and audio codecs. Moshi AI supports native speech input and output, allowing for smooth communication with the AI. The application is community-supported, with plans for continuous improvement and adaptation.

PixelPet
PixelPet is an AI-powered online tool that offers stable diffusion instant access to hundreds of models for generating hyper-realistic images in 1344x768px resolution. It provides universal understanding with auto-translation for all languages and features Prompt Magic to boost prompts for awesome results. Empowered by the latest Stable Diffusion models, PixelPet allows users to create stunning images for free straight from their favorite messenger app.

Stablematic
Stablematic is a web-based platform that allows users to run Stable Diffusion and other machine learning models without the need for local setup or hardware limitations. It provides a user-friendly interface, pre-installed plugins, and dedicated GPU resources for a seamless and efficient workflow. Users can generate images and videos from text prompts, merge multiple models, train custom models, and access a range of pre-trained models, including Dreambooth and CivitAi models. Stablematic also offers API access for developers and dedicated support for users to explore and utilize the capabilities of Stable Diffusion and other machine learning models.

FLUX AI Image Generator
FLUX AI Image Generator is a cutting-edge AI image generation model developed by Black Forest Labs. It offers state-of-the-art performance in prompt following, visual quality, image detail, and output diversity. The application provides multiple model variants, exceptional text rendering capabilities, complex composition mastery, improved hand rendering, and efficient performance. Users can access FLUX AI Image Generator through various platforms and benefit from its open-source availability for research and artistic purposes. The tool is continuously innovating to stay at the forefront of AI image generation technology.
20 - Open Source AI Tools

FuseAI
FuseAI is a repository that focuses on knowledge fusion of large language models. It includes FuseChat, a state-of-the-art 7B LLM on MT-Bench, and FuseLLM, which surpasses Llama-2-7B by fusing three open-source foundation LLMs. The repository provides tech reports, releases, and datasets for FuseChat and FuseLLM, showcasing their performance and advancements in the field of chat models and large language models.

AIforEarthDataSets
The Microsoft AI for Earth program hosts geospatial data on Azure that is important to environmental sustainability and Earth science. This repo hosts documentation and demonstration notebooks for all the data that is managed by AI for Earth. It also serves as a "staging ground" for the Planetary Computer Data Catalog.

NeMo-Curator
NeMo Curator is a GPU-accelerated open-source framework designed for efficient large language model data curation. It provides scalable dataset preparation for tasks like foundation model pretraining, domain-adaptive pretraining, supervised fine-tuning, and parameter-efficient fine-tuning. The library leverages GPUs with Dask and RAPIDS to accelerate data curation, offering customizable and modular interfaces for pipeline expansion and model convergence. Key features include data download, text extraction, quality filtering, deduplication, downstream-task decontamination, distributed data classification, and PII redaction. NeMo Curator is suitable for curating high-quality datasets for large language model training.

LLM-LieDetector
This repository contains code for reproducing experiments on lie detection in black-box LLMs by asking unrelated questions. It includes Q/A datasets, prompts, and fine-tuning datasets for generating lies with language models. The lie detectors rely on asking binary 'elicitation questions' to diagnose whether the model has lied. The code covers generating lies from language models, training and testing lie detectors, and generalization experiments. It requires access to GPUs and OpenAI API calls for running experiments with open-source models. Results are stored in the repository for reproducibility.

awesome-llm
Awesome LLM is a curated list of resources related to Large Language Models (LLMs), including models, projects, datasets, benchmarks, materials, papers, posts, GitHub repositories, HuggingFace repositories, and reading materials. It provides detailed information on various LLMs, their parameter sizes, announcement dates, and contributors. The repository covers a wide range of LLM-related topics and serves as a valuable resource for researchers, developers, and enthusiasts interested in the field of natural language processing and artificial intelligence.

data-juicer
Data-Juicer is a one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs. It is a systematic & reusable library of 80+ core OPs, 20+ reusable config recipes, and 20+ feature-rich dedicated toolkits, designed to function independently of specific LLM datasets and processing pipelines. Data-Juicer allows detailed data analyses with an automated report generation feature for a deeper understanding of your dataset. Coupled with multi-dimension automatic evaluation capabilities, it supports a timely feedback loop at multiple stages in the LLM development process. Data-Juicer offers tens of pre-built data processing recipes for pre-training, fine-tuning, en, zh, and more scenarios. It provides a speedy data processing pipeline requiring less memory and CPU usage, optimized for maximum productivity. Data-Juicer is flexible & extensible, accommodating most types of data formats and allowing flexible combinations of OPs. It is designed for simplicity, with comprehensive documentation, easy start guides and demo configs, and intuitive configuration with simple adding/removing OPs from existing configs.

llmware
LLMWare is a framework for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and connecting enterprise knowledge safely and securely.

SuperKnowa
SuperKnowa is a fast framework to build Enterprise RAG (Retriever Augmented Generation) Pipelines at Scale, powered by watsonx. It accelerates Enterprise Generative AI applications to get prod-ready solutions quickly on private data. The framework provides pluggable components for tackling various Generative AI use cases using Large Language Models (LLMs), allowing users to assemble building blocks to address challenges in AI-driven text generation. SuperKnowa is battle-tested from 1M to 200M private knowledge base & scaled to billions of retriever tokens.

npcsh
`npcsh` is a python-based command-line tool designed to integrate Large Language Models (LLMs) and Agents into one's daily workflow by making them available and easily configurable through the command line shell. It leverages the power of LLMs to understand natural language commands and questions, execute tasks, answer queries, and provide relevant information from local files and the web. Users can also build their own tools and call them like macros from the shell. `npcsh` allows users to take advantage of agents (i.e. NPCs) through a managed system, tailoring NPCs to specific tasks and workflows. The tool is extensible with Python, providing useful functions for interacting with LLMs, including explicit coverage for popular providers like ollama, anthropic, openai, gemini, deepseek, and openai-like providers. Users can set up a flask server to expose their NPC team for use as a backend service, run SQL models defined in their project, execute assembly lines, and verify the integrity of their NPC team's interrelations. Users can execute bash commands directly, use favorite command-line tools like VIM, Emacs, ipython, sqlite3, git, pipe the output of these commands to LLMs, or pass LLM results to bash commands.

Phi-3CookBook
Phi-3CookBook is a manual on how to use the Microsoft Phi-3 family, which consists of open AI models developed by Microsoft. The Phi-3 models are highly capable and cost-effective small language models, outperforming models of similar and larger sizes across various language, reasoning, coding, and math benchmarks. The repository provides detailed information on different Phi-3 models, their performance, availability, and usage scenarios across different platforms like Azure AI Studio, Hugging Face, and Ollama. It also covers topics such as fine-tuning, evaluation, and end-to-end samples for Phi-3-mini and Phi-3-vision models, along with labs, workshops, and contributing guidelines.

videogigagan-pytorch
Video GigaGAN - Pytorch is an implementation of Video GigaGAN, a state-of-the-art video upsampling technique developed by Adobe AI labs. The project aims to provide a Pytorch implementation for researchers and developers interested in video super-resolution. The codebase allows users to replicate the results of the original research paper and experiment with video upscaling techniques. The repository includes the necessary code and resources to train and test the GigaGAN model on video datasets. Researchers can leverage this implementation to enhance the visual quality of low-resolution videos and explore advancements in video super-resolution technology.

CSGHub
CSGHub is an open source, trustworthy large model asset management platform that can assist users in governing the assets involved in the lifecycle of LLM and LLM applications (datasets, model files, codes, etc). With CSGHub, users can perform operations on LLM assets, including uploading, downloading, storing, verifying, and distributing, through Web interface, Git command line, or natural language Chatbot. Meanwhile, the platform provides microservice submodules and standardized OpenAPIs, which could be easily integrated with users' own systems. CSGHub is committed to bringing users an asset management platform that is natively designed for large models and can be deployed On-Premise for fully offline operation. CSGHub offers functionalities similar to a privatized Huggingface(on-premise Huggingface), managing LLM assets in a manner akin to how OpenStack Glance manages virtual machine images, Harbor manages container images, and Sonatype Nexus manages artifacts.

Awesome-LLM-Prune
This repository is dedicated to the pruning of large language models (LLMs). It aims to serve as a comprehensive resource for researchers and practitioners interested in the efficient reduction of model size while maintaining or enhancing performance. The repository contains various papers, summaries, and links related to different pruning approaches for LLMs, along with author information and publication details. It covers a wide range of topics such as structured pruning, unstructured pruning, semi-structured pruning, and benchmarking methods. Researchers and practitioners can explore different pruning techniques, understand their implications, and access relevant resources for further study and implementation.

stm32ai-modelzoo
The STM32 AI model zoo is a collection of reference machine learning models optimized to run on STM32 microcontrollers. It provides a large collection of application-oriented models ready for re-training, scripts for easy retraining from user datasets, pre-trained models on reference datasets, and application code examples generated from user AI models. The project offers training scripts for transfer learning or training custom models from scratch. It includes performances on reference STM32 MCU and MPU for float and quantized models. The project is organized by application, providing step-by-step guides for training and deploying models.

cia
CIA is a powerful open-source tool designed for data analysis and visualization. It provides a user-friendly interface for processing large datasets and generating insightful reports. With CIA, users can easily explore data, perform statistical analysis, and create interactive visualizations to communicate findings effectively. Whether you are a data scientist, analyst, or researcher, CIA offers a comprehensive set of features to streamline your data analysis workflow and uncover valuable insights.

nixtla
Nixtla is a production-ready generative pretrained transformer for time series forecasting and anomaly detection. It can accurately predict various domains such as retail, electricity, finance, and IoT with just a few lines of code. TimeGPT introduces a paradigm shift with its standout performance, efficiency, and simplicity, making it accessible even to users with minimal coding experience. The model is based on self-attention and is independently trained on a vast time series dataset to minimize forecasting error. It offers features like zero-shot inference, fine-tuning, API access, adding exogenous variables, multiple series forecasting, custom loss function, cross-validation, prediction intervals, and handling irregular timestamps.

ai-financial-agent
AI Financial Agent is a proof of concept project exploring the use of AI for investment research. It provides an AI SDK with a unified API for generating text and structured objects, along with access to real-time and historical stock market data optimized for AI financial agents. The project includes features like dynamic chat interfaces, support for multiple model providers, and styling with Tailwind CSS. Users can deploy their own version of the AI Financial Agent using Vercel and GitHub integration.

AwesomeResponsibleAI
Awesome Responsible AI is a curated list of academic research, books, code of ethics, courses, data sets, frameworks, institutes, newsletters, principles, podcasts, reports, tools, regulations, and standards related to Responsible, Trustworthy, and Human-Centered AI. It covers various concepts such as Responsible AI, Trustworthy AI, Human-Centered AI, Responsible AI frameworks, AI Governance, and more. The repository provides a comprehensive collection of resources for individuals interested in ethical, transparent, and accountable AI development and deployment.

litdata
LitData is a tool designed for blazingly fast, distributed streaming of training data from any cloud storage. It allows users to transform and optimize data in cloud storage environments efficiently and intuitively, supporting various data types like images, text, video, audio, geo-spatial, and multimodal data. LitData integrates smoothly with frameworks such as LitGPT and PyTorch, enabling seamless streaming of data to multiple machines. Key features include multi-GPU/multi-node support, easy data mixing, pause & resume functionality, support for profiling, memory footprint reduction, cache size configuration, and on-prem optimizations. The tool also provides benchmarks for measuring streaming speed and conversion efficiency, along with runnable templates for different data types. LitData enables infinite cloud data processing by utilizing the Lightning.ai platform to scale data processing with optimized machines.
20 - OpenAI Gpts

HuggingFace Helper
A witty yet succinct guide for HuggingFace, offering technical assistance on using the platform - based on their Learning Hub

Octorate Code Companion
I help developers understand and use APIs, referencing a YAML model.

Picture Creator🎨
Model Vibe Picture Creator: Unleash Your Imagination! 🎨📸 Generates detailed, cool prompts for stylized images, perfect for AI tools like DALL-E 3. 🔥👾

Malaysia Vehicle Finder
Searches for vehicles in Malaysia, providing model details, prices, and availability.

Experto en Toxina Botulínica
Este modelo de GPT proporciona información general sobre la toxina botulínica, incluyendo su historia, usos comunes y datos de interés. Está diseñado para educar y ofrecer una visión general basada en fuentes de información públicas y conocidas. No proporciona consejos médicos

FutureScope
A channel for learning about Foresighting, including models, processes, and approaches.

Hybrid Workplace Navigator
Advises organizations on optimizing hybrid work models, blending remote and in-office strategies.

AInatomy
An expert in Human Anatomy, be it for art or science or education, anything relating to the human body, come ask me. I will provide photo-realistic visual aids and AI created models to expound on different parts of the anatomy