Best AI tools for "Introduce Datasets"
20 - AI tool Sites
Ai Drawing Generator
Ai Drawing Generator is a free online tool that revolutionizes drawing generation with AI. It introduces ControlNet, a neural network structure designed to enhance pretrained large diffusion models by incorporating additional input conditions. The tool enables users to convert scribbled drawings into detailed images through deep learning algorithms. It is adaptable for training on personal devices and can handle large datasets ranging from millions to billions. Ai Drawing Generator provides experimental compatibility with various diffusion models, offering users flexibility in choosing models based on their specific needs and preferences.
Phenaki
Phenaki is a model capable of generating realistic videos from a sequence of textual prompts. It is particularly challenging to generate videos from text due to the computational cost, limited quantities of high-quality text-video data, and variable length of videos. To address these issues, Phenaki introduces a new causal model for learning video representation, which compresses the video to a small representation of discrete tokens. This tokenizer uses causal attention in time, which allows it to work with variable-length videos. To generate video tokens from text, Phenaki uses a bidirectional masked transformer conditioned on pre-computed text tokens. The generated video tokens are subsequently de-tokenized to create the actual video. To address data issues, Phenaki demonstrates how joint training on a large corpus of image-text pairs as well as a smaller number of video-text examples can result in generalization beyond what is available in the video datasets. Compared to previous video generation methods, Phenaki can generate arbitrarily long videos conditioned on a sequence of prompts (i.e., time-variable text or a story) in an open domain. To the best of our knowledge, this is the first time a paper studies generating videos from time-variable prompts. In addition, the proposed video encoder-decoder outperforms all per-frame baselines currently used in the literature in terms of spatio-temporal quality and the number of tokens per video.
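A minimal sketch of why causal attention in time suits variable-length videos, in plain Python. It is not Phenaki's code: the `causal_mask` helper and the frame counts are assumptions made for illustration. Each frame may attend only to itself and earlier frames, so the mask for a short clip is the top-left corner of the mask for a longer one.

```python
def causal_mask(num_frames):
    """Return a num_frames x num_frames boolean mask.

    mask[i][j] is True when frame i is allowed to attend to frame j,
    i.e. when j is not later in time than i.
    """
    return [[j <= i for j in range(num_frames)] for i in range(num_frames)]


# Hypothetical clip lengths, chosen only for the demonstration.
short = causal_mask(3)
long = causal_mask(5)

# The first 3 frames of the longer clip see the same context as the short
# clip: no frame depends on how many frames come after it.
prefix_of_long = [row[:3] for row in long[:3]]
```

Because earlier frames never depend on later ones, the same tokenizer can in principle be applied to a clip of any length without changing what the earlier frames see.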
Santa Cat
Santa Cat is an AI-powered virtual assistant designed to bring holiday cheer and festive fun to users. It allows you to engage in interactive conversations with a virtual feline friend, Santa Cat, who is filled with holiday excitement. Created by Daily About & Help, this fluffy AI helper is perfect for spreading joy and creating memorable holiday moments through playful chats and interactions.
Ordinary Prompts
Ordinary Prompts is a tool that helps users create better prompts for ChatGPT and other AI language models. It provides a library of pre-written prompts that can be used for a variety of tasks, such as generating creative content, getting help with coding, and writing emails. Ordinary Prompts also includes a number of features that make it easy to customize prompts and track your progress.
StoryBee
StoryBee is an AI-powered platform that allows users to create personalized stories for children. With a simple hint or theme, the AI generates a unique tale tailored to the user's preferences. Users can customize the genre, style, and visual aesthetics of their stories, ensuring a captivating and engaging experience for young readers. StoryBee is designed to foster imagination, inspire growth, and provide endless entertainment for children of all ages.
Puppetry
Puppetry is an AI tool that enables video content creators, game artists, educators, and marketers to create engaging and informative videos using AI puppets. It provides a comprehensive toolset for face animation, allowing users to generate talking videos and craft compelling scripts with the power of ChatGPT. With features like AI voice and avatar creation, realistic avatars, and an intuitive user interface, Puppetry offers a versatile solution for creating AI-driven avatars and animated faces.
Robovision
Robovision is a central platform to manage vision intelligence inside smart machines. Successfully introduce AI in dynamic environments without the need for AI experts.
Sacred
Sacred is a tool to configure, organize, log and reproduce computational experiments. It is designed to introduce only minimal overhead, while encouraging modularity and configurability of experiments. The ability to conveniently make experiments configurable is at the heart of Sacred. If the parameters of an experiment are exposed in this way, it will help you to: keep track of all the parameters of your experiment; easily run your experiment for different settings; save configurations for individual runs in files or a database; and reproduce your results. In Sacred we achieve this through the following main mechanisms: Config Scopes are functions with a @ex.config decorator that turn all local variables into configuration entries, which makes setting up your configuration easy. Those entries can then be used in captured functions via dependency injection, so the system takes care of passing parameters around for you. The command-line interface can be used to change the parameters, which makes it easy to run your experiment with modified parameters. Observers log information about your experiment and the configuration you used, and save it, for example, to a database. This helps to keep track of all your experiments. Automatic seeding helps control the randomness in your experiments, so that they stay reproducible.
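A toy sketch of the config-scope and dependency-injection idea described for Sacred, using only the Python standard library. It is not Sacred's API or implementation: the `config` and `capture` decorators, the `CONFIG` dict, and the explicit `return locals()` are all assumptions made for illustration (Sacred itself collects the local variables without that return).

```python
import inspect
from functools import wraps

# Hypothetical global store for configuration entries (illustration only).
CONFIG = {}


def config(fn):
    """Run fn once and record the dict it returns as configuration entries."""
    CONFIG.update(fn())
    return fn


def capture(fn):
    """Fill in any parameter the caller leaves out from CONFIG."""
    sig = inspect.signature(fn)

    @wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind_partial(*args, **kwargs)
        for name in sig.parameters:
            if name not in bound.arguments and name in CONFIG:
                bound.arguments[name] = CONFIG[name]
        return fn(*bound.args, **bound.kwargs)

    return wrapper


@config
def my_config():
    learning_rate = 0.01
    epochs = 5
    return locals()  # toy stand-in for automatic local-variable capture


@capture
def train(learning_rate, epochs):
    return f"lr={learning_rate}, epochs={epochs}"


default_run = train()            # both values injected from CONFIG
override_run = train(epochs=10)  # explicit argument wins over CONFIG
```

Here `default_run` uses the configured values, while `override_run` shows that an explicitly passed argument takes precedence, which is roughly the role the command-line overrides play in the description above.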
Wingman
Wingman is an AI dating coach application that offers personalized dating advice to straight men. It provides services such as chatbot coaching, profile optimization, and conversation feedback to help users improve their dating game and increase their chances of finding meaningful connections. Wingman prioritizes user privacy by ensuring all interactions are fully anonymized, and it continuously updates its memory bank to provide tailored advice. The application is currently in beta phase and offers complimentary access to invited users, with plans to introduce a free trial version upon official launch.
Tastewise
Tastewise is an AI platform designed for food and beverage brands to upskill on AI technology. It offers solutions for product innovation, foodservice execution, and digital marketing. The platform provides real-time marketing insights, consumer intelligence, and automation for marketing execution. Tastewise also enables risk-free product innovation and testing, allowing brands to introduce innovative products and adapt consumer messaging effectively. The platform helps brands make data-driven decisions, identify trends before they hit mainstream, and streamline operations for growth and success.
Pirr
Pirr is an AI-powered application that offers a brand new way to create personalized romantic and spicy stories. Users can indulge their senses and spark their imagination by leveraging advanced AI technology to craft narratives that cater to their tastes and desires. The platform allows users to customize characters, storylines, and narrative atmosphere, providing a unique and engaging storytelling experience. Pirr is currently available as a mobile app with plans to introduce a web version in the future. The application is free to use, with the option for premium features to enhance the user experience.
Drooid Social
Drooid Social is a social interactive platform powered by AI technology that reads hundreds of news articles on any topic to provide users with short, accurate, and unique insights in a personalized feed. Users can stay informed with a complete picture, express their opinions, and connect with like-minded individuals. The platform is free to use and plans to introduce non-intrusive ads starting in May 2025. Drooid Social aims to revolutionize how users consume news and engage with content online.
SEOmatic
SEOmatic is an AI-powered tool designed to boost website traffic through programmatic SEO features. It helps users automate and scale web pages for SEO and PPC strategies on any CMS platform, leading to a significant increase in traffic, leads, and sales. With SEOmatic, users can create personalized, data-driven content marketing at scale without the need for coding skills. The tool offers friendly pricing, a 7-day free trial, and the flexibility to cancel anytime, making it a valuable asset for marketing teams looking to drive high-quality, targeted leads to their websites.
Wiz Attendant
Wiz Attendant is a virtual assistant that helps businesses automate their customer service and support operations. It uses artificial intelligence (AI) to understand customer queries and provide relevant answers, 24/7. Wiz Attendant can be integrated with a business's website, messaging apps, and social media channels, making it easy for customers to get help whenever and wherever they need it.
AI Slide Maker
AI Slide Maker is an innovative tool that utilizes artificial intelligence to create visually appealing and professional presentations in a matter of minutes. With its advanced algorithms, the application can analyze content, suggest relevant design templates, and generate slides automatically. Users can simply input their text and images, and the AI Slide Maker will take care of the rest, saving time and effort. Whether for business meetings, educational purposes, or personal projects, this tool streamlines the presentation creation process and ensures high-quality results.
BharatGPT
BharatGPT is a conversational AI platform designed for the Indian market. It offers generative text, voice, and video capabilities, supporting over 12 Indian languages. The platform focuses on fostering domestic AI development and ensuring data localization in India. BharatGPT is optimized for Indian users, providing features like custom knowledge base integration, omni-channel support, and dialogue management.
Kidgeni
Kidgeni is an AI tool designed for kids to unleash their creativity by turning inspirations into art, stories, and more. It offers a platform where children can create unique images, transform their drawings into art pieces, craft stories, and write personalized books. With Kidgeni, kids can explore unlimited creativity through various features and plans that cater to their artistic needs.
DealPage
DealPage is an AI Sales Engineer platform that introduces Paige, the first AI Sales Engineer. Paige assists sales engineers in onboarding, automating administrative tasks, providing technical assistance, and improving efficiency. The platform offers features such as automating RFP responses, generating personalized proposals, answering security questionnaires, and curating a knowledge base. DealPage aims to streamline sales processes, enhance productivity, and provide valuable insights into buyer behavior.
Open Agent Studio
Open Agent Studio is a powerful no-code agent editor that introduces new automation concepts like Semantic Targets and Semantic Triggers in simple language, enabling the creation of future-proof agents that are robust to design changes. It is designed to target markets untouched by AI, offering subscribers a free 4-week course to launch custom agents with enterprise-grade white label. The tool includes an Agent Recorder for easy building of agents by recording keyboard and mouse actions, scraping data, and detecting the start node. Open Agent Studio is powered by Cheat Layer, a platform that leverages GPT-3 for automation and aims to democratize access to AI for rebuilding businesses online.
Playground AI
Playground AI is a free-to-use online AI image creator that allows users to create and edit images like a professional without requiring advanced skills. The platform introduces Mixed Image Editing, enabling the combination of real and synthetic images to produce stunning works of art and photorealistic images limited only by the user's imagination. Users can edit images as they imagine, step outside the box, grow images beyond their edges, erase unnecessary elements, and fit objects into any scene. Playground AI fosters a creative community where users can share their creations, collaborate with others, and bring their ideas to life. With a user-friendly interface and powerful AI capabilities, Playground AI empowers users to unleash their creativity and design graphics effortlessly.
20 - Open Source AI Tools
rllm
rLLM (relationLLM) is a PyTorch library for Relational Table Learning (RTL) with LLMs. It breaks down state-of-the-art GNNs, LLMs, and TNNs into standardized modules and facilitates novel model building in a 'combine, align, and co-train' way using these modules. The library is LLM-friendly, processes various graphs as multiple tables linked by foreign keys, introduces new relational table datasets, and is supported by students and teachers from Shanghai Jiao Tong University and Tsinghua University.
llm-datasets
LLM Datasets is a repository containing high-quality datasets, tools, and concepts for LLM fine-tuning. It provides datasets with characteristics like accuracy, diversity, and complexity to train large language models for various tasks. The repository includes datasets for general-purpose, math & logic, code, conversation & role-play, and agent & function calling domains. It also offers guidance on creating high-quality datasets through data deduplication, data quality assessment, data exploration, and data generation techniques.
ai-audio-datasets
AI Audio Datasets List (AI-ADL) is a comprehensive collection of datasets consisting of speech, music, and sound effects, used for Generative AI, AIGC, AI model training, and audio applications. It includes datasets for speech recognition, speech synthesis, music information retrieval, music generation, audio processing, sound synthesis, and more. The repository provides a curated list of diverse datasets suitable for various AI audio tasks.
lightning-bolts
The Bolts package provides a variety of components to extend PyTorch Lightning, such as callbacks & datasets, for applied research and production. Users can accelerate Lightning training with the Torch ORT Callback, which optimizes the ONNX graph for faster training & inference. Additionally, users can introduce sparsity with the SparseMLCallback to accelerate inference by leveraging the DeepSparse engine. Specific research implementations are encouraged, with contributions that help train SSL models and integrate with Lightning Flash for state-of-the-art models in applied research.
aistore
AIStore is a lightweight object storage system designed for AI applications. It is highly scalable, reliable, and easy to use. AIStore can be deployed on any commodity hardware, and it can be used to store and manage large datasets for deep learning and other AI applications.
CSGHub
CSGHub is an open source, trustworthy large model asset management platform that helps users govern the assets involved in the lifecycle of LLMs and LLM applications (datasets, model files, code, etc.). With CSGHub, users can perform operations on LLM assets, including uploading, downloading, storing, verifying, and distributing, through a web interface, the Git command line, or a natural language chatbot. The platform also provides microservice submodules and standardized OpenAPIs, which can be easily integrated with users' own systems. CSGHub is committed to bringing users an asset management platform that is natively designed for large models and can be deployed on-premise for fully offline operation. CSGHub offers functionality similar to a private, on-premise Hugging Face, managing LLM assets in a manner akin to how OpenStack Glance manages virtual machine images, Harbor manages container images, and Sonatype Nexus manages artifacts.
GPT4Point
GPT4Point is a unified framework for point-language understanding and generation. It aligns 3D point clouds with language, providing a comprehensive solution for tasks such as 3D captioning and controlled 3D generation. The project includes an automated point-language dataset annotation engine, a novel object-level point cloud benchmark, and a 3D multi-modality model. Users can train and evaluate models using the provided code and datasets, with a focus on improving models' understanding capabilities and facilitating the generation of 3D objects.
InstructGraph
InstructGraph is a framework designed to enhance large language models (LLMs) for graph-centric tasks by utilizing graph instruction tuning and preference alignment. The tool collects and decomposes 29 standard graph datasets into four groups, enabling LLMs to better understand and generate graph data. It introduces a structured format verbalizer to transform graph data into a code-like format, facilitating code understanding and generation. Additionally, it addresses hallucination problems in graph reasoning and generation through direct preference optimization (DPO). The tool aims to bridge the gap between textual LLMs and graph data, offering a comprehensive solution for graph-related tasks.
MMC
This repository, MMC, focuses on advancing multimodal chart understanding through large-scale instruction tuning. It introduces a dataset supporting various tasks and chart types, a benchmark for evaluating reasoning capabilities over charts, and an assistant achieving state-of-the-art performance on chart QA benchmarks. The repository provides data for chart-text alignment, benchmarking, and instruction tuning, along with existing datasets used in experiments. Additionally, it offers a Gradio demo for the MMCA model.
aiverify
AI Verify is an AI governance testing framework and software toolkit that validates the performance of AI systems against a set of internationally recognised principles through standardised tests. AI Verify is consistent with international AI governance frameworks such as those from the European Union, the OECD and Singapore. It is a single integrated toolkit that operates within an enterprise environment. It can perform technical tests on common supervised learning classification and regression models for most tabular and image datasets. However, it does not define AI ethical standards, and it does not guarantee that any AI system tested will be free from risks or biases or be completely safe.
MathCoder
MathCoder is a repository focused on enhancing mathematical reasoning by fine-tuning open-source language models to use code for modeling and deriving math equations. It introduces MathCodeInstruct dataset with solutions interleaving natural language, code, and execution results. The repository provides MathCoder models capable of generating code-based solutions for challenging math problems, achieving state-of-the-art scores on MATH and GSM8K datasets. It offers tools for model deployment, inference, and evaluation, along with a citation for referencing the work.
aiverify
AI Verify is an AI governance testing framework and software toolkit that validates the performance of AI systems against internationally recognised principles through standardised tests. It offers a new API Connector feature to bypass size limitations, test various AI frameworks, and configure connection settings for batch requests. The toolkit operates within an enterprise environment, conducting technical tests on common supervised learning models for tabular and image datasets. It does not define AI ethical standards or guarantee complete safety from risks or biases.
LLM-LieDetector
This repository contains code for reproducing experiments on lie detection in black-box LLMs by asking unrelated questions. It includes Q/A datasets, prompts, and fine-tuning datasets for generating lies with language models. The lie detectors rely on asking binary 'elicitation questions' to diagnose whether the model has lied. The code covers generating lies from language models, training and testing lie detectors, and generalization experiments. It requires access to GPUs and OpenAI API calls for running experiments with open-source models. Results are stored in the repository for reproducibility.
agentic_security
Agentic Security is an open-source vulnerability scanner designed for safety scanning, offering customizable rule sets and agent-based attacks. It provides comprehensive fuzzing for any LLM, LLM API integration, and stress testing with a wide range of fuzzing and attack techniques. The tool is not a foolproof solution but aims to enhance security measures against potential threats. It can be installed via pip and supports quick start commands for easy setup. Users can utilize the tool for LLM integration, adding custom datasets, running CI checks, extending dataset collections, and building dynamic datasets with mutations. The tool also includes a probe endpoint for integration testing. The roadmap includes expanding dataset variety, introducing new attack vectors, developing an attacker LLM, and integrating OWASP Top 10 classification.
LAMBDA
LAMBDA is a code-free multi-agent data analysis system that utilizes large models to address data analysis challenges in complex data-driven applications. It allows users to perform complex data analysis tasks through human language instruction, seamlessly generate and debug code using two key agent roles, integrate external models and algorithms, and automatically generate reports. The system has demonstrated strong performance on various machine learning datasets, enhancing data science practice by integrating human and artificial intelligence.
opencompass
OpenCompass is a one-stop platform for large model evaluation, aiming to provide a fair, open, and reproducible benchmark. Its main features include: (1) Comprehensive support for models and datasets: built-in support for 20+ HuggingFace and API models, and a model evaluation scheme of 70+ datasets with about 400,000 questions, comprehensively evaluating the capabilities of the models in five dimensions. (2) Efficient distributed evaluation: a one-line command implements task division and distributed evaluation, completing the full evaluation of billion-scale models in just a few hours. (3) Diversified evaluation paradigms: support for zero-shot, few-shot, and chain-of-thought evaluations, combined with standard or dialogue-type prompt templates, to elicit the maximum performance of various models. (4) Modular design with high extensibility: new models or datasets, a custom task division strategy, or even a new cluster management system can all be added easily. (5) Experiment management and reporting mechanism: config files fully record each experiment, and results can be reported in real time.
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, evaluation and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm Python package to help you assess the trustworthiness of your LLM more quickly. For more details about TrustLLM, please refer to the project website.
Qwen
Qwen is a series of large language models developed by Alibaba DAMO Academy. Qwen models outperform the baseline models of similar model sizes on a series of benchmark datasets, e.g., MMLU, C-Eval, GSM8K, MATH, HumanEval, MBPP, BBH, etc., which evaluate the models’ capabilities in natural language understanding, mathematical problem solving, coding, etc. Qwen-72B achieves better performance than LLaMA2-70B on all tasks and outperforms GPT-3.5 on 7 out of 10 tasks.
Awesome-Segment-Anything
Awesome-Segment-Anything is a powerful tool for segmenting and extracting information from various types of data. It provides a user-friendly interface to easily define segmentation rules and apply them to text, images, and other data formats. The tool supports both supervised and unsupervised segmentation methods, allowing users to customize the segmentation process based on their specific needs. With its versatile functionality and intuitive design, Awesome-Segment-Anything is ideal for data analysts, researchers, content creators, and anyone looking to efficiently extract valuable insights from complex datasets.
HuggingFists
HuggingFists is a low-code data flow tool that enables convenient use of LLM and HuggingFace models. It provides functionalities similar to Langchain, allowing users to design, debug, and manage data processing workflows, create and schedule workflow jobs, manage resources environment, and handle various data artifact resources. The tool also offers account management for users, allowing centralized management of data source accounts and API accounts. Users can access Hugging Face models through the Inference API or locally deployed models, as well as datasets on Hugging Face. HuggingFists supports breakpoint debugging, branch selection, function calls, workflow variables, and more to assist users in developing complex data processing workflows.
3 - OpenAI Gpts
WVA
Web Vulnerability Academy (WVA) is an interactive tutor designed to introduce users to web vulnerabilities while also providing them with opportunities to assess and enhance their knowledge through testing.
AI EDU Phonologie Principe Alphabétique Cycle 1
Teaching assistant for developing phonological awareness and the alphabetic principle.
Venture Capital CoFounder Pal
Complete a detailed VC Readiness assessment with the help of CoFounder Pal. Receive a report at the end, get introduced to our global VC network and receive dynamic mentorship for optimized growth.