Best AI tools for Add Custom Datasets
20 - AI tool Sites
Tabula
Tabula is a visual data analytics tool that uses AI to help businesses get insights from their data. It is easy to use and can be used by anyone, regardless of their technical expertise. Tabula can be used to access and unify data from a variety of sources, standardize and blend datasets, add custom metrics, build stunning reports, and automate repetitive tasks. Tabula is integrated with a variety of data sources and platforms, making it easy to get started.
WP Dev AI
WP Dev AI is an AI-powered tool that allows users to build custom features for their WordPress websites without having to code. With WP Dev AI, users can simply describe the feature they want to create in plain English, and the tool will generate the necessary code. WP Dev AI also provides step-by-step instructions on how to implement the code, making it easy for even non-technical users to add custom features to their websites.
Love Tunes
Love Tunes is an AI-powered platform that allows users to create custom AI-generated love songs for any special occasion or just for fun. Users can personalize songs with memories, choose genres, and even add custom lyrics. With Love Tunes, users can easily create high-quality, personalized songs tailored to their preferences in just a few clicks.
AI Email Name Generator
The AI Email Name Generator is a tool that leverages artificial intelligence to create unique email addresses based on first and last names, with the option to add custom or random suffixes. It ensures a high level of uniqueness and relevance by analyzing and combining inputs using AI algorithms. Users can generate both personal and professional email addresses quickly and easily, suitable for various communication purposes.
Slick
Slick is an AI-powered video editing tool that helps you create and edit viral short videos. With Slick, you can add trendy captions, cut silences and filler words such as "umm" in one click, snap in b-roll, use magic zooms, and add custom background music and sound effects. You can also trim and extend clips to adjust their duration. Slick supports all aspect ratios and resolutions up to 4K. It is available in over 30 languages, including English, French, Spanish, German, and Hindi. New caption styles are added every week, and all captions are fully customizable. Slick's AI handles these edits automatically.
Hentai Generator
Hentai Generator is a website that allows users to create AI-generated hentai images. Users can generate unique and high-quality hentai characters for free. The website also provides a variety of features, such as the ability to share generations on Twitter and the ability to add custom tags. Hentai Generator is a powerful tool that can be used to create a wide variety of hentai images.
Text2SQL.AI
Text2SQL.AI is an AI-powered SQL query builder that helps users generate optimized SQL queries effortlessly. It supports various AI-powered services, including SQL query building from textual instructions, SQL query explanation to plain English, SQL query error fixation, adding custom database schemas, SQL dialects for various database types, Microsoft Excel and Google Sheets formula generation and explanation, and Regex expression generation and explanation. The tool is designed to improve SQL skills, save time, and assist beginners, data analysts, data scientists, data engineers, and software developers in their work.
Globify
Globify is an AI application designed to streamline the app localization process for iOS developers. It offers a user-friendly platform that allows developers to easily manage target languages, edit individual localizations, work on multiple projects, add custom tones and styles, create glossaries, and sync string catalog files. With the help of GPT-4 technology, Globify enables auto localization with just a single click, making the entire localization process smooth and efficient. The application aims to improve the global reach of iOS apps by providing a seamless localization experience.
Muzaic
Muzaic is a generative AI Soundtrack-as-a-Service. It lets you automatically add custom soundtracks to your videos, presentations, or even games. Muzaic works with the parameters that describe music: intensity, tempo, rhythm, tone, and variation. It can hold these parameters at preset levels or change them over time on command, while still producing high-quality music.
Unicorn Platform
Unicorn Platform is an AI-powered website builder that helps users create websites quickly and easily, without the need for design or development skills. It offers a variety of features, including pre-built templates, drag-and-drop functionality, and the ability to add custom code. Unicorn Platform is suitable for a variety of website types, including SaaS, apps, directories, blogs, and personal pages.
My Cheeky Bot
My Cheeky Bot is an AI tool that allows users to create advanced AI bots in minutes to add custom lead gen chat assistants to their business websites. It offers a solution for effortless customer engagement by providing personalized customer service assistants. The tool aims to help small businesses and freelance developers manage customer queries and provide instant assistance without the need for any coding skills. With innovative chatbot technology, My Cheeky Bot enables users to enhance their website's customer engagement experience and stay connected with their audience in today's fast-paced digital landscape.
jsonAI
jsonAI is an AI tool that allows users to easily transform data into structured JSON format. Users can define their schema, add custom prompts, and receive AI-structured JSON responses. The tool enables users to create complex schemas with nested objects, control the response JSON on the fly, and test their JSON data in real-time. jsonAI offers a free trial plan, seamless integration with existing apps, and ensures data security by not storing user data on their servers.
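As a rough illustration of what a nested schema constrains, the sketch below checks a JSON response against a schema with one nested object. It does not use jsonAI's API, which is not shown here; the `SCHEMA` layout, the `conforms` helper, and the sample response are all hypothetical.

```python
import json
from typing import Any

# Hypothetical schema for illustration: each key maps to a Python type,
# or to a nested dict describing a nested object.
SCHEMA = {
    "name": str,
    "price": float,
    "tags": list,
    "seller": {"id": int, "verified": bool},
}


def conforms(data: Any, schema: Any) -> bool:
    """Return True if data has exactly the keys and value types in schema."""
    if isinstance(schema, dict):
        return (
            isinstance(data, dict)
            and data.keys() == schema.keys()
            and all(conforms(data[key], sub) for key, sub in schema.items())
        )
    # Exact type match, so a JSON boolean is not accepted where int is expected.
    return type(data) is schema


response = '{"name": "Desk lamp", "price": 19.99, "tags": ["home"], "seller": {"id": 7, "verified": true}}'
print(conforms(json.loads(response), SCHEMA))  # True
```

A production setup would more likely rely on an established schema language such as JSON Schema; the hand-rolled check only shows the idea.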
Loom
Loom is a free screen recorder for Mac and PC that allows users to easily record and share AI-powered video messages with their teammates and customers. With Loom, users can quickly record their screen and camera, and then share their videos anywhere they work, including Google Workspace, Slack, and more. Loom also offers a variety of features to help users edit and personalize their videos, including the ability to trim and stitch video clips, add custom logos and thumbnails, and add tasks, CTAs, comments, and emojis. Loom is used by over 25 million people across 400,000 companies, and is a valuable tool for sales, engineering, customer support, design, and more.
REOK AI Headshot Studio
REOK AI Headshot Studio is an AI-powered tool that allows you to create professional headshots and portraits with just a few clicks. With its advanced algorithms, REOK AI Headshot Studio can automatically remove the background from your photos, adjust the lighting, and enhance your features to create a polished and professional look. You can also use REOK AI Headshot Studio to add custom backgrounds, text, and effects to your photos.
VidGenesis
VidGenesis is an AI-powered video generator that allows users to create engaging videos in minutes. With its user-friendly interface and powerful AI technology, VidGenesis makes it easy for anyone to create high-quality videos for a variety of purposes, including marketing, education, and entertainment. Some of the key features of VidGenesis include the ability to choose from a variety of video templates, add custom text and images, and select from a range of AI-generated voices. VidGenesis also offers a variety of advanced features, such as the ability to add custom branding and download videos in HD quality.
Magika
Magika is a universal platform for creating AI-powered content. It offers a wide range of tools for generating unique text, images, code, chatbots, and more. Magika's advanced monitoring dashboard provides valuable user insights, analytics, and activity tracking. It supports multiple languages and allows users to add custom prompts. Support tickets can also be accessed and managed from the dashboard.
Insyte
Insyte is an AI-powered website builder that allows users to create landing pages in seconds. It is designed to be easy to use and intuitive, so you can focus on what matters most: your business. With Insyte, you can create a website for any purpose, from a simple landing page to a full-fledged online store. Insyte offers a variety of features to help you create a website that is both visually appealing and engaging. You can choose from a variety of templates, add your own content, and customize the look and feel of your site. Insyte also offers a number of advanced features, such as the ability to download the source code of your website and add custom domains. Insyte is a powerful tool that can help you create a website that will help you grow your business.
Prodvana
Prodvana is an intelligent deployment platform that helps businesses automate and streamline their software deployment process. It provides a variety of features to help businesses improve the speed, reliability, and security of their deployments. Prodvana is a cloud-based platform that can be used with any type of infrastructure, including on-premises, hybrid, and multi-cloud environments. It is also compatible with a wide range of DevOps tools and technologies. Prodvana's key features include:
Intent-based deployments: Prodvana uses intent-based deployment technology to automate the deployment process. This means that businesses can simply specify their deployment goals, and Prodvana will automatically generate and execute the necessary steps to achieve those goals. This can save businesses a significant amount of time and effort.
Guardrails for deployments: Prodvana provides a variety of guardrails to help businesses ensure the security and reliability of their deployments. These guardrails include approvals, database validations, automatic deployment validation, and simple interfaces to add custom guardrails. This helps businesses to prevent errors and reduce the risk of outages.
Frictionless DevEx: Prodvana provides a frictionless developer experience by tracking commits through the infrastructure, ensuring complete visibility beyond just Docker images. This helps developers to quickly identify and resolve issues, and it also makes it easier to collaborate with other team members.
Intelligence with Clairvoyance: Prodvana's Clairvoyance feature provides businesses with insights into the impact of their deployments before they are executed. This helps businesses to make more informed decisions about their deployments and to avoid potential problems.
Easy integrations: Prodvana integrates seamlessly with a variety of DevOps tools and technologies. This makes it easy for businesses to use Prodvana with their existing workflows and processes.
Meya
Meya is a chatbot platform that allows users to build and launch custom chatbots. It provides a variety of features, including a visual flow editor, a code editor, and a variety of integrations. Meya is designed to be easy to use, even for non-technical users. It is also highly extensible, allowing users to add their own custom code and integrations.
CVAT
CVAT is an open-source data annotation platform that helps teams of any size annotate data for machine learning. It is used by companies big and small in a variety of industries, including healthcare, retail, and automotive. CVAT is known for its intuitive user interface, advanced features, and support for a wide range of data formats. It is also highly extensible, allowing users to add their own custom features and integrations.
20 - Open Source AI Tools
agentic_security
Agentic Security is an open-source vulnerability scanner designed for safety scanning, offering customizable rule sets and agent-based attacks. It provides comprehensive fuzzing for any LLM, LLM API integration, and stress testing with a wide range of fuzzing and attack techniques. The tool is not a foolproof solution but aims to enhance security measures against potential threats. It can be installed via pip and supports quick start commands for easy setup. Users can utilize the tool for LLM integration, adding custom datasets, running CI checks, extending dataset collections, and working with dynamic datasets with mutations. The tool also includes a probe endpoint for integration testing. The roadmap includes expanding dataset variety, introducing new attack vectors, developing an attacker LLM, and integrating OWASP Top 10 classification.
prometheus-eval
Prometheus-Eval is a repository dedicated to evaluating large language models (LLMs) in generation tasks. It provides state-of-the-art language models like Prometheus 2 (7B & 8x7B) for assessing responses in pairwise ranking formats and achieving high correlation scores with benchmarks. The repository includes tools for training, evaluating, and using these models, along with scripts for fine-tuning on custom datasets. Prometheus aims to address issues like fairness, controllability, and affordability in evaluations by simulating human judgments and proprietary LM-based assessments.
datadreamer
DataDreamer is an advanced toolkit designed to facilitate the development of edge AI models by enabling synthetic data generation, knowledge extraction from pre-trained models, and creation of efficient and potent models. It eliminates the need for extensive datasets by generating synthetic datasets, leverages latent knowledge from pre-trained models, and focuses on creating compact models that can be integrated into any device and perform well on specialized tasks. The toolkit offers features like prompt generation, image generation, dataset annotation, and tools for training small-scale neural networks for edge deployment. Its documentation covers hardware requirements, usage instructions, available models, and limitations to consider while using the library.
Auto-Data
Auto Data is a library designed for the automatic generation of realistic datasets, essential for the fine-tuning of Large Language Models (LLMs). This highly efficient and lightweight library enables the swift and effortless creation of comprehensive datasets across various topics, regardless of their size. It addresses challenges encountered during model fine-tuning due to data scarcity and imbalance, ensuring models are trained with sufficient examples.
deeplake
Deep Lake is a Database for AI powered by a storage format optimized for deep-learning applications. Deep Lake can be used for:
1. Storing data and vectors while building LLM applications
2. Managing datasets while training deep learning models
Deep Lake simplifies the deployment of enterprise-grade LLM-based products by offering storage for all data types (embeddings, audio, text, videos, images, pdfs, annotations, etc.), querying and vector search, data streaming while training models at scale, data versioning and lineage, and integrations with popular tools such as LangChain, LlamaIndex, Weights & Biases, and many more. Deep Lake works with data of any size, is serverless, and enables you to store all of your data in your own cloud and in one place. Deep Lake is used by Intel, Bayer Radiology, Matterport, ZERO Systems, Red Cross, Yale, & Oxford.
Qwen
Qwen is a series of large language models developed by Alibaba DAMO Academy. Qwen models outperform the baseline models of similar model sizes on a series of benchmark datasets, e.g., MMLU, C-Eval, GSM8K, MATH, HumanEval, MBPP, BBH, etc., which evaluate the models’ capabilities on natural language understanding, mathematical problem solving, coding, etc. Qwen-72B achieves better performance than LLaMA2-70B on all tasks and outperforms GPT-3.5 on 7 out of 10 tasks.
evidently
Evidently is an open-source Python library designed for evaluating, testing, and monitoring machine learning (ML) and large language model (LLM) powered systems. It offers a wide range of functionalities, including working with tabular, text data, and embeddings, supporting predictive and generative systems, providing over 100 built-in metrics for data drift detection and LLM evaluation, allowing for custom metrics and tests, enabling both offline evaluations and live monitoring, and offering an open architecture for easy data export and integration with existing tools. Users can utilize Evidently for one-off evaluations using Reports or Test Suites in Python, or opt for real-time monitoring through the Dashboard service.
ollama-ebook-summary
The 'ollama-ebook-summary' repository is a Python project that creates bulleted notes summaries of books and long texts, particularly in epub and pdf formats with ToC metadata. It automates the extraction of chapters, splits them into ~2000 token chunks, and allows for asking arbitrary questions to parts of the text for improved granularity of response. The tool aims to provide summaries for each page of a book rather than a one-page summary of the entire document, enhancing content curation and knowledge sharing capabilities.
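The chunking step described above can be sketched as follows. This is a minimal illustration, not the repository's code: the `chunk_by_tokens` name is hypothetical, and a "token" is approximated as a whitespace-separated word rather than the output of a real tokenizer.

```python
from typing import List


def chunk_by_tokens(text: str, max_tokens: int = 2000) -> List[str]:
    """Split text into consecutive chunks of at most max_tokens tokens.

    Assumption: a token is a whitespace-separated word, a rough stand-in
    for a model tokenizer.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


chapter = "word " * 4500  # hypothetical chapter text of 4500 words
chunks = chunk_by_tokens(chapter, max_tokens=2000)
print([len(chunk.split()) for chunk in chunks])  # [2000, 2000, 500]
```

Each chunk could then be summarized or queried on its own, which is what gives the finer granularity of response.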
lighteval
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron. We're releasing it with the community in the spirit of building in the open. Note that it is still very early, so don't expect 100% stability ^^' In case of problems or questions, feel free to open an issue!
llmware
LLMWare is a framework for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and connecting enterprise knowledge safely and securely.
HippoRAG
HippoRAG is a novel retrieval augmented generation (RAG) framework inspired by the neurobiology of human long-term memory that enables Large Language Models (LLMs) to continuously integrate knowledge across external documents. It provides RAG systems with capabilities that usually require a costly and high-latency iterative LLM pipeline for only a fraction of the computational cost. The tool facilitates setting up retrieval corpus, indexing, and retrieval processes for LLMs, offering flexibility in choosing different online LLM APIs or offline LLM deployments through LangChain integration. Users can run retrieval on pre-defined queries or integrate directly with the HippoRAG API. The tool also supports reproducibility of experiments and provides data, baselines, and hyperparameter tuning scripts for research purposes.
aides-jeunes
The user interface (and main server) of a simulator of financial aid and social benefits for young people. It is based on the free socio-fiscal simulator OpenFisca.
llm-finetuning
llm-finetuning is a repository that provides a serverless twist to the popular axolotl fine-tuning library using Modal's serverless infrastructure. It allows users to quickly fine-tune any LLM model with state-of-the-art optimizations like Deepspeed ZeRO, LoRA adapters, Flash attention, and Gradient checkpointing. The repository simplifies the fine-tuning process by not exposing all CLI arguments, instead allowing users to specify options in a config file. It supports efficient training and scaling across multiple GPUs, making it suitable for production-ready fine-tuning jobs.
DataFrame
DataFrame is a C++ analytical library designed for data analysis similar to libraries in Python and R. It allows you to slice, join, merge, group-by, and run various statistical, summarization, financial, and ML algorithms on your data. DataFrame also includes a large collection of analytical algorithms in the form of visitors, ranging from basic stats to more involved analysis. You can easily add your own algorithms as well. DataFrame employs extensive multithreading in almost all its APIs, making it suitable for analyzing large datasets. Key principles followed in the library include supporting any type without needing new code, avoiding pointer chasing, having all column data in contiguous memory space, minimizing space usage, avoiding data copying, using multi-threading judiciously, and not protecting the user against garbage in, garbage out.
swift
SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) supports training, inference, evaluation and deployment of nearly **200 LLMs and MLLMs** (multimodal large models). Developers can directly apply our framework to their own research and production environments to realize the complete workflow from model training and evaluation to application. In addition to supporting the lightweight training solutions provided by [PEFT](https://github.com/huggingface/peft), we also provide a complete **Adapters library** to support the latest training techniques such as NEFTune, LoRA+, LLaMA-PRO, etc. This adapter library can be used directly in your own custom workflow without our training scripts. To facilitate use by users unfamiliar with deep learning, we provide a Gradio web-ui for controlling training and inference, as well as accompanying deep learning courses and best practices for beginners. Additionally, we are expanding capabilities for other modalities. Currently, we support full-parameter training and LoRA training for AnimateDiff.
LLaMA-Factory
LLaMA Factory is a unified framework for fine-tuning 100+ large language models (LLMs) with various methods, including pre-training, supervised fine-tuning, reward modeling, PPO, DPO and ORPO. It features integrated algorithms like GaLore, BAdam, DoRA, LongLoRA, LLaMA Pro, LoRA+, LoftQ and Agent tuning, as well as practical tricks like FlashAttention-2, Unsloth, RoPE scaling, NEFTune and rsLoRA. LLaMA Factory provides experiment monitors like LlamaBoard, TensorBoard, Wandb, MLflow, etc., and supports faster inference with OpenAI-style API, Gradio UI and CLI with vLLM worker. Compared to ChatGLM's P-Tuning, LLaMA Factory's LoRA tuning offers up to 3.7 times faster training speed with a better Rouge score on the advertising text generation task. By leveraging 4-bit quantization technique, LLaMA Factory's QLoRA further improves the efficiency regarding the GPU memory.
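The memory effect of 4-bit quantization mentioned above can be illustrated with simple arithmetic. The sketch below counts weight storage only and assumes a hypothetical 7B-parameter model; it ignores quantization constants, LoRA adapter weights, activations, and optimizer state, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


params = 7e9  # hypothetical 7B-parameter model
fp16 = weight_memory_gib(params, 16)
int4 = weight_memory_gib(params, 4)
print(f"16-bit: {fp16:.1f} GiB, 4-bit: {int4:.1f} GiB")
# 16-bit: 13.0 GiB, 4-bit: 3.3 GiB
```

Under these assumptions the weights shrink by a factor of four, which is where most of the GPU memory saving comes from.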
MARS5-TTS
MARS5 is a novel English speech model (TTS) developed by CAMB.AI, featuring a two-stage AR-NAR pipeline with a unique NAR component. The model can generate speech for various scenarios like sports commentary and anime with just 5 seconds of audio and a text snippet. It allows steering prosody using punctuation and capitalization in the transcript. Speaker identity is specified using an audio reference file, enabling 'deep clone' for improved quality. The model can be used via torch.hub or HuggingFace, supporting both shallow and deep cloning for inference. Checkpoints are provided for AR and NAR models, with hardware requirements of 750M+450M params on GPU. Contributions to improve model stability, performance, and reference audio selection are welcome.
20 - OpenAI Gpts
WP coding assistant
Friendly WordPress expert that will help you write custom plugins, functions, add custom fields and enhance your WordPress website.
Webflow JS Wizard
A JavaScript developer for Webflow, specializing in jQuery and debugging guidance.
Ask Cris about File Maker
An experiment in personal FileMaker guidance from the collective works of lifetime award-winning FileMaker trainer, Cris Ippolite. Not just links to resources, but direct access to 20+ years of custom training curriculum combined with expert AI instruction without the noise of external web links.
Invoicing Assistant
Efficiently crafts custom invoices & receipts with a friendly Aussie touch.
Customized Cartoon Beer Cans
Create cartoon-style label designs on beer cans using an image and prompt provided by the user.
AIProductGPT: Add AI to your Product and get a PRD
With simple prompts, AIProductGPT instantly crafts detailed AI-powered requirements (PRD) and mocks so that your team can hit the ground running.
GroceriesGPT
I manage your grocery lists to help you stay organized. 1/ Tell me what to add to a list. 2/ Ask me to add all ingredients for a recipe. 3/ Upload a receipt to remove items from your lists. 4/ Add an item by simply uploading a picture. 5/ Ask me what items I would recommend you add to your lists.
SpintaxGPT
I add spintax to emails for Instantly.ai. For more cold email tips, follow me on Twitter/𝕏 at @kenautoup
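For readers unfamiliar with spintax, it marks interchangeable phrases as `{option A|option B}` so that each sent email can differ slightly. The sketch below shows one way such a template could be resolved; the `spin` function and the sample template are hypothetical and are not taken from SpintaxGPT or Instantly.ai.

```python
import random
import re

# Matches an innermost {a|b|c} group (one with no braces inside it).
_GROUP = re.compile(r"\{([^{}]*)\}")


def spin(text: str, rng: random.Random) -> str:
    """Resolve spintax by replacing each {a|b|c} group with one option."""
    while True:
        # Innermost groups are resolved first, so nested groups unwind
        # one level per pass.
        text, count = _GROUP.subn(
            lambda match: rng.choice(match.group(1).split("|")), text
        )
        if count == 0:
            return text


template = "{Hi|Hello} Sam, {quick question|one thought} about your site."
print(spin(template, random.Random(0)))
```

Passing in a seeded `random.Random` keeps a given variant reproducible, which helps when previewing emails before sending.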