Best AI tools for <create custom data pipelines>
20 - AI tool Sites
Isomeric
Isomeric is a tool that uses artificial intelligence to transform unstructured text into structured JSON data. It can be used for a variety of purposes, including web scraping, browser extensions, customer support, data platforms, and legal document processing. Isomeric works by first analyzing the unstructured text to identify the key data points. It then uses a machine learning model to extract the data and convert it into JSON format. The resulting JSON data can then be used for a variety of purposes, such as data analysis, reporting, and machine learning.
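The text-to-JSON flow described above can be illustrated with a minimal sketch. This is not Isomeric's API: it swaps the machine learning model for a few hand-written regular expressions, and the field names (`email`, `price`, `date`) and the `extract_fields` helper are hypothetical, chosen only to show the shape of the input and output.

```python
import json
import re

# Hypothetical patterns for illustration only; a real extractor such as the
# one described above would rely on a trained model, not fixed regexes.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "price": re.compile(r"\$\d+(?:\.\d{2})?"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def extract_fields(text: str) -> dict:
    """Return the first match for each pattern, or None when it is absent."""
    record = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        record[name] = match.group(0) if match else None
    return record


if __name__ == "__main__":
    sample = "Order placed 2024-03-05 by jane@example.com, total $19.99."
    # Serialize the structured record as JSON for downstream use.
    print(json.dumps(extract_fields(sample), indent=2))
```

Returning `None` for missing fields keeps every record on the same schema, which tends to simplify later analysis or reporting.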
FastBots.ai
FastBots.ai is an AI chatbot builder that allows users to create custom chatbots trained on their own data. These chatbots can be integrated into websites to provide customer support, sales assistance, and other services. FastBots.ai is easy to use and requires no coding. It supports a wide range of content types, including text, PDFs, and YouTube videos. FastBots.ai also offers a variety of features, such as customization options, chat history storage, and Zapier integration.
Quickchat AI
Quickchat AI is a custom AI assistant designed to automate customer support, lead generation, and more. It allows users to design, tweak, and deploy their own AI assistant trained on their data. Quickchat AI offers a range of features including customizable AI assistant name, conversation style, knowledge and actions, and deployment options. It also provides integrations with popular tools and systems, making it easy to use AI in everyday workflows.
ChatFast
ChatFast is a platform that allows businesses to create custom GPT chatbots using their own data. These chatbots can be used to answer customer questions, capture leads, and schedule appointments. ChatFast is easy to use and requires no coding. It is trusted by thousands of businesses and provides a range of powerful features, including the ability to train chatbots on multiple data sources, revise responses, capture leads, and create smart forms.
EmbedAI
EmbedAI is a platform that allows users to create custom AI chatbots powered by ChatGPT and trained on their own data. These chatbots can be embedded on websites and used to answer customer questions, provide support, or generate leads. EmbedAI is designed to be easy to use, even for those with no coding experience. It offers a variety of features to help users create and customize their chatbots, including a user-friendly interface, pre-built templates, and advanced training options.
EmbedAI
EmbedAI is a platform that allows users to create custom AI chatbots powered by ChatGPT and trained on their own data. It offers a range of features such as the ability to customize the look and feel of the chatbot, share it with others, and integrate it with other tools and applications. EmbedAI is designed to help businesses and individuals automate customer service, provide personalized support, and enhance user engagement on their websites.
AIChatbot
AIChatbot is a customer service chatbot builder powered by AI. It allows users to create custom chatbots trained on their own data, enabling them to provide personalized and automated support to their customers. With its advanced natural language processing capabilities, AIChatbot can understand the underlying meaning of user inputs, address spelling and grammatical errors, and generate human-like responses. It also offers multilingual support, making it accessible to users worldwide. Additionally, AIChatbot can be easily embedded into websites and integrated with various platforms, making it a versatile tool for businesses looking to enhance their customer service.
Glide
Glide is a no-code app builder that allows users to create custom business software without writing any code. It is powered by AI and offers a variety of features such as data sync, workflow automation, and integrations with other tools. Glide is used by over 100,000 companies to build apps for a variety of use cases, including field operations, event management, customer portals, and inventory management.
Chaindesk
Chaindesk is a no-code platform that allows businesses to train custom ChatGPT chatbots on their own data. With Chaindesk, businesses can automate customer support, lead generation, and more. Chaindesk's chatbots are secure, precise, and can be deployed on a variety of platforms, including websites, WhatsApp, and Slack.
Bonfire
Bonfire is an AI-powered chatbot platform that enables businesses to create personalized, human-like chatbots trained on their own data. With Bonfire, companies can enhance customer interactions, provide personalized product recommendations, score leads, and allow users to submit files. Bonfire's chatbots are designed to understand user intent and provide relevant responses, making them an ideal solution for customer service, lead generation, and sales.
Passarel
Passarel is an AI-powered tool that helps businesses streamline employee onboarding by creating custom GPT-like models that provide instant and accurate answers to new teammates' questions. By centralizing all knowledge bases into a single, accessible chat interface, Passarel eliminates wait times and ensures that new hires have the information they need to succeed. Additionally, Passarel's ability to handle various knowledge formats and parse out contradictions ensures that teams receive the most accurate and relevant information.
Insighto.ai
Insighto.ai is a powerful AI agent builder that allows users to build, customize, and deploy AI-powered chatbots and voice agents. These agents can be trained on your own data, and can be used for a variety of tasks, such as lead generation, customer support, and HR. Insighto.ai's agents are omnichannel, multilingual, and multimodal, and offer a range of features, including smart lead capture, auto-intent categorization, and voice support.
Mirage
Mirage is a custom AI platform that builds custom LLMs to accelerate productivity. It is backed by Sequoia and offers a variety of features, including the ability to create custom AI models, train models on your own data, and deploy models to the cloud or on-premises.
Bothatch
Bothatch is a platform that allows users to create custom chatbots powered by OpenAI's GPT technology. With Bothatch, users can upload their own data and documents to train their chatbots, which can then be used to engage in meaningful and productive conversations. Bothatch is designed to be easy to use, with no coding or technical skills required. It is also affordable, with pricing plans starting at $0 per month.
OpenServ
OpenServ is a platform that empowers autonomy by providing a hub to find, curate, and employ teams of autonomous agents. Users can hire custom AI workforces to enhance productivity and revolutionize the way value is created. The platform allows users to browse autonomous AI agents, create custom teams, integrate favorite apps, leverage AI workforce, and monetize skills. OpenServ offers a developer-friendly environment with customizable and technology-agnostic features, enabling users to host, create, and monetize agents. The platform aims to streamline tasks, enhance collaboration, and maximize flexibility in utilizing AI technologies.
AI Math Coach
AI Math Coach is a web-based application that uses artificial intelligence to help students learn math. The app provides personalized math worksheets that are aligned with classroom learning, and it offers a variety of features to help students practice and improve their math skills. AI Math Coach is designed to be easy to use for both parents and students, and it can be accessed from any device with an internet connection.
Booth AI
Booth AI is a platform that allows users to create custom AI solutions in minutes, not months. It is enterprise-ready, scale-ready, and disruption-ready. Booth AI offers a variety of features, including integration with over 100 apps, workplace tools, project management tools, marketing automation tools, and more. Booth AI can be used to solve a variety of business problems, including automating tasks, improving customer service, and increasing sales.
Adzviser
Adzviser is an AI-powered marketing data connector that seamlessly integrates with ChatGPT, Google Sheets, and Looker Studio. It offers an intuitive and cost-effective solution for analyzing cross-platform data, providing users with valuable insights to optimize their marketing strategies. Adzviser simplifies data extraction and analysis, making it accessible to users of all skill levels, without the need for technical expertise. The application is designed to enhance marketing analytics endeavors for businesses of all scales, from small in-house teams to large agencies managing multiple accounts.
Speck
Speck is a web automation tool that simplifies web data extraction using AI technology. It allows users to record their workflows and then automate the process with the help of an AI copilot. Speck learns from user interactions, ensuring efficient data extraction without the need for constant manual adjustments. The tool offers features such as custom workflow automation, web data supercharger, smart browser navigation, intelligent form filler, and interactive web tutorials. Speck is designed to streamline web tasks and enhance productivity by automating repetitive processes.
Macgence AI Training Data Services
Macgence is an AI training data services platform that offers high-quality off-the-shelf structured training data for organizations to build effective AI systems at scale. They provide services such as custom data sourcing, data annotation, data validation, content moderation, and localization. Macgence combines global linguistic, cultural, and technological expertise to create high-quality datasets for AI models, enabling faster time-to-market across the entire model value chain. With more than 5 years of experience, they support and scale AI initiatives of leading global innovators by designing custom data collection programs. Macgence specializes in handling AI training data for text, speech, image, and video data, offering cognitive annotation services to unlock the potential of unstructured textual data.
20 - Open Source AI Tools
kernel-memory
Kernel Memory (KM) is a multi-modal AI Service specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines, with support for Retrieval Augmented Generation (RAG), synthetic memory, prompt engineering, and custom semantic memory processing. KM is available as a Web Service, as a Docker container, a Plugin for ChatGPT/Copilot/Semantic Kernel, and as a .NET library for embedded applications. Utilizing advanced embeddings and LLMs, the system enables Natural Language querying for obtaining answers from the indexed data, complete with citations and links to the original sources. Designed for seamless integration as a Plugin with Semantic Kernel, Microsoft Copilot and ChatGPT, Kernel Memory enhances data-driven features in applications built for most popular AI platforms.
unitxt
Unitxt is a customizable library for textual data preparation and evaluation tailored to generative language models. It natively integrates with common libraries like HuggingFace and LM-eval-harness and deconstructs processing flows into modular components, enabling easy customization and sharing between practitioners. These components encompass model-specific formats, task prompts, and many other comprehensive dataset processing definitions. The Unitxt-Catalog centralizes these components, fostering collaboration and exploration in modern textual data workflows. Beyond being a tool, Unitxt is a community-driven platform, empowering users to build, share, and advance their pipelines collaboratively.
awesome-langchain
LangChain is an amazing framework to get LLM projects done in a matter of no time, and the ecosystem is growing fast. Here is an attempt to keep track of the initiatives around LangChain. Subscribe to the newsletter to stay informed about Awesome LangChain. We send a couple of emails per month about the articles, videos, projects, and tools that grabbed our attention. Contributions welcome. Add links through pull requests or create an issue to start a discussion. Please read the contribution guidelines before contributing.
llmware
LLMWare is a framework for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and connecting enterprise knowledge safely and securely.
awesome-generative-ai
A curated list of Generative AI projects, tools, artworks, and models
llm-twin-course
The LLM Twin Course is a free, end-to-end framework for building production-ready LLM systems. It teaches you how to design, train, and deploy a production-ready LLM twin of yourself powered by LLMs, vector DBs, and LLMOps good practices. The course is split into 11 hands-on written lessons and the open-source code you can access on GitHub. You can read everything and try out the code at your own pace.
AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.
airbyte-platform
Airbyte is an open-source data integration platform that makes it easy to move data from any source to any destination. With Airbyte, you can build and manage data pipelines without writing any code. Airbyte provides a library of pre-built connectors that make it easy to connect to popular data sources and destinations. You can also create your own connectors using Airbyte's low-code Connector Development Kit (CDK). Airbyte is used by data engineers and analysts at companies of all sizes to move data for a variety of purposes, including data warehousing, data analysis, and machine learning.
indexify
Indexify is an open-source engine for building fast data pipelines for unstructured data (video, audio, images, and documents) using reusable extractors for embedding, transformation, and feature extraction. LLM applications can query the transformed, LLM-friendly content using semantic search and SQL queries. Indexify keeps vector databases and structured databases (PostgreSQL) updated by automatically invoking the pipelines as new data is ingested into the system from external data sources.
**Why use Indexify**
* Makes unstructured data **queryable** with **SQL** and **semantic search**.
* **Real-time** extraction engine that keeps indexes **automatically** updated as new data is ingested.
* Create an **Extraction Graph** to describe **data transformation** and the extraction of **embeddings** and **structured data**.
* **Incremental extraction** and **selective deletion** when content is deleted or updated.
* The **Extractor SDK** allows adding new extraction capabilities, and many extractors are readily available for **PDF**, **image**, and **video** indexing and extraction.
* Works with **any LLM framework**, including **Langchain**, **DSPy**, etc.
* Runs on your laptop during **prototyping** and also scales to **1000s of machines** in the cloud.
* Works with many **blob stores**, **vector stores**, and **structured databases**.
* The **automation** for deploying to Kubernetes in production is also **open source**.
promptpanel
Prompt Panel is a tool designed to accelerate the adoption of AI agents by providing a platform where users can run large language models across any inference provider, create custom agent plugins, and use their own data safely. The tool allows users to break free from walled-gardens and have full control over their models, conversations, and logic. With Prompt Panel, users can pair their data with any language model, online or offline, and customize the system to meet their unique business needs without any restrictions.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.
rank_llm
RankLLM is a suite of prompt-decoders compatible with open source LLMs like Vicuna and Zephyr. It allows users to create custom ranking models for various NLP tasks, such as document reranking, question answering, and summarization. The tool offers a variety of features, including the ability to fine-tune models on custom datasets, use different retrieval methods, and control the context size and variable passages. RankLLM is easy to use and can be integrated into existing NLP pipelines.
patchwork
PatchWork is an open-source framework designed for automating development tasks using large language models. It enables users to automate workflows such as PR reviews, bug fixing, security patching, and more through a self-hosted CLI agent and preferred LLMs. The framework consists of reusable atomic actions called Steps, customizable LLM prompts known as Prompt Templates, and LLM-assisted automations called Patchflows. Users can run Patchflows locally in their CLI/IDE or as part of CI/CD pipelines. PatchWork offers predefined patchflows like AutoFix, PRReview, GenerateREADME, DependencyUpgrade, and ResolveIssue, with the flexibility to create custom patchflows. Prompt templates are used to pass queries to LLMs and can be customized. Contributions to new patchflows, steps, and the core framework are encouraged, with chat assistants available to aid in the process. The roadmap includes expanding the patchflow library, introducing a debugger and validation module, supporting large-scale code embeddings, parallelization, fine-tuned models, and an open-source GUI. PatchWork is licensed under AGPL-3.0 terms, while custom patchflows and steps can be shared using the Apache-2.0 licensed patchwork template repository.
pathway
Pathway is a Python data processing framework for analytics and AI pipelines over data streams. It is designed for real-time processing use cases such as streaming ETL or RAG pipelines for unstructured data. Pathway comes with an **easy-to-use Python API**, allowing you to integrate your favorite Python ML libraries. Pathway code is versatile and robust: **you can use it in both development and production environments, handling both batch and streaming data effectively**. The same code can be used for local development, CI/CD tests, running batch jobs, handling stream replays, and processing data streams. Pathway is powered by a **scalable Rust engine** based on Differential Dataflow and performs incremental computation. Your Pathway code, despite being written in Python, is run by the Rust engine, enabling multithreading, multiprocessing, and distributed computations. The entire pipeline is kept in memory and can be easily deployed with **Docker and Kubernetes**. You can install Pathway with pip: `pip install -U pathway`. For any questions, you will find the community and team behind the project on Discord.
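The idea of incremental computation mentioned above can be sketched in plain Python. This is not Pathway's API: the `IncrementalSum` class and the `(key, value, diff)` update format are assumptions made for illustration, loosely following the insert/retract style of differential dataflow, where each change adjusts the running result instead of triggering a full recomputation.

```python
from collections import defaultdict


class IncrementalSum:
    """Maintains per-key sums, updated one change at a time (hypothetical helper)."""

    def __init__(self) -> None:
        self.totals = defaultdict(int)

    def apply(self, key: str, value: int, diff: int = 1) -> int:
        # diff=+1 inserts a row; diff=-1 retracts a previously inserted row.
        self.totals[key] += value * diff
        return self.totals[key]


if __name__ == "__main__":
    sums = IncrementalSum()
    # Each tuple is one change arriving on the stream.
    stream = [("a", 5, 1), ("b", 2, 1), ("a", 3, 1), ("a", 5, -1)]
    for key, value, diff in stream:
        print(key, sums.apply(key, value, diff))
```

Because every update touches only the affected key, the same logic serves both a batch replay and a live stream, which is the property the description above attributes to Pathway's engine.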
qb
QANTA is a system and dataset for question answering tasks. It provides a script to download datasets, preprocesses questions, and matches them with Wikipedia pages. The system includes various datasets, training, dev, and test data in JSON and SQLite formats. Dependencies include Python 3.6, `click`, and NLTK models. Elastic Search 5.6 is needed for the Guesser component. Configuration is managed through environment variables and YAML files. QANTA supports multiple guesser implementations that can be enabled/disabled. Running QANTA involves using `cli.py` and Luigi pipelines. The system accesses raw Wikipedia dumps for data processing. The QANTA ID numbering scheme categorizes datasets based on events and competitions.
deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.
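As a rough illustration of unit testing an LLM output against a custom metric, here is a minimal sketch in plain Python. It does not use DeepEval's API: `keyword_coverage`, `check_output`, and the threshold value are hypothetical names and choices, and the keyword-overlap score stands in for the far richer metrics the description lists.

```python
def keyword_coverage(output: str, required: list[str]) -> float:
    """Fraction of required keywords found in the output (case-insensitive)."""
    if not required:
        return 1.0
    text = output.lower()
    hits = sum(1 for word in required if word.lower() in text)
    return hits / len(required)


def check_output(output: str, required: list[str], threshold: float = 0.5) -> bool:
    """Pass when the coverage score reaches the (assumed) threshold."""
    return keyword_coverage(output, required) >= threshold


if __name__ == "__main__":
    answer = "Paris is the capital of France."
    # A test runner such as pytest could wrap this assertion in a test function.
    assert check_output(answer, ["paris", "france"], threshold=0.9)
    print("metric passed")
```

A pass/fail threshold on a numeric score is what lets a check like this run unattended in a CI/CD job.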
bionic-gpt
BionicGPT is an on-premise replacement for ChatGPT, offering the advantages of Generative AI while maintaining strict data confidentiality. BionicGPT can run on your laptop or scale into the data center.
20 - OpenAI Gpts
Auto Custom Actions GPT
This GPT helps you with a single task: generating valid OpenAI Schemas for Custom Actions in GPTs.
3Commas API Expert
Python-focused expert on the 3Commas API, friendly and encouraging experimentation.
Apple Activity Kit Complete Code Expert
A detailed expert trained on all 1,337 pages of Apple ActivityKit, offering complete coding solutions. Saving time? https://www.buymeacoffee.com/parkerrex ☕️❤️
data trip
Dalle + custom corrupted data from every artist in the world. This is an experiment. (beta)
Custom GPT Builder
Create personalized GPTs with my simple builder. Click the conversation starter (starting with ###) to begin.
OpenAPI Wizard
Your guide to OpenAPI specs, helping you easily make custom GPTs with reach!
Ask Cris about File Maker
An experiment in personal FileMaker guidance from the collective works of lifetime award-winning FileMaker trainer, Cris Ippolite. Not just links to resources, but direct access to 20+ years of custom training curriculum combined with expert AI instruction without the noise of external web links.