Best AI tools for Load Models
18 - AI tool Sites
Milo
Milo is an AI-powered co-pilot for parents, designed to help them manage the chaos of family life. It uses GPT-4, a large language model, to sort and organize information, send reminders, and provide updates. Milo is designed to be accurate and to solve complex problems, and it improves based on user feedback. It can be used to manage tasks such as adding items to a grocery list, getting updates on the week's schedule, and handling screenshots of birthday invitations.
VoiceGPT
VoiceGPT is an Android app that provides a voice-based interface to interact with AI language models like ChatGPT, Bing AI, and Bard. It offers features such as unlimited free messages, voice input and output in 67+ languages, a floating bubble for easy switching between apps, OCR text recognition, code execution, image generation with DALL-E 2, and support for ChatGPT Plus accounts. VoiceGPT is designed to be accessible for users with visual impairments, dyslexia, or other conditions, and it can be set as the default assistant to be activated hands-free with a custom hotword.
Kolank
Kolank is an AI tool that provides a unified API for accessing a wide range of Large Language Models (LLMs) and providers. It offers model comparison based on price, latency, output, context, and throughput; an OpenAI-compatible API; transparent tracking of API calls and token expenditure; cost reduction by paying for performance; load balancing with fallbacks; and easy integration with preferred LLMs from Python, JavaScript, and cURL.
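Because Kolank advertises an OpenAI-compatible API, one plausible way to call it from Python is through the official openai client; this is only a sketch, and the base URL, API key placeholder, and model identifier below are illustrative assumptions rather than documented values.

```python
# Hypothetical sketch: calling an OpenAI-compatible gateway like Kolank with the
# openai client. The base_url and model name are placeholders, not documented values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.kolank.com/v1",  # assumed endpoint; check Kolank's docs
    api_key="KOLANK_API_KEY",              # your Kolank key
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",            # provider-prefixed model id (illustrative)
    messages=[{"role": "user", "content": "Summarize the benefits of a unified LLM API."}],
)
print(response.choices[0].message.content)
```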
Avaturn
Avaturn is a realistic 3D avatar creator that uses generative AI to turn a 2D photo into a recognizable, realistic 3D avatar. With extensive customization options, you can create a unique look for every user. Avatars can be exported as 3D models and loaded in Blender, Unity, Unreal Engine, Maya, Cinema 4D, or any other 3D environment. They come with a standard humanoid body rig, ARKit blendshapes, and visemes, and are compatible with Mixamo animations and VTubing software.
Ermine.ai
Ermine.ai is an AI tool that records and transcribes audio locally in the browser, letting users turn recordings into text without sending them to a server. The application currently supports Chrome, with Firefox support in progress. The browser must load and initialize the transcription model, which may take a few minutes on first use. The tool is designed to offer fast transcription and currently supports English only.
Vendorful
Vendorful is an AI-powered tool designed to assist users in responding to Requests for Proposals (RFPs), Requests for Information (RFIs), and Security Questionnaires. The tool leverages artificial intelligence to generate draft responses with better-than-human accuracy based on the user's content. Vendorful aims to streamline the RFP response process, save time, and increase the chances of winning deals by providing contextual smarts and flexible answers.
WebPilot
WebPilot is an AI tool designed to enhance your GPTs by enabling them to perform various tasks such as opening URL/file links, using multiple search engines, accessing all types of websites, loading dynamic web content, and providing enhanced answers. It offers a super easy way to interact with webpages, assisting in tasks like responding to emails, writing in forms, and solving quizzes. WebPilot is free, open-source, and has been featured by Google Extension Store as an established publisher.
Parade
Parade is a capacity management platform designed for freight brokerages and 3PLs to streamline operations, automate bookings, and improve margins. The platform leverages advanced AI to optimize pricing, bidding, and carrier management, helping users book more loads efficiently. Parade integrates seamlessly with existing tech stacks, offering precise pricing, optimized bidding, and enhanced shipper connectivity. The platform boasts a range of features and benefits aimed at increasing efficiency, reducing costs, and boosting margins for freight businesses.
SwapFans
The website offers an AI-powered tool called SwapFans for high-speed face swapping. Users can face-swap any social media video, or swap faces across entire Instagram and TikTok accounts, with load balancing and discounts available. The tool is designed to help users manage their social media presence effectively and efficiently.
PixieBrix
PixieBrix is an AI engagement platform that allows users to build, deploy, and manage internal AI tools to drive team productivity. It unifies AI landscapes with the oversight and governance needed for enterprise scale. The platform is enterprise-ready, fully customizable to meet unique needs, and can be deployed on any site, making it easy to integrate into existing systems. PixieBrix uses AI and automation to streamline workflows and take productivity to new heights.
TLDRai
TLDRai.com is an AI tool designed to help users summarize any text into concise and easy-to-digest content, enabling them to free themselves from information overload. The tool utilizes AI technology to provide efficient text summarization services, making it a valuable resource for individuals seeking quick and accurate summaries of lengthy texts.
promptsplitter.com
The website promptsplitter.com is experiencing an Argo Tunnel error on the Cloudflare network. Users encountering this error are advised to wait a few minutes if they are visitors, or ensure that cloudflared is running and can reach the network if they are the website owners. The error message provides guidance on troubleshooting steps to resolve the issue.
Merlin AI
Merlin AI is a YouTube transcript tool that allows users to create summaries of YouTube videos. It is easy to use and can be added to Chrome as an extension. Merlin AI is powered by an undocumented API and features the latest build.
Daxtra
Daxtra is an AI-powered recruitment technology tool designed to help staffing and recruiting professionals find, parse, match, and engage the best candidates quickly and efficiently. The tool offers a suite of products that seamlessly integrate with existing ATS or CRM systems, automating various recruitment processes such as candidate data loading, CV/resume formatting, information extraction, and job matching. Daxtra's solutions cater to corporates, vendors, job boards, and social media partners, providing a comprehensive set of developer components to enhance recruitment workflows.
Lex Fridman
Lex Fridman is the website of Lex Fridman, a research scientist at MIT focusing on human-robot interaction and machine learning. It offers resources such as podcasts, research publications, and studies on AI-assisted driving data collection, autonomous vehicle systems, gaze estimation, and cognitive load estimation, with an emphasis on safe and enjoyable interaction between humans and AI in driving scenarios.
LiteLLM
LiteLLM is an AI tool that offers a unified API for Azure OpenAI, Vertex AI, Bedrock, and other LLM providers. It provides a proxy server for managing authentication, load balancing, and spend tracking across a wide range of LLMs. LiteLLM is designed to simplify the integration and management of various AI services in the OpenAI format. With features like cloud deployment, open-source availability, and extensive provider integrations, LiteLLM aims to streamline AI development workflows and enhance operational efficiency.
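As a minimal sketch of the unified-API idea, LiteLLM's Python SDK routes the same completion() call to different providers based on the model prefix; the model and deployment names below are illustrative, and credentials are read from LiteLLM's usual environment variables.

```python
# Minimal sketch of LiteLLM's unified interface: the same call shape is routed
# to different providers based on the model prefix. Model names are illustrative.
from litellm import completion

messages = [{"role": "user", "content": "Say hello in one sentence."}]

# OpenAI (reads OPENAI_API_KEY from the environment)
openai_resp = completion(model="gpt-4o-mini", messages=messages)

# Azure OpenAI deployment (reads AZURE_API_KEY / AZURE_API_BASE / AZURE_API_VERSION)
azure_resp = completion(model="azure/my-gpt4-deployment", messages=messages)

print(openai_resp.choices[0].message.content)
print(azure_resp.choices[0].message.content)
```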
Epicflow
Epicflow is an AI-based multi-project and resource management software designed to help organizations deliver more projects on time with available resources, increase profitability, and make informed project decisions using real-time data and predictive analytics. The software bridges demand and supply by matching talent based on competencies, experience, and availability. It offers features like AI assistant, What-If Analysis, Future Load Graph, Historical Load Graph, Task List, and Competence Management Pipeline. Epicflow is trusted by leading companies in various industries for high performance and flawless project delivery.
Knowbo
Knowbo is a custom chatbot tool that allows users to create a chatbot for their website in just 2 minutes. The chatbot learns directly from the website or documentation, providing up-to-date information to users. With features like easy deployment, chat history tracking, and customization options, Knowbo aims to revolutionize customer experience by reducing the load on support teams and offering a seamless way for users to get their questions answered quickly.
20 - Open Source AI Tools
wllama
Wllama is a WebAssembly binding for llama.cpp, a high-performance and lightweight language model library. It enables you to run inference directly in the browser without the need for a backend or GPU. Wllama provides both high-level and low-level APIs, allowing you to perform various tasks such as completions, embeddings, tokenization, and more. It also supports model splitting, enabling large models to be split and downloaded in parallel for faster loading. With its TypeScript support and pre-built npm package, Wllama is easy to integrate into your React TypeScript projects.
catalyst
Catalyst is a C# Natural Language Processing library designed for speed, inspired by spaCy's design. It provides pre-trained models, support for training word and document embeddings, and flexible entity recognition models. The library is fast, modern, and pure C#, targeting .NET Standard 2.0. It is cross-platform, running on Windows, Linux, macOS, and ARM. Catalyst offers non-destructive tokenization, named entity recognition, part-of-speech tagging, language detection, and efficient binary serialization. It includes pre-built models for language packages and lemmatization. Users can store and load models using streams. Getting started with Catalyst involves installing its NuGet package and setting the storage to use the online repository. The library supports lazy loading of models from disk or online. Users can take advantage of C# lazy evaluation and native multi-threading support to process documents in parallel. Training a new FastText word2vec embedding model is straightforward, and Catalyst also provides algorithms for fast embedding search and dimensionality reduction.
autoscraper
AutoScraper is a smart, automatic, fast, and lightweight web scraping tool for Python. It simplifies the process of web scraping by learning scraping rules based on sample data provided by the user. The tool can extract text, URLs, or HTML tag values from web pages and return similar elements. Users can utilize the learned object to scrape similar content or exact elements from new pages. AutoScraper is compatible with Python 3 and offers easy installation from various sources. It provides functionalities for fetching similar and exact results from web pages, such as extracting post titles from Stack Overflow or live stock prices from Yahoo Finance. The tool allows customization with custom requests module parameters like proxies or headers. Users can save and load models for future use and explore advanced usages through tutorials and examples.
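As a sketch of that learn-by-example workflow, the snippet below teaches the scraper a rule from one sample page and reuses it on a similar page; the URLs and sample text are placeholders in the spirit of the Stack Overflow example mentioned above.

```python
# Minimal AutoScraper workflow: learn scraping rules from one sample page,
# then reuse them on similar pages. URL and wanted_list values are placeholders.
from autoscraper import AutoScraper

url = "https://stackoverflow.com/questions/2081586/web-scraping-with-python"
wanted_list = ["What are metaclasses in Python?"]  # sample text that appears on the page

scraper = AutoScraper()
scraper.build(url, wanted_list)                 # learn rules from the sample

# Apply the learned rules to another, similar page
similar = scraper.get_result_similar(
    "https://stackoverflow.com/questions/606191/convert-bytes-to-a-string"
)
print(similar)

scraper.save("so-titles")                       # persist the learned model...
scraper.load("so-titles")                       # ...and load it back later
```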
spandrel
Spandrel is a library for loading and running pre-trained PyTorch models. It automatically detects the model architecture and hyperparameters from model files, and provides a unified interface for running models.
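A minimal sketch of that unified loading interface, assuming an image super-resolution checkpoint on disk; the file path and input sizes are placeholders.

```python
# Minimal sketch: spandrel auto-detects the architecture and hyperparameters from
# the checkpoint file and wraps a regular PyTorch module. Path is a placeholder.
import torch
from spandrel import ModelLoader

descriptor = ModelLoader().load_from_file("weights/4x_upscaler.pth")
model = descriptor.model.eval()

with torch.no_grad():
    lr = torch.rand(1, 3, 64, 64)   # dummy low-res RGB image batch
    sr = model(lr)                  # run the wrapped PyTorch module directly
print(sr.shape)
```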
lmstudio.js
lmstudio.js is a pre-release alpha client SDK for LM Studio, allowing users to use local LLMs in JS/TS/Node. It is currently undergoing rapid development with breaking changes expected. Users can follow LM Studio's announcements on Twitter and Discord. The SDK provides API usage for loading models, predicting text, setting up the local LLM server, and more. It supports features like custom loading progress tracking, model unloading, structured output prediction, and cancellation of predictions. Users can interact with LM Studio through the CLI tool 'lms' and perform tasks like text completion, conversation, and getting prediction statistics.
ServerlessLLM
ServerlessLLM is a fast, affordable, and easy-to-use library designed for multi-LLM serving, optimized for environments with limited GPU resources. It supports loading various leading LLM inference libraries, achieving fast load times, and reducing model switching overhead. The library facilitates easy deployment via Ray Cluster and Kubernetes, integrates with the OpenAI Query API, and is actively maintained by contributors.
fsdp_qlora
The fsdp_qlora repository provides a script for training Large Language Models (LLMs) with Quantized LoRA and Fully Sharded Data Parallelism (FSDP). It integrates FSDP+QLoRA into the Axolotl platform and offers installation instructions for dependencies like llama-recipes, fastcore, and PyTorch. Users can finetune Llama-2 70B on Dual 24GB GPUs using the provided command. The script supports various training options including full params fine-tuning, LoRA fine-tuning, custom LoRA fine-tuning, quantized LoRA fine-tuning, and more. It also discusses low memory loading, mixed precision training, and comparisons to existing trainers. The repository addresses limitations and provides examples for training with different configurations, including BnB QLoRA and HQQ QLoRA. Additionally, it offers SLURM training support and instructions for adding support for a new model.
llmware
LLMWare is a framework for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and connecting enterprise knowledge safely and securely.
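As a rough sketch of the prompt-with-context pattern llmware targets: the model name is one of llmware's small specialized models, the contract text is made up, and the exact API surface may differ across versions.

```python
# Rough sketch of llmware's prompt-with-context pattern; model name and context
# text are illustrative, and the API may vary between llmware releases.
from llmware.prompts import Prompt

context = (
    "The master services agreement is effective January 1, 2024 and "
    "renews annually unless either party gives 60 days written notice."
)

prompter = Prompt().load_model("bling-phi-3-gguf")   # small CPU-friendly model
response = prompter.prompt_main(
    "When does the agreement renew?", context=context
)
print(response)
```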
horde-worker-reGen
This repository provides the latest implementation for the AI Horde Worker, allowing users to utilize their graphics card(s) to generate, post-process, or analyze images for others. It offers a platform where users can create images and earn 'kudos' in return, granting priority for their own image generations. The repository includes important details for setup, recommendations for system configurations, instructions for installation on Windows and Linux, basic usage guidelines, and information on updating the AI Horde Worker. Users can also run the worker with multiple GPUs and receive notifications for updates through Discord. Additionally, the repository contains models that are licensed under the CreativeML OpenRAIL License.
LocalAI
LocalAI is a free and open-source OpenAI alternative that acts as a drop-in replacement REST API compatible with OpenAI (Elevenlabs, Anthropic, etc.) API specifications for local AI inferencing. It allows users to run LLMs, generate images, audio, and more locally or on-premises with consumer-grade hardware, supporting multiple model families and not requiring a GPU. LocalAI offers features such as text generation with GPTs, text-to-audio, audio-to-text transcription, image generation with stable diffusion, OpenAI functions, embeddings generation for vector databases, constrained grammars, downloading models directly from Huggingface, and a Vision API. It provides a detailed step-by-step introduction in its Getting Started guide and supports community integrations such as custom containers, WebUIs, model galleries, and various bots for Discord, Slack, and Telegram. LocalAI also offers resources like an LLM fine-tuning guide, instructions for local building and Kubernetes installation, projects integrating LocalAI, and a how-tos section curated by the community. It encourages users to cite the repository when utilizing it in downstream projects and acknowledges the contributions of various software from the community.
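Because LocalAI acts as a drop-in replacement for the OpenAI REST API, the standard openai Python client can simply point at a local instance; the port below assumes LocalAI's default (8080), and the model name depends on which model is installed locally.

```python
# Sketch: using the official openai client against a local LocalAI instance.
# base_url assumes LocalAI's default port; the model name maps to whatever
# local model is configured under that alias.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

chat = client.chat.completions.create(
    model="gpt-4",   # alias for a locally installed model
    messages=[{"role": "user", "content": "What can you run without a GPU?"}],
)
print(chat.choices[0].message.content)
```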
MARS5-TTS
MARS5 is a novel English text-to-speech (TTS) model developed by CAMB.AI, featuring a two-stage AR-NAR pipeline with a unique NAR component. The model can generate speech for demanding scenarios like sports commentary and anime from just 5 seconds of reference audio and a text snippet. Prosody can be steered using punctuation and capitalization in the transcript. Speaker identity is specified with an audio reference file, and a 'deep clone' mode improves quality. The model can be used via torch.hub or Hugging Face, supporting both shallow and deep cloning for inference. Checkpoints are provided for the AR (~750M parameters) and NAR (~450M parameters) models, which require a GPU for inference. Contributions to improve model stability, performance, and reference audio selection are welcome.
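A condensed sketch of the torch.hub route, loosely following the project's README; the file path, reference transcript, and config fields are placeholders and may not match the current API exactly.

```python
# Condensed sketch of MARS5 inference via torch.hub; paths, transcript, and
# config fields are placeholders based loosely on the project README.
import torch
import librosa

mars5, config_class = torch.hub.load(
    "Camb-ai/mars5-tts", "mars5_english", trust_repo=True
)

# ~5 seconds of reference audio plus its transcript enables a "deep clone".
wav, _ = librosa.load("reference.wav", sr=mars5.sr, mono=True)
wav = torch.from_numpy(wav)
ref_transcript = "Transcript of the reference clip goes here."

cfg = config_class(deep_clone=True, temperature=0.7)
ar_codes, audio = mars5.tts(
    "The team scores in the final minute of the match!", wav, ref_transcript, cfg=cfg
)
```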
mistral.rs
Mistral.rs is a fast LLM inference platform written in Rust. It supports inference on a variety of devices, quantization, and easy application integration via an OpenAI-API-compatible HTTP server and Python bindings.
ragoon
RAGoon is a high-level library designed for batched embeddings generation, fast web-based RAG (Retrieval-Augmented Generation) processing, and quantized indexes processing. It provides NLP utilities for multi-model embedding production, high-dimensional vector visualization, and enhancing language model performance through search-based querying, web scraping, and data augmentation techniques.
Awesome-ChatTTS
Awesome-ChatTTS is an official recommended guide for ChatTTS beginners, compiling common questions and related resources. It provides a comprehensive overview of the project, including official introduction, quick experience options, popular branches, parameter explanations, voice seed details, installation guides, FAQs, and error troubleshooting. The repository also includes video tutorials, discussion community links, and project trends analysis. Users can explore various branches for different functionalities and enhancements related to ChatTTS.
LLMFlex
LLMFlex is a Python package designed for developing AI applications with local Large Language Models (LLMs). It provides classes to load LLM models, embedding models, and vector databases to create AI-powered solutions with prompt engineering and RAG techniques. The package supports multiple LLMs with different generation configurations, embedding toolkits, vector databases, chat memories, prompt templates, custom tools, and a chatbot frontend interface. Users can easily create LLMs, load embedding toolkits, use tools, chat with models in a Streamlit web app, and serve an OpenAI-compatible API with a GGUF model. LLMFlex aims to offer a simple interface for developers to work with LLMs and build private AI solutions using local resources.
flow-prompt
Flow Prompt is a dynamic library for managing and optimizing prompts for large language models. It facilitates budget-aware operations, dynamic data integration, and efficient load distribution. Features include CI/CD testing, dynamic prompt development, multi-model support, real-time insights, and prompt testing and evolution.
fastc
Fastc is a tool focused on CPU execution, using efficient models for embedding generation and cosine similarity classification. It allows for efficient multi-classifier execution without extra overhead. Users can easily train text classifiers, export models, publish to HuggingFace, load existing models, make class predictions, use instruct templates, and launch an inference server. The tool provides an HTTP API for text classification with JSON payloads and supports multiple languages for language identification.
r2ai
r2ai is a tool designed to run a language model locally without internet access. It can be used to entertain users or assist in answering questions related to radare2 or reverse engineering. The tool allows users to prompt the language model, index large codebases, slurp file contents, embed the output of an r2 command, define different system-level assistant roles, set environment variables, and more. It is accessible as an r2lang-python plugin and can be scripted from various languages. Users can use different models, adjust query templates dynamically, load multiple models, and make them communicate with each other.
keras-llm-robot
The Keras-llm-robot Web UI project is an open-source tool designed for offline deployment and testing of various open-source models from the Hugging Face website. It allows users to combine multiple models through configuration to achieve functionalities like multimodal, RAG, Agent, and more. The project consists of three main interfaces: chat interface for language models, configuration interface for loading models, and tools & agent interface for auxiliary models. Users can interact with the language model through text, voice, and image inputs, and the tool supports features like model loading, quantization, fine-tuning, role-playing, code interpretation, speech recognition, image recognition, network search engine, and function calling.
aphrodite-engine
Aphrodite is the official backend engine for PygmalionAI, serving as the inference endpoint for the website. It allows serving Hugging Face-compatible models at fast speeds. Features include continuous batching, efficient K/V management, optimized CUDA kernels, quantization support, distributed inference, and 8-bit KV cache. The engine requires Linux and Python 3.8 to 3.12, and building it requires CUDA >= 11. It supports various GPUs, CPUs, TPUs, and Inferentia. Users can limit GPU memory utilization and access the full command set via the CLI.