Best AI tools to improve throughput
20 - AI Tool Sites
ThroughPut AI
ThroughPut AI is a supply chain decision intelligence and analytics platform designed for outcome-driven supply chain decision-makers. It provides accurate demand forecasting, capacity planning, logistics management, and financial insights to drive business results. ThroughPut AI offers a single source of truth for supply chain professionals, enabling them to make faster, better, and more confident decisions. The platform helps unlock efficiency, profitability, and growth in day-to-day operations by providing intelligent data-driven recommendations.
Lamini
Lamini is an enterprise-level LLM platform that offers precise recall with Memory Tuning, enabling teams to achieve over 95% accuracy even with large amounts of specific data. It guarantees JSON output and delivers massive throughput for inference. Lamini is designed to be deployed anywhere, including air-gapped environments, and supports training and inference on Nvidia or AMD GPUs. The platform is known for its factual LLMs and reengineered decoder that ensures 100% schema accuracy in the JSON output.
Spot AI
Spot AI is a Video AI platform that transforms cameras into intelligent tools to secure, protect, and optimize operations. It offers features such as real-time visibility, incident resolution, worker safety, and training. The platform includes AI agents, semantic search, and state-of-the-art video AI models to drive business outcomes and enhance productivity. Spot AI is trusted by over 1,000 organizations to reduce workplace injuries, improve incident resolution time, and increase operational throughput.
Just Walk Out technology
Just Walk Out technology is a checkout-free shopping experience that allows customers to enter a store, grab whatever they want, and quickly get back to their day, without having to wait in a checkout line or stop at a cashier. The technology uses computer vision and sensor fusion, or RFID, to let shoppers simply walk away with their items. Just Walk Out technology is designed to increase revenue with cost-optimized technology, maximize space productivity, increase throughput, optimize operational costs, and improve shopper loyalty.
Hippo Video
Hippo Video is an AI-powered video platform designed for Go-To-Market (GTM) teams. It offers a comprehensive solution for sales, marketing, campaigns, customer support, and communications. The platform enables users to create interactive videos easily and quickly, transform text into videos at scale, and personalize video campaigns. With features like Text-to-Video, AI Avatar Video Generator, Video Flows, and AI Editor, Hippo Video helps businesses enhance engagement, accelerate video production, and improve customer self-service.
Rupa.AI
Rupa.AI is an AI-powered photo enhancement tool that leverages the latest advancements in artificial intelligence to enhance your photos effortlessly. With Rupa.AI, you can transform your ordinary photos into stunning visuals with just a few clicks. Whether you want to improve the lighting, colors, or overall quality of your images, Rupa.AI provides intuitive tools to help you achieve professional-level results. Say goodbye to complex editing software and hello to a seamless photo enhancement experience with Rupa.AI.
Weekly Workout
Weekly Workout is an AI-powered fitness platform that provides personalized workout plans and tracks your progress. With Weekly Workout, you can get four days of invigorating exercise routines every week, tailored to your fitness level and goals. The platform also offers a weekly newsletter with tips and advice from fitness experts.
Conversica
Conversica is a leading provider of AI-powered conversation automation solutions for revenue teams. Its platform enables businesses to engage with prospects and customers in personalized, two-way conversations at scale, helping them to generate more leads, close more deals, and improve customer satisfaction. Conversica's Revenue Digital Assistants are equipped with a library of revenue-hunting skills and conversations, and they can be customized to fit the specific needs of each business. The platform is easy to use and integrates with a variety of CRM and marketing automation systems.
Trevor AI
Trevor AI is a daily planner and task scheduling co-pilot that helps users organize, schedule, and automate their tasks. It features a task hub, calendar integration, AI scheduling suggestions, focus mode, and daily planning insights. Trevor AI is designed to help users improve their productivity, clarity, and focus.
Tabnine
Tabnine is an AI code assistant that accelerates and simplifies software development by providing best-in-class AI code generation, personalized AI chat support throughout the software development life cycle, and context-aware coding assistance. It ensures total code privacy and zero data retention, protecting the confidentiality and integrity of your codebase. Tabnine offers complete protection from intellectual property issues and is trusted by millions of developers and thousands of companies worldwide.
NeuralCam
NeuralCam is an AI-powered photography application that leverages the power of AI throughout the photography process to help users capture better photos. The app offers a 3-step AI photography system that includes composition guidance, smart capturing modes, and professional-level auto-editing features. NeuralCam provides real-time guidance to help users create better compositions, identify compelling subjects, and use lighting to their advantage. The app also offers smart capture modes like SmartHDR and DeepFusion, as well as AI upscaling technology for enhancing image quality. With features like AI bokeh effects, intelligent illumination, and professional color grading, NeuralCam aims to provide users with professional-grade photography tools in a user-friendly interface.
Beyz AI
Beyz AI is an AI Interview Assistant application that provides real-time answers during interviews. It offers features such as auto-translate, tailored interview prep modes, and universal meeting compatibility. The application aims to help users improve their interview performance and ensure success by leveraging AI technology. With a focus on enhancing interview skills and providing instant, tailored help, Beyz AI is designed to support users throughout the interview process.
Phenom
Phenom is an AI-powered talent experience platform that connects people, data, and interactions to deliver amazing experiences throughout the journey using intelligence and automation. It helps in hyper-personalizing candidate engagement, developing and retaining employees with intelligence, improving recruiter productivity through automation, and hiring talent faster with AI. Phenom offers a range of features and benefits to streamline the talent acquisition process and enhance the overall recruitment experience.
DeploySaaS
DeploySaaS is an AI tool designed to assist users in launching their SaaS products more effectively and efficiently. It provides guidance and support throughout the entire process, from idea validation to product launch. By leveraging AI technology, DeploySaaS aims to help users avoid common pitfalls in SaaS development and make data-driven decisions to achieve product-market fit.
Faraday
Faraday is a no-code AI platform that helps businesses make better predictions about their customers. With Faraday, businesses can embed AI into their workflows throughout their stack to improve the performance of their favorite tools. Faraday offers a variety of features, including propensity modeling, persona creation, and churn prediction. These features can be used to improve marketing campaigns, customer service, and product development. Faraday is easy to use and requires no coding experience. It is also affordable and offers a free-forever plan.
Harness
Harness is an AI-driven software delivery platform that empowers software engineering teams with AI-infused technology for seamless software delivery. It offers a single platform for all software delivery needs, including DevOps modernization, continuous delivery, GitOps, feature flags, infrastructure as code management, chaos engineering, service reliability management, secure software delivery, cloud cost optimization, and more. Harness aims to simplify the developer experience by providing actionable insights on SDLC, secure software supply chain assurance, and AI development assistance throughout the software delivery lifecycle.
Intelligencia AI
Intelligencia AI is a leading provider of AI-powered solutions for the pharmaceutical industry. Its suite of solutions helps de-risk and enhance clinical development and decision-making. It uses a combination of data, AI, and machine learning to provide insights into the probability of success for drugs across multiple therapeutic areas. Its solutions are used by many of the top global pharmaceutical companies to improve R&D productivity and make more informed decisions.
Inspecti
Inspecti is an AI-powered platform that simplifies property inspections and reporting. It uses AI to analyze photos and videos, categorize items, and generate accurate reports in minutes. By reducing manual work and minimizing errors, Inspecti enhances efficiency, reduces disputes, and builds trust between landlords and tenants. The platform streamlines the entire inspection process from start to finish, delivering accurate, AI-driven assessments for every property. Inspecti is efficient, transparent, and trusted, providing consistent, detailed insights that empower users to maintain top-quality service throughout the property's lifecycle.
Nabubit
Nabubit is an AI-powered tool designed to assist users in database design. It serves as a virtual copilot, providing guidance and suggestions throughout the database design process. With Nabubit, users can streamline their database creation, optimize performance, and ensure data integrity. The tool leverages artificial intelligence to analyze data requirements, suggest schema designs, and enhance overall database efficiency. Nabubit is a valuable resource for developers, data analysts, and businesses looking to improve their database management practices.
Deep Space AI
Deep Space AI is an innovative platform that revolutionizes the construction industry by providing intelligent solutions for design and construction workflows. The platform offers collaborative data management, actionable insights, and coordination tools to streamline construction processes and improve operational efficiency. Deep Space AI enhances transparency, boosts productivity, and empowers teams to make informed decisions throughout all project phases. Trusted by leading teams in the design and construction industry, Deep Space AI is a game-changer in the AECO sector.
20 - Open Source AI Tools
Liger-Kernel
Liger Kernel is a collection of Triton kernels designed for LLM training, increasing training throughput by 20% and reducing memory usage by 60%. It includes Hugging Face Compatible modules like RMSNorm, RoPE, SwiGLU, CrossEntropy, and FusedLinearCrossEntropy. The tool works with Flash Attention, PyTorch FSDP, and Microsoft DeepSpeed, aiming to enhance model efficiency and performance for researchers, ML practitioners, and curious novices.
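For a feel of the workflow, the sketch below patches a Hugging Face Llama model with Liger's fused kernels before training; the `apply_liger_kernel_to_llama` helper follows the project's documented patching API, but the model ID and surrounding setup are illustrative assumptions.

```python
# Minimal sketch: patch Hugging Face's Llama implementation with Liger's Triton
# kernels before loading the model, so RMSNorm, RoPE, SwiGLU, and the fused
# cross-entropy are swapped in. Model ID is a placeholder for illustration.
import torch
from transformers import AutoModelForCausalLM
from liger_kernel.transformers import apply_liger_kernel_to_llama

apply_liger_kernel_to_llama()  # monkey-patches the Llama modules in transformers

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",   # hypothetical model ID
    torch_dtype=torch.bfloat16,
)
# Train as usual (Trainer, FSDP, DeepSpeed, ...); the patched modules reduce
# activation memory and improve training throughput.
```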
litserve
LitServe is a high-throughput serving engine for deploying AI models at scale. It generates an API endpoint for a model, handles batching, streaming, autoscaling across CPUs/GPUs, and more. Built for enterprise scale, it supports every major framework, including PyTorch, JAX, and TensorFlow. LitServe is designed to let users focus on model performance, not the serving boilerplate. It is like PyTorch Lightning for model serving, but with broader framework support and scalability.
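As a rough sketch of the workflow, the example below wraps a toy model in a `LitAPI` and serves it; the class and method names follow LitServe's documented quick-start pattern, while the model itself is a placeholder.

```python
# Minimal LitServe sketch: wrap a (placeholder) model in a LitAPI and serve it.
import litserve as ls

class SimpleLitAPI(ls.LitAPI):
    def setup(self, device):
        # Load the real model here; a squaring function stands in for it.
        self.model = lambda x: x ** 2

    def decode_request(self, request):
        return request["input"]          # pull the payload out of the JSON body

    def predict(self, x):
        return self.model(x)

    def encode_response(self, output):
        return {"output": output}        # JSON returned to the client

if __name__ == "__main__":
    server = ls.LitServer(SimpleLitAPI(), accelerator="auto")
    server.run(port=8000)                # POST to http://localhost:8000/predict
```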
LMCache
LMCache is a serving engine extension designed to reduce time to first token (TTFT) and increase throughput, particularly in long-context scenarios. It stores key-value caches of reusable texts across different locations like GPU, CPU DRAM, and Local Disk, allowing the reuse of any text in any serving engine instance. By combining LMCache with vLLM, significant delay savings and GPU cycle reduction are achieved in various large language model (LLM) use cases, such as multi-round question answering and retrieval-augmented generation (RAG). LMCache provides integration with the latest vLLM version, offering both online serving and offline inference capabilities. It supports sharing key-value caches across multiple vLLM instances and aims to provide stable support for non-prefix key-value caches along with user and developer documentation.
ScaleLLM
ScaleLLM is a cutting-edge inference system engineered for large language models (LLMs), designed to meet the demands of production environments. It supports a wide range of popular open-source models, including Llama3, Gemma, Bloom, GPT-NeoX, and more, and is under active development, with efficiency improvements and additional features tracked on the project's roadmap. Key features include high-performance LLM inference built on state-of-the-art techniques such as Flash Attention, Paged Attention, and continuous batching; tensor parallelism for efficient model execution; an OpenAI-compatible REST API server written in Golang; seamless integration with popular Hugging Face models, including safetensors support; flexible customization and an easy way to add new models; and production-ready system monitoring and management features for a seamless deployment experience.
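Because the server exposes an OpenAI-compatible REST API, a standard OpenAI client pointed at the local endpoint should work; the base URL, port, and model name below are assumptions for illustration, so check the ScaleLLM docs for the actual defaults.

```python
# Hypothetical sketch: query a locally running ScaleLLM server through its
# OpenAI-compatible API with the standard openai client. URL, port, and model
# name are assumptions, not values taken from the ScaleLLM documentation.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",   # hypothetical model name
    messages=[{"role": "user", "content": "Summarize what ScaleLLM does."}],
)
print(response.choices[0].message.content)
```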
Awesome-LLM-Quantization
Awesome-LLM-Quantization is a curated list of resources related to quantization techniques for Large Language Models (LLMs). Quantization is a crucial step in deploying LLMs on resource-constrained devices, such as mobile phones or edge devices, by reducing the model's size and computational requirements.
glake
GLake is an acceleration library and utilities designed to optimize GPU memory management and IO transmission for AI large model training and inference. It addresses challenges such as GPU memory bottleneck and IO transmission bottleneck by providing efficient memory pooling, sharing, and tiering, as well as multi-path acceleration for CPU-GPU transmission. GLake is easy to use, open for extension, and focuses on improving training throughput, saving inference memory, and accelerating IO transmission. It offers features like memory fragmentation reduction, memory deduplication, and built-in security mechanisms for troubleshooting GPU memory issues.
Mooncake
Mooncake is a serving platform for Kimi, a leading LLM service provided by Moonshot AI. It features a KVCache-centric disaggregated architecture that separates prefill and decoding clusters, leveraging underutilized CPU, DRAM, and SSD resources of the GPU cluster. Mooncake's scheduler balances throughput and latency-related SLOs, with a prediction-based early rejection policy for highly overloaded scenarios. It excels in long-context scenarios, achieving up to a 525% increase in throughput while handling 75% more requests under real workloads.
LLaMa2lang
LLaMa2lang is a repository containing convenience scripts to fine-tune LLaMa3-8B (or any other foundation model) for chat in any language other than English. The repository aims to improve the performance of LLaMa3 for non-English languages by combining fine-tuning with RAG. Users can translate datasets, extract threads, turn threads into prompts, and fine-tune models using QLoRA and PEFT. Additionally, the repository supports translation models like OPUS, M2M, and MADLAD, and base datasets like OASST1 and OASST2. The process involves loading datasets, translating them, combining checkpoints, and running inference with the newly trained model. The repository also provides benchmarking scripts to choose the right translation model for a target language.
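The QLoRA + PEFT step the repository describes looks roughly like the generic sketch below; this is not the repo's actual script, and the model name, target modules, and hyperparameters are placeholders.

```python
# Generic QLoRA fine-tuning sketch (not LLaMa2lang's actual script):
# load a 4-bit quantized base model and attach LoRA adapters with PEFT.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",         # placeholder foundation model
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # placeholder target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
# Train on the translated, prompt-formatted threads with your trainer of choice.
```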
archgw
Arch is an intelligent Layer 7 gateway designed to protect, observe, and personalize AI agents with APIs. It handles tasks related to prompts, including detecting jailbreak attempts, calling backend APIs, routing between LLMs, and managing observability. Built on Envoy Proxy, it offers features like function calling, prompt guardrails, traffic management, and observability. Users can build fast, observable, and personalized AI agents using Arch to improve speed, security, and personalization of GenAI apps.
lantern
Lantern is an open-source PostgreSQL database extension designed to store vector data, generate embeddings, and handle vector search operations efficiently. It introduces a new index type called 'lantern_hnsw' for vector columns, which speeds up 'ORDER BY ... LIMIT' queries. Lantern utilizes the state-of-the-art HNSW implementation called usearch. Users can easily install Lantern using Docker, Homebrew, or precompiled binaries. The tool supports various distance functions, index construction parameters, and operator classes for efficient querying. Lantern offers features like embedding generation, interoperability with pgvector, parallel index creation, and external index graph generation. It aims to provide superior performance metrics compared to other similar tools and has a roadmap for future enhancements such as cloud-hosted version, hardware-accelerated distance metrics, industry-specific application templates, and support for version control and A/B testing of embeddings.
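A rough sketch of the described workflow from Python (via psycopg2): create a vector column, build a `lantern_hnsw` index, and run an `ORDER BY ... LIMIT` nearest-neighbour query. The operator class, distance operator, and index options used here are assumptions, so consult the Lantern docs before relying on them.

```python
# Hypothetical sketch of using Lantern from Python with psycopg2.
# Operator class, distance operator, and index options are assumptions.
import psycopg2

conn = psycopg2.connect("dbname=demo user=postgres")
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS lantern;")
cur.execute("CREATE TABLE IF NOT EXISTS items (id serial PRIMARY KEY, embedding real[]);")
cur.execute("INSERT INTO items (embedding) VALUES (ARRAY[0.1, 0.2, 0.3]), (ARRAY[0.9, 0.8, 0.7]);")

# Build the HNSW index that accelerates ORDER BY ... LIMIT queries.
cur.execute(
    "CREATE INDEX IF NOT EXISTS items_hnsw "
    "ON items USING lantern_hnsw (embedding dist_l2sq_ops) "
    "WITH (M = 16, ef_construction = 64, dim = 3);"
)

# Nearest-neighbour query: the row closest to a query vector.
cur.execute("SELECT id FROM items ORDER BY embedding <-> ARRAY[0.1, 0.2, 0.25] LIMIT 1;")
print(cur.fetchone())
conn.commit()
```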
lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework known for its lightweight design, scalability, and high-speed performance. It offers features like tri-process asynchronous collaboration, Nopad for efficient attention operations, dynamic batch scheduling, FlashAttention integration, tensor parallelism, Token Attention for zero memory waste, and Int8KV Cache. The tool supports various models like BLOOM, LLaMA, StarCoder, Qwen-7b, ChatGLM2-6b, Baichuan-7b, Baichuan2-7b, Baichuan2-13b, InternLM-7b, Yi-34b, Qwen-VL, Llava-7b, Mixtral, Stablelm, and MiniCPM. Users can deploy and query models using the provided server launch commands and interact with multimodal models like QWen-VL and Llava using specific queries and images.
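Once the server is launched, querying it typically looks like the sketch below; the port and the `/generate` payload shape follow the project's README conventions but should be treated as assumptions.

```python
# Hypothetical sketch: query a running LightLLM API server over HTTP.
# Endpoint path, port, and payload fields are assumptions based on the README.
import requests

payload = {
    "inputs": "What is AI?",
    "parameters": {"max_new_tokens": 64, "do_sample": False},
}
resp = requests.post("http://localhost:8080/generate", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json())
```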
AzureOpenAI-with-APIM
AzureOpenAI-with-APIM is a repository that provides a one-button deploy solution for Azure API Management (APIM), Key Vault, and Log Analytics to work seamlessly with Azure OpenAI endpoints. It enables organizations to scale and manage their Azure OpenAI service efficiently by issuing subscription keys via APIM, delivering usage metrics, and implementing policies for access control and cost management. The repository offers detailed guidance on implementing APIM to enhance Azure OpenAI resiliency, scalability, performance, monitoring, and chargeback capabilities.
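In practice, client code usually points the Azure OpenAI SDK at the APIM gateway and authenticates with an APIM subscription key; the endpoint, header name, deployment name, and API version below are assumptions about a typical setup, not values from this repository.

```python
# Hypothetical sketch: call an Azure OpenAI deployment through an APIM gateway.
# Gateway URL, subscription-key header, deployment name, and API version are
# assumptions about a typical APIM setup, not values from this repository.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-apim-instance.azure-api.net",  # APIM gateway URL
    api_key="<APIM-subscription-key>",
    api_version="2024-02-01",
    default_headers={"Ocp-Apim-Subscription-Key": "<APIM-subscription-key>"},
)

response = client.chat.completions.create(
    model="gpt-4o",  # name of the Azure OpenAI deployment exposed via APIM
    messages=[{"role": "user", "content": "Hello through APIM!"}],
)
print(response.choices[0].message.content)
```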
zipnn
ZipNN is a lossless and near-lossless compression library optimized for numbers/tensors in the Foundation Models environment. It automatically prepares data for compression based on its type, allowing users to focus on core tasks without worrying about compression complexities. The library delivers effective compression techniques for different data types and structures, achieving high compression ratios and rates. ZipNN supports various compression methods like ZSTD, lz4, and snappy, and provides ready-made scripts for file compression/decompression. Users can also manually import the package to compress and decompress data. The library offers advanced configuration options for customization and validation tests for different input and compression types.
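For a feel of the described API, the sketch below compresses and restores a tensor-like byte buffer with ZipNN; the constructor argument and method names are assumptions drawn from the project's examples, so verify them against the README.

```python
# Hypothetical sketch of a ZipNN byte-level compress/decompress round trip.
# Constructor arguments and method names are assumptions; check the README.
import numpy as np
from zipnn import ZipNN

original = np.random.rand(1024).astype(np.float32).tobytes()

zpn = ZipNN(input_format="byte")     # assumed byte-oriented mode
compressed = zpn.compress(original)
restored = zpn.decompress(compressed)

assert restored == original
print(f"compressed {len(original)} bytes down to {len(compressed)} bytes")
```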
awesome-transformer-nlp
This repository contains a hand-curated list of great machine (deep) learning resources for Natural Language Processing (NLP) with a focus on Generative Pre-trained Transformer (GPT), Bidirectional Encoder Representations from Transformers (BERT), attention mechanism, Transformer architectures/networks, Chatbot, and transfer learning in NLP.
Efficient-LLMs-Survey
This repository provides a systematic and comprehensive review of efficient LLMs research. We organize the literature in a taxonomy consisting of three main categories, covering distinct yet interconnected efficient-LLM topics from model-centric, data-centric, and framework-centric perspectives. We hope our survey and this GitHub repository can serve as valuable resources to help researchers and practitioners gain a systematic understanding of the research developments in efficient LLMs and inspire them to contribute to this important and exciting field.
ray-llm
RayLLM (formerly known as Aviary) is an LLM serving solution that makes it easy to deploy and manage a variety of open source LLMs, built on Ray Serve. It provides an extensive suite of pre-configured open source LLMs, with defaults that work out of the box. RayLLM supports Transformer models hosted on the Hugging Face Hub or present on local disk. It simplifies the deployment of multiple LLMs and the addition of new LLMs, and offers unique autoscaling support, including scale-to-zero. RayLLM fully supports multi-GPU and multi-node model deployments and offers high-performance features like continuous batching, quantization, and streaming. It provides a REST API similar to OpenAI's, making it easy to migrate existing applications and cross-test deployments. RayLLM supports multiple LLM backends out of the box, including vLLM and TensorRT-LLM.
Awesome-Segment-Anything
Awesome-Segment-Anything is a powerful tool for segmenting and extracting information from various types of data. It provides a user-friendly interface to easily define segmentation rules and apply them to text, images, and other data formats. The tool supports both supervised and unsupervised segmentation methods, allowing users to customize the segmentation process based on their specific needs. With its versatile functionality and intuitive design, Awesome-Segment-Anything is ideal for data analysts, researchers, content creators, and anyone looking to efficiently extract valuable insights from complex datasets.
Awesome_LLM_System-PaperList
Since the emergence of ChatGPT in 2022, accelerating Large Language Model inference and serving has become increasingly important. Here is a list of papers on LLM inference and serving.
20 - OpenAI GPTs
UX & UI
Gives you tips and suggestions on how you can improve your application for your users.
Memory Enhancer
Offers exercises and techniques to improve memory retention and cognitive functions.
English Conversation Role Play Creator
Generates conversation examples and chunks for specified situations. Improve your instantaneous conversational skills through repetitive practice!
Customer Retention Consultant
Analyzes customer churn and provides strategies to improve loyalty and retention.
Agile Coach Expert
An Agile expert providing practical, step-by-step advice on the agile way of working for your team and organisation, whether you're looking to improve your Agile skills or find solutions to specific problems. Includes Scrum, Kanban, and SAFe knowledge.
Kemi - Research & Creative Assistant
I improve marketing effectiveness by designing stunning research-led assets in a flash!
Quickest Feedback for Language Learner
Helps improve language skills through interactive scenarios and feedback.
Le VPN - Your Secure Internet Proxy
Bypass Internet censorship & improve your security online
Sales Role-Playing to Build Practical Skills: [Expert Class]
Interactive learning assistant to improve practical skills.
Your personal GRC & Security Tutor
A training tool for infosec professionals to improve their skills in GRC & security and help obtain related certifications.
Anna, the Ethical Essay Guide
Guides in structuring essays to improve writing skills, adapting to skill levels.
MetaGPT: Meta Ads AI Marketing Co-Pilot
Expert in Meta advertising that can improve your ROI. Official Meta GPT built by dicer.ai