free-llm-api-resources
A list of free LLM inference resources accessible via API.
Stars: 1189
The 'Free LLM API resources' repository provides a comprehensive list of services offering free access or credits for API-based LLM usage. It includes various providers with details on model names, limits, and notes. Users can find information on legitimate services and their respective usage restrictions to leverage LLM capabilities without incurring costs. The repository aims to assist developers and researchers in accessing AI models for experimentation, development, and learning purposes.
README:
This lists various services that provide free access or credits towards API-based LLM usage.
> [!NOTE]
> Please don't abuse these services, or we might lose them.

> [!WARNING]
> This list explicitly excludes any services that are not legitimate (e.g. reverse engineering an existing chatbot).
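Most of the hosted services below expose an OpenAI-compatible `/chat/completions` endpoint, so one request helper covers many of them — only the base URL, key, and model ID change. A minimal sketch using only the standard library; the key is a placeholder, and the model ID shown follows OpenRouter's `:free` naming for its no-cost variants (verify current IDs against each provider's model list):

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, prompt):
    """Build an OpenAI-compatible chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Example against OpenRouter's free tier:
req = build_chat_request(
    "https://openrouter.ai/api/v1",
    "YOUR_API_KEY",
    "meta-llama/llama-3.1-8b-instruct:free",
    "Hello!",
)
# urllib.request.urlopen(req) would send it; omitted here because it
# needs a real key and consumes your daily request quota.
```

The same helper should work for Groq, SambaNova, Cerebras, Mistral, and other OpenAI-compatible providers in the table below by swapping `base_url` and `model`.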
| Provider | Provider Limits/Notes | Model Name | Model Limits |
|---|---|---|---|
| OpenRouter | 20 requests/minute, 200 requests/day | Gemma 2 9B Instruct | |
| | | Llama 3 8B Instruct | |
| | | Llama 3.1 405B Instruct | |
| | | Llama 3.1 70B Instruct | |
| | | Llama 3.1 8B Instruct | |
| | | Llama 3.2 11B Vision Instruct | |
| | | Llama 3.2 1B Instruct | |
| | | Llama 3.2 3B Instruct | |
| | | Llama 3.2 90B Vision Instruct | |
| | | Mistral 7B Instruct | |
| | | Mythomax L2 13B | |
| | | OpenChat 7B | |
| | | Phi-3 Medium 128k Instruct | |
| | | Phi-3 Mini 128k Instruct | |
| | | Qwen 2 7B Instruct | |
| | | Toppy M 7B | |
| | | Zephyr 7B Beta | |
| Google AI Studio | Data is used for training (when used outside of the UK/CH/EEA/EU). | Gemini 2.0 Flash Experimental | 4,000,000 tokens/minute, 10 requests/minute |
| | | Gemini 1.5 Flash | 1,000,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.5 Flash (Experimental) | 1,000,000 tokens/minute, 1,500 requests/day, 5 requests/minute |
| | | Gemini 1.5 Flash-8B | 1,000,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.5 Flash-8B (Experimental) | 1,000,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.5 Pro | 32,000 tokens/minute, 50 requests/day, 2 requests/minute |
| | | Gemini 1.5 Pro (Experimental) | 1,000,000 tokens/minute, 100 requests/day, 5 requests/minute |
| | | LearnLM 1.5 Pro (Experimental) | 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.0 Pro | 32,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | text-embedding-004 | 150 batch requests/minute, 1,500 requests/minute, 100 content/batch |
| | | embedding-001 | |
| Mistral (La Plateforme) | Free tier (Experiment plan) requires opting into data training and phone number verification. | Open and proprietary Mistral models | 1 request/second, 500,000 tokens/minute, 1,000,000,000 tokens/month |
| Mistral (Codestral) | Currently free to use; monthly subscription-based; requires phone number verification. | Codestral | 30 requests/minute, 2,000 requests/day |
| HuggingFace Serverless Inference | Limited to models smaller than 10GB; some popular models are supported even if they exceed 10GB. | Various open models | 1,000 requests/day (with an account) |
| SambaNova Cloud | | Llama 3.1 8B | 30 requests/minute |
| | | Llama 3.1 70B | 20 requests/minute |
| | | Llama 3.1 405B | 10 requests/minute |
| | | Llama 3.2 1B | 30 requests/minute |
| | | Llama 3.2 3B | 30 requests/minute |
| | | Llama 3.2 11B | 10 requests/minute |
| | | Llama 3.2 90B | 1 request/minute |
| | | Llama 3.3 70B | 20 requests/minute |
| | | Llama Guard 3 8B | 30 requests/minute |
| | | Qwen 2.5 72B | 20 requests/minute |
| | | Qwen 2.5 Coder 32B | 20 requests/minute |
| | | QwQ 32B Preview | 10 requests/minute |
| Cerebras | Free tier restricted to 8K context | Llama 3.1 8B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day |
| | | Llama 3.1 70B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day |
| | | Llama 3.3 70B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day |
| Groq | | Distil Whisper Large v3 | 7,200 audio-seconds/minute, 2,000 requests/day |
| | | Gemma 2 9B Instruct | 14,400 requests/day, 15,000 tokens/minute |
| | | Llama 3 70B | 14,400 requests/day, 6,000 tokens/minute |
| | | Llama 3 8B | 14,400 requests/day, 30,000 tokens/minute |
| | | Llama 3.1 70B | 14,400 requests/day, 6,000 tokens/minute |
| | | Llama 3.1 8B | 14,400 requests/day, 20,000 tokens/minute |
| | | Llama 3.2 11B Vision | 7,000 requests/day, 7,000 tokens/minute |
| | | Llama 3.2 1B | 7,000 requests/day, 7,000 tokens/minute |
| | | Llama 3.2 3B | 7,000 requests/day, 7,000 tokens/minute |
| | | Llama 3.2 90B Vision | 3,500 requests/day, 7,000 tokens/minute |
| | | Llama 3.3 70B | 1,000 requests/day, 6,000 tokens/minute |
| | | Llama 3.3 70B (Speculative Decoding) | 1,000 requests/day, 6,000 tokens/minute |
| | | Llama Guard 3 8B | 14,400 requests/day, 15,000 tokens/minute |
| | | Mixtral 8x7B | 14,400 requests/day, 5,000 tokens/minute |
| | | Whisper Large v3 | 7,200 audio-seconds/minute, 2,000 requests/day |
| | | Whisper Large v3 Turbo | 7,200 audio-seconds/minute, 2,000 requests/day |
| Scaleway Generative APIs (Free Beta) | | BGE-Multilingual-Gemma2 | 600 requests/minute, 1,000,000 tokens/minute |
| | | Llama 3.1 70B Instruct | 300 requests/minute, 100,000 tokens/minute |
| | | Llama 3.1 8B Instruct | 300 requests/minute, 100,000 tokens/minute |
| | | Mistral Nemo 2407 | 300 requests/minute, 100,000 tokens/minute |
| | | Pixtral 12B (2409) | 300 requests/minute, 100,000 tokens/minute |
| | | Qwen2.5 Coder 32B Instruct | |
| | | llama-3.3-70b-instruct | |
| | | sentence-t5-xxl | 600 requests/minute, 1,000,000 tokens/minute |
| OVH AI Endpoints (Free Beta) | | CodeLlama 13B Instruct | 12 requests/minute |
| | | Codestral Mamba 7B v0.1 | 12 requests/minute |
| | | Llama 2 13B Chat | 12 requests/minute |
| | | Llama 3 70B Instruct | 12 requests/minute |
| | | Llama 3 8B Instruct | 12 requests/minute |
| | | Llama 3.1 70B Instruct | 12 requests/minute |
| | | Mathstral 7B v0.1 | 12 requests/minute |
| | | Mistral 7B Instruct | 12 requests/minute |
| | | Mistral Nemo 2407 | 12 requests/minute |
| | | Mixtral 8x22B Instruct | 12 requests/minute |
| | | Mixtral 8x7B Instruct | 12 requests/minute |
| | | llava-next-mistral-7b | 12 requests/minute |
| Together | | Llama 3.2 11B Vision Instruct | Free for 2024 |
| Cohere | 20 requests/minute, 1,000 requests/month | Command-R | Shared limit |
| | | Command-R+ | Shared limit |
| GitHub Models | Extremely restrictive input/output token limits. Rate limits depend on Copilot subscription tier (Free/Pro/Business/Enterprise). | AI21 Jamba 1.5 Large | |
| | | AI21 Jamba 1.5 Mini | |
| | | Codestral 25.01 | |
| | | Cohere Command R | |
| | | Cohere Command R 08-2024 | |
| | | Cohere Command R+ | |
| | | Cohere Command R+ 08-2024 | |
| | | Cohere Embed v3 English | |
| | | Cohere Embed v3 Multilingual | |
| | | JAIS 30b Chat | |
| | | Llama-3.2-11B-Vision-Instruct | |
| | | Llama-3.2-90B-Vision-Instruct | |
| | | Llama-3.3-70B-Instruct | |
| | | Meta-Llama-3-70B-Instruct | |
| | | Meta-Llama-3-8B-Instruct | |
| | | Meta-Llama-3.1-405B-Instruct | |
| | | Meta-Llama-3.1-70B-Instruct | |
| | | Meta-Llama-3.1-8B-Instruct | |
| | | Ministral 3B | |
| | | Mistral Large | |
| | | Mistral Large (2407) | |
| | | Mistral Large 24.11 | |
| | | Mistral Nemo | |
| | | Mistral Small | |
| | | OpenAI GPT-4o | |
| | | OpenAI GPT-4o mini | |
| | | OpenAI Text Embedding 3 (large) | |
| | | OpenAI Text Embedding 3 (small) | |
| | | OpenAI o1 | |
| | | OpenAI o1-mini | |
| | | OpenAI o1-preview | |
| | | Phi-3-medium instruct (128k) | |
| | | Phi-3-medium instruct (4k) | |
| | | Phi-3-mini instruct (128k) | |
| | | Phi-3-mini instruct (4k) | |
| | | Phi-3-small instruct (128k) | |
| | | Phi-3-small instruct (8k) | |
| | | Phi-3.5-MoE instruct (128k) | |
| | | Phi-3.5-mini instruct (128k) | |
| | | Phi-3.5-vision instruct (128k) | |
| | | Phi-4 | |
| Cloudflare Workers AI | 10,000 tokens/day | Deepseek Coder 6.7B Base (AWQ) | |
| | | Deepseek Coder 6.7B Instruct (AWQ) | |
| | | Deepseek Math 7B Instruct | |
| | | Discolm German 7B v1 (AWQ) | |
| | | Falcon 7B Instruct | |
| | | Gemma 2B Instruct (LoRA) | |
| | | Gemma 7B Instruct | |
| | | Gemma 7B Instruct (LoRA) | |
| | | Hermes 2 Pro Mistral 7B | |
| | | Llama 2 13B Chat (AWQ) | |
| | | Llama 2 7B Chat (FP16) | |
| | | Llama 2 7B Chat (INT8) | |
| | | Llama 2 7B Chat (LoRA) | |
| | | Llama 3 8B Instruct | |
| | | Llama 3 8B Instruct | |
| | | Llama 3 8B Instruct (AWQ) | |
| | | Llama 3.1 8B Instruct | |
| | | Llama 3.1 8B Instruct (AWQ) | |
| | | Llama 3.1 8B Instruct (FP8) | |
| | | Llama 3.2 11B Vision Instruct | |
| | | Llama 3.2 1B Instruct | |
| | | Llama 3.2 3B Instruct | |
| | | Llama 3.3 70B Instruct (FP8) | |
| | | LlamaGuard 7B (AWQ) | |
| | | Mistral 7B Instruct v0.1 | |
| | | Mistral 7B Instruct v0.1 (AWQ) | |
| | | Mistral 7B Instruct v0.2 | |
| | | Mistral 7B Instruct v0.2 (LoRA) | |
| | | Neural Chat 7B v3.1 (AWQ) | |
| | | OpenChat 3.5 0106 | |
| | | OpenHermes 2.5 Mistral 7B (AWQ) | |
| | | Phi-2 | |
| | | Qwen 1.5 0.5B Chat | |
| | | Qwen 1.5 1.8B Chat | |
| | | Qwen 1.5 14B Chat (AWQ) | |
| | | Qwen 1.5 7B Chat (AWQ) | |
| | | SQLCoder 7B 2 | |
| | | Starling LM 7B Beta | |
| | | TinyLlama 1.1B Chat v1.0 | |
| | | Una Cybertron 7B v2 (BF16) | |
| | | Zephyr 7B Beta (AWQ) | |
| Google Cloud Vertex AI | Very stringent payment verification for Google Cloud. | Llama 3.1 70B Instruct | Llama 3.1 API Service free during preview; 60 requests/minute |
| | | Llama 3.1 8B Instruct | Llama 3.1 API Service free during preview; 60 requests/minute |
| | | Llama 3.2 90B Vision Instruct | Llama 3.2 API Service free during preview; 30 requests/minute |
| | | Gemini 2.0 Flash Experimental | Experimental Gemini model; 10 requests/minute |
| | | Gemini Flash Experimental | |
| | | Gemini Pro Experimental | |
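Google AI Studio's Gemini API is the main exception to the OpenAI-compatible pattern: it uses its own `generateContent` endpoint and passes the API key as a query parameter rather than a bearer header. A rough stdlib-only sketch (key and model name are placeholders; check Google's docs for current model IDs):

```python
import json
import urllib.request

def build_gemini_request(api_key, model, prompt):
    # Gemini's Generative Language API: key in the query string,
    # prompt nested under contents -> parts -> text.
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent?key={api_key}"
    )
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_gemini_request("YOUR_API_KEY", "gemini-1.5-flash", "Hello!")
# urllib.request.urlopen(req) would send it (needs a real key, and
# counts against the per-minute/per-day quotas in the table above).
```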
| Provider | Credits | Requirements | Models |
|---|---|---|---|
| Together | $1 when you add a payment method | | Various open models |
| Fireworks | $1 | | Various open models |
| Unify | $5 when you add a payment method | | Routes to other providers; various open models and proprietary models (OpenAI, Gemini, Anthropic, Mistral, Perplexity, etc.) |
| NVIDIA NIM | 1,000 API calls for 1 month | | Various open models |
| Baseten | $30 | | Any supported model (pay by compute time) |
| Nebius | $1 | | Various open models |
| Novita | $0.5 | | Various open models |
| Hyperbolic | $10 | | DeepSeek V2.5 |
| | | | DeepSeek V3 |
| | | | Hermes 3 Llama 3.1 70B |
| | | | Llama 3 70B Instruct |
| | | | Llama 3.1 405B Base |
| | | | Llama 3.1 405B Base (FP8) |
| | | | Llama 3.1 405B Instruct |
| | | | Llama 3.1 405B Instruct Virtuals |
| | | | Llama 3.1 70B Instruct |
| | | | Llama 3.1 8B Instruct |
| | | | Llama 3.2 3B Instruct |
| | | | Llama 3.3 70B Instruct |
| | | | Pixtral 12B (2409) |
| | | | Qwen QwQ 32B Preview |
| | | | Qwen2-VL 72B Instruct |
| | | | Qwen2-VL 7B Instruct |
| | | | Qwen2.5 72B Instruct |
| | | | Qwen2.5 Coder 32B Instruct |
| AI21 | $10 for 3 months | | Jamba/Jurassic-2 |
| Upstage | $10 for 3 months | | Solar Pro/Mini |
| NLP Cloud | $15 | Phone number verification | Various open models |
| Alibaba Cloud (International) Model Studio | Token/time-limited trials on a per-model basis | | Various open and proprietary Qwen models |
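Whichever provider you pick, throttling on the client side helps you stay under the "N requests/minute" caps listed above (and keeps these free tiers alive). A small sliding-window sketch; the 60-second window matches the per-minute limits, and the injectable clock exists only to make the helper testable:

```python
import time
from collections import deque

class MinuteRateLimiter:
    """Client-side sliding-window limiter for "N requests/minute" caps.

    Providers enforce their own limits server-side; this just avoids
    tripping them (e.g. MinuteRateLimiter(15) for a 15 requests/minute
    model).
    """

    def __init__(self, max_per_minute, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock          # injectable for testing
        self.sent = deque()         # timestamps of recent requests

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0 if now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.max_per_minute:
            return 0.0
        return 60 - (now - self.sent[0])

    def record(self):
        """Call once per request actually sent."""
        self.sent.append(self.clock())
```

Typical use: call `wait_time()` before each request, `time.sleep()` that long, send, then `record()`. Note this only covers request counts; token-per-minute and per-day budgets need separate accounting.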