free-llm-api-resources
A list of free LLM inference resources accessible via API.
Stars: 1189
The 'Free LLM API resources' repository provides a comprehensive list of services offering free access or credits for API-based LLM usage. It includes various providers with details on model names, limits, and notes. Users can find information on legitimate services and their respective usage restrictions to leverage LLM capabilities without incurring costs. The repository aims to assist developers and researchers in accessing AI models for experimentation, development, and learning purposes.
README:
This lists various services that provide free access or credits towards API-based LLM usage.
> [!NOTE]
> Please don't abuse these services, or we might lose them.

> [!WARNING]
> This list explicitly excludes any services that are not legitimate (e.g. reverse engineering an existing chatbot).
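Most of the hosted services below expose an OpenAI-compatible `/chat/completions` endpoint, so one request helper covers many of them — only the base URL, key, and model ID change. A minimal sketch using only the standard library; the key is a placeholder, and the model ID shown follows OpenRouter's `:free` naming for its no-cost variants (verify current IDs against each provider's model list):

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, prompt):
    """Build an OpenAI-compatible chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Example against OpenRouter's free tier:
req = build_chat_request(
    "https://openrouter.ai/api/v1",
    "YOUR_API_KEY",
    "meta-llama/llama-3.1-8b-instruct:free",
    "Hello!",
)
# urllib.request.urlopen(req) would send it; omitted here because it
# needs a real key and consumes your daily request quota.
```

The same helper should work for Groq, SambaNova, Cerebras, Mistral, and other OpenAI-compatible providers in the table below by swapping `base_url` and `model`.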
| Provider | Provider Limits/Notes | Model Name | Model Limits |
|---|---|---|---|
| OpenRouter | 20 requests/minute, 200 requests/day | Gemma 2 9B Instruct | |
| | | Llama 3 8B Instruct | |
| | | Llama 3.1 405B Instruct | |
| | | Llama 3.1 70B Instruct | |
| | | Llama 3.1 8B Instruct | |
| | | Llama 3.2 11B Vision Instruct | |
| | | Llama 3.2 1B Instruct | |
| | | Llama 3.2 3B Instruct | |
| | | Llama 3.2 90B Vision Instruct | |
| | | Mistral 7B Instruct | |
| | | Mythomax L2 13B | |
| | | OpenChat 7B | |
| | | Phi-3 Medium 128k Instruct | |
| | | Phi-3 Mini 128k Instruct | |
| | | Qwen 2 7B Instruct | |
| | | Toppy M 7B | |
| | | Zephyr 7B Beta | |
| Google AI Studio | Data is used for training (when used outside of the UK/CH/EEA/EU). | Gemini 2.0 Flash Experimental | 4,000,000 tokens/minute, 10 requests/minute |
| | | Gemini 1.5 Flash | 1,000,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.5 Flash (Experimental) | 1,000,000 tokens/minute, 1,500 requests/day, 5 requests/minute |
| | | Gemini 1.5 Flash-8B | 1,000,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.5 Flash-8B (Experimental) | 1,000,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.5 Pro | 32,000 tokens/minute, 50 requests/day, 2 requests/minute |
| | | Gemini 1.5 Pro (Experimental) | 1,000,000 tokens/minute, 100 requests/day, 5 requests/minute |
| | | LearnLM 1.5 Pro (Experimental) | 1,500 requests/day, 15 requests/minute |
| | | Gemini 1.0 Pro | 32,000 tokens/minute, 1,500 requests/day, 15 requests/minute |
| | | text-embedding-004 | 150 batch requests/minute, 1,500 requests/minute, 100 content/batch |
| | | embedding-001 | |
| Mistral (La Plateforme) | Free tier (Experiment plan) requires opting into data training and phone number verification. | Open and proprietary Mistral models | 1 request/second, 500,000 tokens/minute, 1,000,000,000 tokens/month |
| Mistral (Codestral) | Currently free to use; monthly subscription-based; requires phone number verification. | Codestral | 30 requests/minute, 2,000 requests/day |
| HuggingFace Serverless Inference | Limited to models smaller than 10GB; some popular models are supported even if they exceed 10GB. | Various open models | 1,000 requests/day (with an account) |
| SambaNova Cloud | | Llama 3.1 8B | 30 requests/minute |
| | | Llama 3.1 70B | 20 requests/minute |
| | | Llama 3.1 405B | 10 requests/minute |
| | | Llama 3.2 1B | 30 requests/minute |
| | | Llama 3.2 3B | 30 requests/minute |
| | | Llama 3.2 11B | 10 requests/minute |
| | | Llama 3.2 90B | 1 request/minute |
| | | Llama 3.3 70B | 20 requests/minute |
| | | Llama Guard 3 8B | 30 requests/minute |
| | | Qwen 2.5 72B | 20 requests/minute |
| | | Qwen 2.5 Coder 32B | 20 requests/minute |
| | | QwQ 32B Preview | 10 requests/minute |
| Cerebras | Free tier restricted to 8K context | Llama 3.1 8B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day |
| | | Llama 3.1 70B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day |
| | | Llama 3.3 70B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day |
| Groq | | Distil Whisper Large v3 | 7,200 audio-seconds/minute, 2,000 requests/day |
| | | Gemma 2 9B Instruct | 14,400 requests/day, 15,000 tokens/minute |
| | | Llama 3 70B | 14,400 requests/day, 6,000 tokens/minute |
| | | Llama 3 8B | 14,400 requests/day, 30,000 tokens/minute |
| | | Llama 3.1 70B | 14,400 requests/day, 6,000 tokens/minute |
| | | Llama 3.1 8B | 14,400 requests/day, 20,000 tokens/minute |
| | | Llama 3.2 11B Vision | 7,000 requests/day, 7,000 tokens/minute |
| | | Llama 3.2 1B | 7,000 requests/day, 7,000 tokens/minute |
| | | Llama 3.2 3B | 7,000 requests/day, 7,000 tokens/minute |
| | | Llama 3.2 90B Vision | 3,500 requests/day, 7,000 tokens/minute |
| | | Llama 3.3 70B | 1,000 requests/day, 6,000 tokens/minute |
| | | Llama 3.3 70B (Speculative Decoding) | 1,000 requests/day, 6,000 tokens/minute |
| | | Llama Guard 3 8B | 14,400 requests/day, 15,000 tokens/minute |
| | | Mixtral 8x7B | 14,400 requests/day, 5,000 tokens/minute |
| | | Whisper Large v3 | 7,200 audio-seconds/minute, 2,000 requests/day |
| | | Whisper Large v3 Turbo | 7,200 audio-seconds/minute, 2,000 requests/day |
| Scaleway Generative APIs (Free Beta) | | BGE-Multilingual-Gemma2 | 600 requests/minute, 1,000,000 tokens/minute |
| | | Llama 3.1 70B Instruct | 300 requests/minute, 100,000 tokens/minute |
| | | Llama 3.1 8B Instruct | 300 requests/minute, 100,000 tokens/minute |
| | | Mistral Nemo 2407 | 300 requests/minute, 100,000 tokens/minute |
| | | Pixtral 12B (2409) | 300 requests/minute, 100,000 tokens/minute |
| | | Qwen2.5 Coder 32B Instruct | |
| | | llama-3.3-70b-instruct | |
| | | sentence-t5-xxl | 600 requests/minute, 1,000,000 tokens/minute |
| OVH AI Endpoints (Free Beta) | | CodeLlama 13B Instruct | 12 requests/minute |
| | | Codestral Mamba 7B v0.1 | 12 requests/minute |
| | | Llama 2 13B Chat | 12 requests/minute |
| | | Llama 3 70B Instruct | 12 requests/minute |
| | | Llama 3 8B Instruct | 12 requests/minute |
| | | Llama 3.1 70B Instruct | 12 requests/minute |
| | | Mathstral 7B v0.1 | 12 requests/minute |
| | | Mistral 7B Instruct | 12 requests/minute |
| | | Mistral Nemo 2407 | 12 requests/minute |
| | | Mixtral 8x22B Instruct | 12 requests/minute |
| | | Mixtral 8x7B Instruct | 12 requests/minute |
| | | llava-next-mistral-7b | 12 requests/minute |
| Together | | Llama 3.2 11B Vision Instruct | Free for 2024 |
| Cohere | 20 requests/minute, 1,000 requests/month | Command-R | Shared limit |
| | | Command-R+ | Shared limit |
| GitHub Models | Extremely restrictive input/output token limits. Rate limits depend on Copilot subscription tier (Free/Pro/Business/Enterprise). | AI21 Jamba 1.5 Large | |
| | | AI21 Jamba 1.5 Mini | |
| | | Codestral 25.01 | |
| | | Cohere Command R | |
| | | Cohere Command R 08-2024 | |
| | | Cohere Command R+ | |
| | | Cohere Command R+ 08-2024 | |
| | | Cohere Embed v3 English | |
| | | Cohere Embed v3 Multilingual | |
| | | JAIS 30b Chat | |
| | | Llama-3.2-11B-Vision-Instruct | |
| | | Llama-3.2-90B-Vision-Instruct | |
| | | Llama-3.3-70B-Instruct | |
| | | Meta-Llama-3-70B-Instruct | |
| | | Meta-Llama-3-8B-Instruct | |
| | | Meta-Llama-3.1-405B-Instruct | |
| | | Meta-Llama-3.1-70B-Instruct | |
| | | Meta-Llama-3.1-8B-Instruct | |
| | | Ministral 3B | |
| | | Mistral Large | |
| | | Mistral Large (2407) | |
| | | Mistral Large 24.11 | |
| | | Mistral Nemo | |
| | | Mistral Small | |
| | | OpenAI GPT-4o | |
| | | OpenAI GPT-4o mini | |
| | | OpenAI Text Embedding 3 (large) | |
| | | OpenAI Text Embedding 3 (small) | |
| | | OpenAI o1 | |
| | | OpenAI o1-mini | |
| | | OpenAI o1-preview | |
| | | Phi-3-medium instruct (128k) | |
| | | Phi-3-medium instruct (4k) | |
| | | Phi-3-mini instruct (128k) | |
| | | Phi-3-mini instruct (4k) | |
| | | Phi-3-small instruct (128k) | |
| | | Phi-3-small instruct (8k) | |
| | | Phi-3.5-MoE instruct (128k) | |
| | | Phi-3.5-mini instruct (128k) | |
| | | Phi-3.5-vision instruct (128k) | |
| | | Phi-4 | |
| Cloudflare Workers AI | 10,000 tokens/day | Deepseek Coder 6.7B Base (AWQ) | |
| | | Deepseek Coder 6.7B Instruct (AWQ) | |
| | | Deepseek Math 7B Instruct | |
| | | Discolm German 7B v1 (AWQ) | |
| | | Falcon 7B Instruct | |
| | | Gemma 2B Instruct (LoRA) | |
| | | Gemma 7B Instruct | |
| | | Gemma 7B Instruct (LoRA) | |
| | | Hermes 2 Pro Mistral 7B | |
| | | Llama 2 13B Chat (AWQ) | |
| | | Llama 2 7B Chat (FP16) | |
| | | Llama 2 7B Chat (INT8) | |
| | | Llama 2 7B Chat (LoRA) | |
| | | Llama 3 8B Instruct | |
| | | Llama 3 8B Instruct | |
| | | Llama 3 8B Instruct (AWQ) | |
| | | Llama 3.1 8B Instruct | |
| | | Llama 3.1 8B Instruct (AWQ) | |
| | | Llama 3.1 8B Instruct (FP8) | |
| | | Llama 3.2 11B Vision Instruct | |
| | | Llama 3.2 1B Instruct | |
| | | Llama 3.2 3B Instruct | |
| | | Llama 3.3 70B Instruct (FP8) | |
| | | LlamaGuard 7B (AWQ) | |
| | | Mistral 7B Instruct v0.1 | |
| | | Mistral 7B Instruct v0.1 (AWQ) | |
| | | Mistral 7B Instruct v0.2 | |
| | | Mistral 7B Instruct v0.2 (LoRA) | |
| | | Neural Chat 7B v3.1 (AWQ) | |
| | | OpenChat 3.5 0106 | |
| | | OpenHermes 2.5 Mistral 7B (AWQ) | |
| | | Phi-2 | |
| | | Qwen 1.5 0.5B Chat | |
| | | Qwen 1.5 1.8B Chat | |
| | | Qwen 1.5 14B Chat (AWQ) | |
| | | Qwen 1.5 7B Chat (AWQ) | |
| | | SQLCoder 7B 2 | |
| | | Starling LM 7B Beta | |
| | | TinyLlama 1.1B Chat v1.0 | |
| | | Una Cybertron 7B v2 (BF16) | |
| | | Zephyr 7B Beta (AWQ) | |
| Google Cloud Vertex AI | Very stringent payment verification for Google Cloud. | Llama 3.1 70B Instruct | Llama 3.1 API Service free during preview; 60 requests/minute |
| | | Llama 3.1 8B Instruct | Llama 3.1 API Service free during preview; 60 requests/minute |
| | | Llama 3.2 90B Vision Instruct | Llama 3.2 API Service free during preview; 30 requests/minute |
| | | Gemini 2.0 Flash Experimental | Experimental Gemini model; 10 requests/minute |
| | | Gemini Flash Experimental | |
| | | Gemini Pro Experimental | |
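Google AI Studio's Gemini API is the main exception to the OpenAI-compatible pattern: it uses its own `generateContent` endpoint and passes the API key as a query parameter rather than a bearer header. A rough stdlib-only sketch (key and model name are placeholders; check Google's docs for current model IDs):

```python
import json
import urllib.request

def build_gemini_request(api_key, model, prompt):
    # Gemini's Generative Language API: key in the query string,
    # prompt nested under contents -> parts -> text.
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent?key={api_key}"
    )
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_gemini_request("YOUR_API_KEY", "gemini-1.5-flash", "Hello!")
# urllib.request.urlopen(req) would send it (needs a real key, and
# counts against the per-minute/per-day quotas in the table above).
```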
| Provider | Credits | Requirements | Models |
|---|---|---|---|
| Together | $1 when you add a payment method | | Various open models |
| Fireworks | $1 | | Various open models |
| Unify | $5 when you add a payment method | | Routes to other providers; various open models and proprietary models (OpenAI, Gemini, Anthropic, Mistral, Perplexity, etc.) |
| NVIDIA NIM | 1,000 API calls for 1 month | | Various open models |
| Baseten | $30 | | Any supported model (pay by compute time) |
| Nebius | $1 | | Various open models |
| Novita | $0.5 | | Various open models |
| Hyperbolic | $10 | | DeepSeek V2.5 |
| | | | DeepSeek V3 |
| | | | Hermes 3 Llama 3.1 70B |
| | | | Llama 3 70B Instruct |
| | | | Llama 3.1 405B Base |
| | | | Llama 3.1 405B Base (FP8) |
| | | | Llama 3.1 405B Instruct |
| | | | Llama 3.1 405B Instruct Virtuals |
| | | | Llama 3.1 70B Instruct |
| | | | Llama 3.1 8B Instruct |
| | | | Llama 3.2 3B Instruct |
| | | | Llama 3.3 70B Instruct |
| | | | Pixtral 12B (2409) |
| | | | Qwen QwQ 32B Preview |
| | | | Qwen2-VL 72B Instruct |
| | | | Qwen2-VL 7B Instruct |
| | | | Qwen2.5 72B Instruct |
| | | | Qwen2.5 Coder 32B Instruct |
| AI21 | $10 for 3 months | | Jamba/Jurassic-2 |
| Upstage | $10 for 3 months | | Solar Pro/Mini |
| NLP Cloud | $15 | Phone number verification | Various open models |
| Alibaba Cloud (International) Model Studio | Token/time-limited trials on a per-model basis | | Various open and proprietary Qwen models |
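Whichever provider you pick, throttling on the client side helps you stay under the "N requests/minute" caps listed above (and keeps these free tiers alive). A small sliding-window sketch; the 60-second window matches the per-minute limits, and the injectable clock exists only to make the helper testable:

```python
import time
from collections import deque

class MinuteRateLimiter:
    """Client-side sliding-window limiter for "N requests/minute" caps.

    Providers enforce their own limits server-side; this just avoids
    tripping them (e.g. MinuteRateLimiter(15) for a 15 requests/minute
    model).
    """

    def __init__(self, max_per_minute, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock          # injectable for testing
        self.sent = deque()         # timestamps of recent requests

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0 if now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.max_per_minute:
            return 0.0
        return 60 - (now - self.sent[0])

    def record(self):
        """Call once per request actually sent."""
        self.sent.append(self.clock())
```

Typical use: call `wait_time()` before each request, `time.sleep()` that long, send, then `record()`. Note this only covers request counts; token-per-minute and per-day budgets need separate accounting.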