req_llm

Req plugin to query AI providers

ReqLLM is a Req-based library for LLM interactions, offering a unified interface to AI providers through a plugin-based architecture. It brings the composability and middleware advantages of the Req ecosystem to LLM calls, with auto-synced provider and model metadata, typed data structures, ergonomic helpers, built-in streaming, and usage and cost extraction on every response. It can generate text, structured data, and embeddings, and track usage costs. Providers such as Anthropic, OpenAI, Groq, Google, xAI, and OpenRouter ship out of the box, and new providers are straightforward to add. ReqLLM also handles API key management and ships with detailed documentation and a roadmap for future enhancements.

README:

ReqLLM

A Req-based library for LLM interactions, providing a unified interface to AI providers through a plugin-based architecture.

Why ReqLLM?

ReqLLM brings the composability and middleware advantages of the Req ecosystem to LLM interactions. With its plugin architecture, provider/model auto-sync, typed data structures, and ergonomic helpers, it provides a robust foundation for building AI-powered applications in Elixir while leveraging Req's powerful middleware, tracing, and instrumentation capabilities.

Installation

Add req_llm to your list of dependencies in mix.exs:

def deps do
  [
    {:req_llm, "~> 1.0-rc"}
  ]
end

Requirements: Elixir ~> 1.15, OTP 24+

Features

  • 45 providers / 665+ models auto-synced from models.dev (mix req_llm.models.sync)
    • Cost, context length, modality, and capability metadata included
  • Typed data structures for every call
    • Context, Message, ContentPart, StreamChunk, Tool
    • All structs implement Jason.Encoder and can be inspected or persisted (see the sketch after this list)
  • Two ergonomic client layers
    • Low-level Req plugin interface with full HTTP + model metadata
    • Vercel AI-style helpers (generate_text/3, stream_text/3, and their bang (!) variants)
  • Streaming built in (ReqLLM.stream_text/3) — each chunk is a StreamChunk
  • Usage & cost extraction on every response (response.usage)
  • Plugin-based provider system
    • Anthropic, OpenAI, Groq, Google, xAI, and OpenRouter included
    • Automatic parameter translation for model-specific requirements
    • Easily extended with new providers (see the Adding a Provider Guide)
  • Context Codec protocol converts ReqLLM structs to provider wire formats
  • Extensive test matrix (local fixtures + optional live calls)
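
Since every struct implements Jason.Encoder, a context can be serialized straight to JSON. A minimal sketch, reusing the ReqLLM.Context helpers shown later in this README:

# Build a typed Context and round-trip it through JSON.
context = ReqLLM.Context.new([
  ReqLLM.Context.system("You are a helpful assistant"),
  ReqLLM.Context.user("Hello!")
])

context |> Jason.encode!() |> IO.puts()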

Quick Start

# Configure API keys (recommended approach)
ReqLLM.put_key(:anthropic_api_key, "sk-ant-...")

# Override per-request (highest priority)
# Per-request: ReqLLM.generate_text(model, "hi", api_key: "sk-live-...")

# Alternative methods:
# Environment: System.put_env("ANTHROPIC_API_KEY", "sk-ant-...")  
# Application: Application.put_env(:req_llm, :anthropic_api_key, "sk-ant-...")
# .env files are auto-loaded via JidoKeys integration

model = "anthropic:claude-3-sonnet"

# Simple text generation
ReqLLM.generate_text!(model, "Hello world")
#=> "Hello! How can I assist you today?"

# Structured data generation
schema = [name: [type: :string, required: true], age: [type: :pos_integer]]
person = ReqLLM.generate_object!(model, "Generate a person", schema)
#=> %{name: "John Doe", age: 30}

# With system prompts and parameters
# (system/1 and user/1 are imported from ReqLLM.Context)
import ReqLLM.Context

{:ok, response} = ReqLLM.generate_text(
  model,
  [
    system("You are a helpful coding assistant"),
    user("Explain recursion in Elixir")
  ],
  temperature: 0.7,
  max_tokens: 200
)

# Tool calling
weather_tool = ReqLLM.tool(
  name: "get_weather",
  description: "Get current weather for a location",
  parameter_schema: [
    location: [type: :string, required: true, doc: "City name"]
  ],
  callback: fn _args -> {:ok, "Sunny, 72°F"} end
)

{:ok, response} = ReqLLM.generate_text(
  model,
  "What's the weather in Paris?",
  tools: [weather_tool]
)

# Streaming text generation
ReqLLM.stream_text!(model, "Write a short story")
|> Stream.each(&IO.write(&1.text))
|> Stream.run()

# Embeddings
{:ok, embeddings} = ReqLLM.embed_many("openai:text-embedding-3-small", ["Hello", "World"])

Provider Support

Bundled providers: Anthropic, OpenAI, Google, Groq, xAI, and OpenRouter. Capabilities cover chat, streaming, tool calling, and embeddings; support varies by provider, so check each provider's documentation for the exact matrix.

API Key Management

ReqLLM provides flexible API key management with clear precedence order:

  1. Per-request keys (highest priority)
  2. ReqLLM.put_key (recommended - secure in-memory storage)
  3. Application configuration
  4. Environment variables
  5. JidoKeys (.env auto-loading)

# Recommended: Secure in-memory storage
ReqLLM.put_key(:openai_api_key, "sk-...")
ReqLLM.put_key(:anthropic_api_key, "sk-ant-...")

# Per-request override (highest priority)
ReqLLM.generate_text("openai:gpt-4", "Hello", api_key: "sk-...")

# Alternative: Application configuration
Application.put_env(:req_llm, :openai_api_key, "sk-...")

# Alternative: Environment variables
System.put_env("OPENAI_API_KEY", "sk-...")

# .env files are automatically loaded via JidoKeys integration
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...

# Helper functions show expected keys
ReqLLM.Keys.env_var_name(:openai)    #=> "OPENAI_API_KEY"
ReqLLM.Keys.config_key(:openai)      #=> :openai_api_key

# Debug key source (useful for troubleshooting)
{:ok, key, source} = ReqLLM.Keys.get(:openai)
#=> {:ok, "sk-...", :jido}  # Shows key came from ReqLLM.put_key
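
In a real project, the application-configuration path above maps naturally onto config/runtime.exs. A minimal sketch, assuming the same :req_llm config keys shown above:

# config/runtime.exs — read keys from the environment at boot (sketch)
import Config

config :req_llm,
  openai_api_key: System.get_env("OPENAI_API_KEY"),
  anthropic_api_key: System.get_env("ANTHROPIC_API_KEY")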

Usage Cost Tracking

Every response includes detailed usage and cost information:

{:ok, response} = ReqLLM.generate_text("openai:gpt-4", "Hello")

response.usage
#=> %ReqLLM.Usage{
#     input_tokens: 8,
#     output_tokens: 12,
#     total_tokens: 20,
#     input_cost: 0.00024,
#     output_cost: 0.00036,
#     total_cost: 0.0006
#   }
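
Because usage rides along on every response, aggregating spend across calls is straightforward. A small sketch using only the fields shown above:

# Sum the total cost of several calls (sketch).
prompts = ["Hello", "Goodbye"]

total_cost =
  prompts
  |> Enum.map(fn prompt ->
    {:ok, response} = ReqLLM.generate_text("openai:gpt-4", prompt)
    response.usage.total_cost
  end)
  |> Enum.sum()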

Adding a Provider

See the Adding a Provider Guide for detailed instructions on implementing new providers using the ReqLLM.Plugin behaviour.
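
For orientation before reading the guide, a provider module might be shaped roughly like the sketch below. The module name and endpoint are invented, and the real callback set is defined by the ReqLLM.Plugin behaviour, so treat this as a hypothetical outline rather than the actual contract:

# Hypothetical provider skeleton (MyApp.Providers.Acme and the URL are
# made up). prepare_request/4 mirrors the call shape used in the
# lower-level API section below; see the guide for the real callbacks.
defmodule MyApp.Providers.Acme do
  def prepare_request(:chat, _model, _context, _opts) do
    {:ok, Req.new(url: "https://api.acme.example/v1/chat", method: :post)}
  end
end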

Lower-Level Req Plugin API

For advanced use cases, you can use ReqLLM providers directly as Req plugins:

alias ReqLLM.Providers.Anthropic

# Configure your API key
ReqLLM.put_key(:anthropic_api_key, "sk-ant-...")

# Build context and model
context = ReqLLM.Context.new([
  ReqLLM.Context.system("You are a helpful assistant"),  
  ReqLLM.Context.user("Hello!")
])
model = ReqLLM.Model.from!("anthropic:claude-3-sonnet")

# Option 1: Use provider's prepare_request (recommended)
{:ok, request} = Anthropic.prepare_request(:chat, model, context, temperature: 0.7)
{:ok, response} = Req.request(request)

# Option 2: Build Req request manually with attach
request = 
  Req.new(url: "/messages", method: :post)
  |> Anthropic.attach(model, context: context, temperature: 0.7)

{:ok, response} = Req.request(request)

# Access response data
response.body["content"]
#=> [%{"type" => "text", "text" => "Hello! How can I help you today?"}]

This approach gives you full control over the Req pipeline, allowing you to add custom middleware, modify requests, or integrate with existing Req-based applications.
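
For example, you can append your own step to the pipeline. A sketch using Req's standard step API (the step name log_llm_request is ours):

# Attach the provider, then append a custom request step for logging.
request =
  Req.new(url: "/messages", method: :post)
  |> Anthropic.attach(model, context: context)
  |> Req.Request.append_request_steps(
    log_llm_request: fn req ->
      IO.inspect(req.url, label: "LLM request")
      req
    end
  )

{:ok, response} = Req.request(request)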

Roadmap & Status

ReqLLM 1.0-rc.1 is a release candidate. The core API is stable, but minor breaking changes may occur before the final 1.0.0 release based on community feedback.

Planned for 1.x:

  • Additional open-source providers (Ollama, LocalAI)
  • Enhanced streaming capabilities
  • Performance optimizations
  • Extended model metadata

Development

# Install dependencies
mix deps.get

# Run tests with cached fixtures
mix test

# Run tests against live APIs (regenerates fixtures)
LIVE=true mix test

# Run quality checks
mix q  # format, compile, dialyzer, credo

# Generate documentation
mix docs

Testing with Fixtures

ReqLLM uses a sophisticated fixture system powered by LiveFixture:

  • Default mode: Tests run against cached JSON fixtures
  • Live mode (LIVE=true): Tests run against real APIs and regenerate fixtures
  • Provider filtering (FIXTURE_FILTER=anthropic): Regenerate fixtures for specific providers only

Contributing

We welcome contributions! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for your changes
  4. Run mix q to ensure quality standards
  5. Submit a pull request

Running Tests

  • mix test - Run all tests with fixtures
  • LIVE=true mix test - Run against live APIs (requires API keys)
  • FIXTURE_FILTER=openai mix test - Limit to specific provider

License

Copyright 2025 Mike Hostetler

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
