req_llm
Req plugin to query AI providers
Stars: 420
ReqLLM is a Req-based library for LLM interactions, offering a unified interface to AI providers through a plugin-based architecture. It brings composability and middleware advantages to LLM interactions, with features like auto-synced providers/models, typed data structures, ergonomic helpers, streaming capabilities, usage & cost extraction, and a plugin-based provider system. Users can easily generate text, structured data, embeddings, and track usage costs. The tool supports various AI providers like Anthropic, OpenAI, Groq, Google, and xAI, and allows for easy addition of new providers. ReqLLM also provides API key management, detailed documentation, and a roadmap for future enhancements.
README:
Join the community! Come chat about building AI tools with Elixir and coding Elixir with LLMs in The Swarm: Elixir AI Collective Discord server.
A Req-based package that standardizes API calls and responses across LLM providers.
LLM APIs are inconsistent. ReqLLM provides a unified, idiomatic Elixir interface with standardized requests and responses across providers.
Two-layer architecture:
- **High-level API** – Vercel AI SDK-inspired functions (`generate_text/3`, `stream_text/3`, `generate_object/4`, and more) that work uniformly across providers. Standard features, minimal configuration.
- **Low-level API** – Direct Req plugin access for full HTTP control. Built around the OpenAI Chat Completions baseline, with provider-specific callbacks for non-compatible APIs (e.g., Anthropic).
16 Supported Providers:

| Provider | ID | Guide |
|---|---|---|
| Anthropic | `anthropic` | Guide |
| OpenAI | `openai` | Guide |
| Google Gemini | `google` | Guide |
| Google Vertex AI | `google_vertex` | Guide |
| Amazon Bedrock | `amazon_bedrock` | Guide |
| Azure OpenAI | `azure` | Guide |
| Groq | `groq` | Guide |
| xAI | `xai` | Guide |
| OpenRouter | `openrouter` | Guide |
| Cerebras | `cerebras` | Guide |
| Meta Llama | `meta` | Guide |
| Z.AI | `zai` | Guide |
| Z.AI Coder | `zai_coder` | Guide |
| Zenmux | `zenmux` | Guide |
| Venice | `venice` | — |
| vLLM | `vllm` | Ollama |
* Streaming uses Finch directly due to known Req limitations with SSE responses.
The fastest way to get started is with Igniter:

```shell
mix igniter.install req_llm
```

Alternatively, add `req_llm` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:req_llm, "~> 1.6"}
  ]
end
```

Then run:

```shell
mix deps.get
```

```elixir
# Keys are picked up from .env files or environment variables - see `ReqLLM.Keys`
model = "anthropic:claude-haiku-4-5"

ReqLLM.generate_text!(model, "Hello world")
#=> "Hello! How can I assist you today?"
```
```elixir
schema = [name: [type: :string, required: true], age: [type: :pos_integer]]

person = ReqLLM.generate_object!(model, "Generate a person", schema)
#=> %{name: "John Doe", age: 30}
```

```elixir
{:ok, image_response} = ReqLLM.generate_image("openai:gpt-image-1", "A simple red square")
image_bytes = ReqLLM.Response.image_data(image_response)
File.write!("red_square.png", image_bytes)
```
Note: the Google image models `gemini-2.5-flash-image` and `gemini-3-pro-image-preview` reject the `:n` option; specify the image count in the prompt instead.
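As an illustration of that workaround, a hedged sketch of the prompt-based image count (the model ID is an assumption built from the provider table; substitute one you have access to):

```elixir
# Sketch only: Google image models reject :n, so the desired number of
# images goes into the prompt text itself rather than an option.
{:ok, response} =
  ReqLLM.generate_image(
    "google:gemini-2.5-flash-image",
    "Generate three separate images of a simple red square"
  )
```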
```elixir
{:ok, response} = ReqLLM.generate_text(
  model,
  ReqLLM.Context.new([
    ReqLLM.Context.system("You are a helpful coding assistant"),
    ReqLLM.Context.user("Explain recursion in Elixir")
  ]),
  temperature: 0.7,
  max_tokens: 200
)
```

```elixir
{:ok, response} = ReqLLM.generate_text(
  model,
  "What's the weather in Paris?",
  tools: [
    ReqLLM.tool(
      name: "get_weather",
      description: "Get current weather for a location",
      parameter_schema: [
        location: [type: :string, required: true, doc: "City name"]
      ],
      callback: {Weather, :fetch_weather, [:extra, :args]}
    )
  ]
)
```
```elixir
# Streaming text generation
{:ok, response} = ReqLLM.stream_text(model, "Write a short story")

ReqLLM.StreamResponse.tokens(response)
|> Stream.each(&IO.write/1)
|> Stream.run()

# Access usage metadata after streaming
usage = ReqLLM.StreamResponse.usage(response)
```
- **Provider-agnostic model registry**
  - 45 providers / 665+ models sourced from models.dev via the `llm_db` dependency
  - Cost, context length, modality, capability and deprecation metadata included
- **Canonical data model**
  - Typed `Context`, `Message`, `ContentPart`, `Tool`, `StreamChunk`, `Response`, `Usage`
  - Multi-modal content parts (text, image URL, tool call, binary)
  - All structs implement `Jason.Encoder` for simple persistence / inspection
- **Two client layers**
  - Low-level Req plugin with full HTTP control (`Provider.prepare_request/4`, `attach/3`)
  - High-level Vercel-AI-style helpers (`generate_text/3`, `stream_text/3`, `generate_object/4`, bang variants)
- **Structured object generation**
  - `generate_object/4` renders JSON-compatible Elixir maps validated by a NimbleOptions-compiled schema
  - Zero-copy mapping to provider JSON-schema / function-calling endpoints
  - OpenAI native structured outputs with three modes (`:auto` (default), `:json_schema`, `:tool_strict`)
- **Provider-specific capabilities**
  - Anthropic web search for real-time content access (via `provider_options: [web_search: %{max_uses: 5}]`)
  - Extended thinking/reasoning for supported models
  - Prompt caching for cost optimization
  - All provider-specific options documented in provider guides
- **Embedding generation**
  - Single or batch embeddings via `Embedding.generate/3` (not all providers support this)
  - Automatic dimension / encoding validation and usage accounting
- **Production-grade streaming**
  - `stream_text/3` returns a `StreamResponse` with both real-time tokens and async metadata
  - Finch-based streaming with HTTP/2 multiplexing and automatic connection pooling
  - Concurrent metadata collection (usage, finish_reason) without blocking token flow
  - Works uniformly across providers with internal SSE / chunked-response adaptation
- **Usage & cost tracking**
  - `response.usage` exposes input/output tokens and USD cost, calculated from model metadata or provider invoices
- **Schema-driven option validation**
  - All public APIs validate options with NimbleOptions; errors are raised as `ReqLLM.Error.Invalid.*` (Splode)
- **Automatic parameter translation & codecs**
  - Provider DSL translates canonical options (e.g. `max_tokens` -> `max_completion_tokens` for o1 & o3) to provider-specific names
  - Built-in OpenAI-style encoding/decoding with provider callback overrides for custom formats
- **Flexible model specification**
  - Accepts `"provider:model"` strings, `{:provider, "model", opts}` tuples, or `%ReqLLM.Model{}` structs
  - Helper functions for parsing, introspection and default-merging
- **Secure, layered key management (`ReqLLM.Keys`)**
  - Per-request override → application config → env vars / .env files
- **Extensive reliability tooling**
  - Fixture-backed test matrix (`LiveFixture`) supports cached, live, or provider-filtered runs
  - Dialyzer, Credo strict rules, and no-comment enforcement keep code quality high
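As a sketch of the flexible model specification, the three accepted forms should be interchangeable. `ReqLLM.Model.from/1` appears elsewhere in this README; the exact `%ReqLLM.Model{}` field names and the option shown are assumptions:

```elixir
# "provider:model" string
{:ok, m1} = ReqLLM.Model.from("anthropic:claude-haiku-4-5")

# {:provider, "model", opts} tuple (the option is illustrative)
{:ok, m2} = ReqLLM.Model.from({:anthropic, "claude-haiku-4-5", temperature: 0.7})

# Explicit struct (field names assumed)
m3 = %ReqLLM.Model{provider: :anthropic, model: "claude-haiku-4-5"}
```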
ReqLLM aims to make key management as easy and flexible as possible: it needs to just work.
Please submit a PR if your key-management use case is not covered.
Keys are pulled from multiple sources with clear precedence: per-request override → in-memory storage → application config → environment variables → .env files.
```elixir
# Store keys in memory (recommended)
ReqLLM.put_key(:openai_api_key, "sk-...")
ReqLLM.put_key(:anthropic_api_key, "sk-ant-...")

# Retrieve keys with source info
{:ok, key, source} = ReqLLM.get_key(:openai)
```

All functions accept an `api_key` option to override the stored key:

```elixir
ReqLLM.generate_text("anthropic:claude-haiku-4-5", "Hello", api_key: "sk-ant-...")
{:ok, response} = ReqLLM.stream_text("anthropic:claude-haiku-4-5", "Story", api_key: "sk-ant-...")
```

By default, ReqLLM loads .env files from the current working directory at startup. To disable this behavior (e.g., if you manage environment variables yourself):

```elixir
config :req_llm, load_dotenv: false
```

Every response includes detailed usage and cost information calculated from model metadata:
```elixir
{:ok, response} = ReqLLM.generate_text("anthropic:claude-haiku-4-5", "Hello")

response.usage
#=> %{
#     input_tokens: 8,
#     output_tokens: 12,
#     total_tokens: 20,
#     input_cost: 0.00024,
#     output_cost: 0.00036,
#     total_cost: 0.0006
#   }
```

When using web search or generating images, additional usage metadata is available:
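The cost fields follow the usual per-million-token arithmetic. A minimal illustration (the price is hypothetical, chosen so the result matches the `input_cost` in the example usage map):

```elixir
# Hypothetical price: $30 per million input tokens.
input_price_per_mtok = 30.0
input_tokens = 8

# cost = tokens * price-per-million / 1_000_000
input_cost = input_tokens * input_price_per_mtok / 1_000_000
#=> 2.4e-4  (i.e. 0.00024 USD)
```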
```elixir
# Web search usage (Anthropic, OpenAI, xAI, Google)
{:ok, response} =
  ReqLLM.generate_text(model, prompt,
    provider_options: [web_search: %{max_uses: 5}]
  )

response.usage.tool_usage
#=> %{web_search: %{count: 2, unit: "call"}}

response.usage.cost
#=> %{tokens: 0.001, tools: 0.02, images: 0.0, total: 0.021}
```

```elixir
# Image generation usage
{:ok, response} = ReqLLM.generate_image("openai:gpt-image-1", prompt)

response.usage.image_usage
#=> %{generated: %{count: 1, size_class: "1024x1024"}}
```

A telemetry event `[:req_llm, :token_usage]` is published on every request with token counts and calculated costs.
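A minimal sketch of consuming that event with the standard `:telemetry` library. The shape of the measurements and metadata maps is not documented here, so the handler simply inspects them:

```elixir
# Attach a handler to ReqLLM's usage event; handler IDs are arbitrary strings.
:telemetry.attach(
  "my-app-req-llm-usage",
  [:req_llm, :token_usage],
  fn _event_name, measurements, metadata, _config ->
    # Payload keys are an assumption - inspect them in your application.
    IO.inspect({measurements, metadata}, label: "req_llm token usage")
  end,
  nil
)
```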
See `lib/examples/scripts/usage_cost_search_image.exs` for a multi-provider smoke test that validates search-tool and image-generation cost metadata. For comprehensive documentation, see the Usage & Billing Guide.
ReqLLM uses Finch for streaming connections with automatic connection pooling. By default, we use HTTP/1-only pools to work around a known Finch bug with large request bodies:

```elixir
# Default configuration (automatic)
config :req_llm,
  finch: [
    name: ReqLLM.Finch,
    pools: %{
      :default => [protocols: [:http1], size: 1, count: 8]
    }
  ]
```

Important: due to Finch issue #265, HTTP/2 pools may fail when sending request bodies larger than 64KB (large prompts, extensive context windows). This is a bug in Finch's HTTP/2 flow-control implementation, not a limitation of HTTP/2 itself.
If you want to use HTTP/2 pools (e.g., for performance testing, or if you know your prompts are small), you can configure them:

```elixir
# HTTP/2 configuration (use with caution)
config :req_llm,
  finch: [
    name: ReqLLM.Finch,
    pools: %{
      :default => [protocols: [:http2, :http1], size: 1, count: 8]
    }
  ]
```

ReqLLM will error with a helpful message if you try to send a large request body with HTTP/2 pools. The error will reference this section for configuration guidance.
For high-scale deployments with small prompts, you can increase the connection count:

```elixir
# High-scale configuration
config :req_llm,
  finch: [
    name: ReqLLM.Finch,
    pools: %{
      :default => [protocols: [:http1], size: 1, count: 32] # More connections
    }
  ]
```

Advanced users can specify a custom Finch instance per request:

```elixir
{:ok, response} = ReqLLM.stream_text(model, messages, finch_name: MyApp.CustomFinch)
```

The new `StreamResponse` provides flexible access patterns:
```elixir
# Real-time streaming for UI
{:ok, response} = ReqLLM.stream_text(model, "Tell me a story")

ReqLLM.StreamResponse.tokens(response)
|> Stream.each(&broadcast_to_liveview/1)
|> Stream.run()

# Concurrent metadata collection (non-blocking)
Task.start(fn ->
  usage = ReqLLM.StreamResponse.usage(response)
  log_usage(usage)
end)

# Simple text collection
text = ReqLLM.StreamResponse.text(response)

# Backward compatibility with legacy Response
{:ok, legacy_response} = ReqLLM.StreamResponse.to_response(response)
```

ReqLLM uses OpenAI Chat Completions as the baseline API standard. Providers that support this format (like Groq, OpenRouter, and xAI) require minimal overrides using the `ReqLLM.Provider.DSL`. Model metadata is automatically synced from models.dev.
Providers implement the `ReqLLM.Provider` behaviour with functions like `encode_body/1`, `decode_response/1`, and optional parameter translation via `translate_options/3`.
See the Adding a Provider Guide for detailed implementation instructions.
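As a rough, hypothetical sketch of that behaviour's shape (the callback names come from the paragraph above; the return values, `@impl` set, and any required DSL options are assumptions — follow the guide for the real contract):

```elixir
defmodule MyApp.Providers.Acme do
  @behaviour ReqLLM.Provider

  # Encode the canonical chat request into this provider's wire format.
  @impl true
  def encode_body(request), do: request

  # Decode the provider's HTTP response back into canonical structs.
  @impl true
  def decode_response(response), do: response

  # Optionally rename canonical options (e.g. max_tokens) for this provider.
  @impl true
  def translate_options(_operation, _model, opts), do: opts
end
```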
For advanced use cases, you can use ReqLLM providers directly as Req plugins. This is the canonical implementation used by `ReqLLM.generate_text/3`:

```elixir
# The canonical pattern from ReqLLM.Generation.generate_text/3
with {:ok, model} <- ReqLLM.Model.from("anthropic:claude-haiku-4-5"),  # Parse model spec
     {:ok, provider_module} <- ReqLLM.provider(model.provider),        # Get provider module
     {:ok, request} <-
       provider_module.prepare_request(:chat, model, "Hello!", temperature: 0.7),
     {:ok, %Req.Response{body: response}} <- Req.request(request) do   # Execute HTTP request
  {:ok, response}
end
```

```elixir
# Customize the Req pipeline with additional headers or middleware
{:ok, model} = ReqLLM.Model.from("anthropic:claude-haiku-4-5")
{:ok, provider_module} = ReqLLM.provider(model.provider)
{:ok, request} = provider_module.prepare_request(:chat, model, "Hello!", temperature: 0.7)

# Add custom headers or middleware before sending
custom_request =
  request
  |> Req.Request.put_header("x-request-id", "my-custom-id")
  |> Req.Request.put_header("x-source", "my-app")

{:ok, response} = Req.request(custom_request)
```

This approach gives you full control over the Req pipeline, allowing you to add custom middleware, modify requests, or integrate with existing Req-based applications.
- Getting Started – first call and basic concepts
- Configuration – timeouts, connection pools, and global settings
- Core Concepts – architecture & data model
- Data Structures – detailed type information
- Usage & Billing – token costs, tool usage, image costs
- Image Generation – generating images with OpenAI and Google
- Mix Tasks – model sync, compatibility testing, code generation
- Fixture Testing – model validation and supported models
- Adding a Provider – extend with new providers
- Provider Guides: Anthropic, OpenAI, Google, Google Vertex, xAI, Groq, OpenRouter, Amazon Bedrock, Azure, Cerebras, Meta, Z.AI, Z.AI Coder, Zenmux, Ollama/vLLM
ReqLLM has now reached v1.0.0. The core API is stable and ready for production use. We're continuing to refine the library and would love community feedback as we plan the next set of improvements. If you run into anything or have suggestions, please open an issue or PR.
130+ models currently pass our comprehensive fixture-based test suite across 10 providers. The LLM API landscape is highly dynamic. We guarantee that all supported models pass our fixture tests for basic functionality (text generation, streaming, tool calling, structured output, and embeddings where applicable).
These fixture tests are regularly refreshed against live APIs to ensure accuracy and catch provider-side changes. While we can't guarantee every edge case in production, our fixture-based approach provides a reliable baseline that you can verify with `mix mc "*:*"`.
We welcome bug reports and feedback! If you encounter issues with any supported model, please open a GitHub issue with details. The more feedback we receive, the stronger the code will be!
```shell
# Install dependencies
mix deps.get

# Run tests with cached fixtures
mix test

# Run quality checks
mix quality   # format, compile, dialyzer, credo

# Generate documentation
mix docs
```

Tests use cached JSON fixtures by default. To regenerate fixtures against live APIs (optional):

```shell
# Regenerate all fixtures
LIVE=true mix test

# Regenerate specific provider fixtures using test tags
LIVE=true mix test --only "provider:anthropic"
```

We welcome contributions! ReqLLM uses a fixture-based testing approach to ensure reliability across all providers.
Please read CONTRIBUTING.md for detailed guidelines on:
- Core library contributions
- Adding new providers
- Extending provider features
- Testing requirements and fixture generation
- Code quality standards
Quick start:
- Fork the repository
- Create a feature branch
- Add tests with fixtures for your changes
- Run `mix test` and `mix quality` to ensure standards
- Verify `mix mc "*:*"` passes for affected providers
- Submit a pull request
Copyright 2025 Mike Hostetler
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.