
bellman
Golang lib for LLM APIs, ChatGPT, Gemini and Anthropic
Stars: 57

Bellman is a unified interface to interact with language and embedding models, supporting various vendors like VertexAI/Gemini, OpenAI, Anthropic, VoyageAI, and Ollama. It consists of a library for direct interaction with models and a service 'bellmand' for proxying requests with one API key. Bellman simplifies switching between models, vendors, and common tasks like chat, structured data, tools, and binary input. It addresses the lack of official SDKs for major players and differences in APIs, providing a single proxy for handling different models. The library offers clients for different vendors implementing common interfaces for generating and embedding text, enabling easy interchangeability between models.
README:
It tries to be a unified interface for interacting with LLMs and embedding models. In particular, it seeks to make it easier to switch between models and vendors, along with lowering the barrier to get started.
Bellman supports VertexAI/Gemini, OpenAI, Anthropic, VoyageAI and Ollama.
Bellman consists of two parts: the library and the service. The Go library enables you to interact with the different LLM vendors directly, while the service, bellmand, provides a proxy that lets you connect to all providers with one API key.
Bellman supports the things we expect from modern LLMs: chat, structured output, tools and binary input.
This project was built to address the lack of official SDKs/clients for the major players, along with the slight differences in their APIs. It also became clear, when we started to play around with different LLMs in our projects, that the differences, while slight, had implications, and each new model introduced became an overhead. There are other projects out there, like the Go version of LangChain, that deal with some of this. But having one proxy to handle all types of models made it a lot easier for us to iterate over problems, models and solutions.
bellmand is a simple web service that implements the bellman library and exposes it through an HTTP API. The easiest way to get started is to simply run it as a Docker service. To do so you will need:
- Docker installed
- API keys for Anthropic, OpenAI, VertexAI (Google Gemini) and/or VoyageAI
- Ollama installed, https://ollama.com/ (very cool project imho)
## Help / Man
docker run --rm -it modfin/bellman --help
## Example
docker run --rm -d modfin/bellman \
--prometheus-metrics-basic-auth="user:pass" \
--ollama-url=http://localhost:11434 \
--openai-key="$(cat ./credentials/openai-api-key.txt)" \
--anthropic-key="$(cat ./credentials/anthropic-api-key.txt)" \
--voyageai-key="$(cat ./credentials/voyageai-api-key.txt)" \
--google-credential="$(cat ./credentials/google-service-account.json)" \
--google-project=your-google-project \
--google-region=europe-north1 \
--api-key=qwerty
This will start the bellmand service that proxies requests to the model you define in the request.
To use the library directly, install it with go get:

go get github.com/modfin/bellman
The library provides clients for Anthropic, Ollama, OpenAI, VertexAI, VoyageAI and bellmand itself. All the clients implement the same interfaces, gen.Generator and embed.Embeder, and can therefore be used interchangeably.
client, err := anthropic.New(...)
client, err := ollama.New(...)
client, err := openai.New(...)
client, err := vertexai.New(...)
client, err := voyageai.New(...)
client, err := bellman.New(...)
The benefit of using the bellman client, when you are running bellmand, is that you can interchangeably use any model that you wish to interact with.
client, err := bellman.New(...)
llm := client.Generator()
res, err := llm.Model(openai.GenModel_gpt4o_mini).
Prompt(
prompt.AsUser("What company made you?"),
)
fmt.Println(res, err)
// OpenAI
res, err := llm.Model(vertexai.GenModel_gemini_1_5_flash).
Prompt(
prompt.AsUser("What company made you?"),
)
fmt.Println(res, err)
// Google
// or even a custom model that you created yourself (trained)
// or a new model that is not in the library yet
model := gen.Model{
Provider: vertexai.Provider,
Name: "gemini-2.0-flash-exp",
Config: map[string]interface{}{"region": "us-central1"},
}
res, err := llm.Model(model).
Prompt(
prompt.AsUser("What company made you?"),
)
fmt.Println(res, err)
// Google
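Since both the vendor model constants and hand-built gen.Model values can be passed to Model(), one convenient pattern is to loop over a list of candidate models with the same llm value from above. This is a minimal sketch, not from the README, and it assumes the GenModel_* constants are gen.Model values, which the calls above suggest:

models := []gen.Model{
	openai.GenModel_gpt4o_mini,
	vertexai.GenModel_gemini_1_5_flash,
}
for _, m := range models {
	// same Prompt/AsText calls as in the examples above
	res, err := llm.Model(m).
		Prompt(
			prompt.AsUser("What company made you?"),
		)
	if err != nil {
		log.Fatalf("Prompt() error = %v", err)
	}
	answer, err := res.AsText()
	fmt.Println(m.Name, answer, err)
}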
Just normal conversation mode
res, err := openai.New(apiKey).Generator().
Model(openai.GenModel_gpt4o_mini).
Prompt(
prompt.AsUser("What is the distance to the moon?"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
// The average distance from Earth to the Moon is approximately 384,400 kilometers
// (about 238,855 miles). This distance can vary slightly because the Moon's orbit
// is elliptical, ranging from about 363,300 km (225,623 miles) at its closest
// (perigee) to 405,500 km (251,966 miles) at its farthest (apogee). <nil>
Conversation mode with a system prompt
res, err := openai.New(apiKey).Generator().
Model(openai.GenModel_gpt4o_mini).
System("You are an expert movie quoter and like to finish people's sentences with a movie reference").
Prompt(
prompt.AsUser("Who are you going to call?"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
// Ghostbusters! <nil>
Setting things like temperature, max tokens, top p, and stop sequences
res, err := openai.New(apiKey).Generator().
Model(openai.GenModel_gpt4o_mini).
Temperature(0.5).
MaxTokens(100).
TopP(0.9). // should really not be used with temperature
StopAt(".", "!", "?").
Prompt(
prompt.AsUser("Write me a 2 paragraph text about gophers"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
// Gophers are small,
// burrowing rodents belonging to the family Geomyidae,
// primarily found in North America
For many models, you can specify a schema that you want the model's output to conform to. A supporting library that transforms your Go struct into a JSON schema is provided at github.com/modfin/bellman/schema.
There are a few different annotations that you can use on your Go structs to enrich the corresponding JSON schema.
Annotation | Description | Supported |
---|---|---|
json-description | A description of the field, overrides the default description value | * |
json-type | The type of the field, overrides the default type value | * |
json-enum | A list of possible values for the field. Can be used with: slices, string, number, integer, boolean | * |
json-maximum | The maximum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-minimum | The minimum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-exclusive-maximum | The exclusive maximum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-exclusive-minimum | The exclusive minimum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-max-items | The maximum number of items in the array. Can be used with: slices | VertexAI, OpenAI |
json-min-items | The minimum number of items in the array. Can be used with: slices | VertexAI, OpenAI |
json-max-length | The maximum length of the string. Can be used with: string | OpenAI |
json-min-length | The minimum length of the string. Can be used with: string | OpenAI |
json-format | Format of a string, one of: date-time, time, date, duration, email, hostname, ipv4, ipv6, uuid. Can be used with: string | VertexAI, OpenAI |
json-pattern | Regex pattern of a string. Can be used with: string | OpenAI |
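As an illustration (not from the README), several of these tags could be combined on a single struct. The field names below are made up, and the exact value formats for the numeric tags are assumptions based on the json-description usage shown later in this document:

type Movie struct {
	Title    string  `json:"title" json-description:"the title of the movie"`
	Released string  `json:"released" json-format:"date" json-description:"the release date"`
	Rating   float64 `json:"rating" json-minimum:"0" json-maximum:"10" json-description:"score from 0 to 10"`
}

// schema.From(Movie{}) can then be passed to Output(...), as in the example below.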
type Quote struct {
Character string `json:"character"`
Quote string `json:"quote"`
}
type Response struct {
Quotes []Quote `json:"quotes"`
}
llm := vertexai.New(googleConfig).Generator()
res, err := llm.
Model(vertexai.GenModel_gemini_1_5_pro).
Output(schema.From(Response{})).
Prompt(
prompt.AsUser("give me 3 quotes from different characters in Hamlet"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText() // will return the json of the struct
fmt.Println(answer, err)
//{
// "quotes": [
// {
// "character": "Hamlet",
// "quote": "To be or not to be, that is the question."
// },
// {
// "character": "Polonius",
// "quote": "This above all: to thine own self be true."
// },
// {
// "character": "Queen Gertrude",
// "quote": "The lady doth protest too much, methinks."
// }
// ]
//} <nil>
var result Response
err = res.Unmarshal(&result) // just a shorthand to unmarshal the response into your struct
fmt.Println(result, err)
// {[
// {Hamlet To be or not to be, that is the question.}
// {Polonius This above all: to thine own self be true.}
// {Queen Gertrude The lady doth protest too much, methinks.}
// ]} <nil>
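Since the response is now unmarshalled into your own type, the rest is plain Go; for example, iterating over the quotes defined by the structs above:

for _, q := range result.Quotes {
	fmt.Printf("%s: %s\n", q.Character, q.Quote)
}
// Hamlet: To be or not to be, that is the question.
// Polonius: This above all: to thine own self be true.
// Queen Gertrude: The lady doth protest too much, methinks.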
The Bellman library allows you to define and use tools in your prompts. Here is an example of how to define and use a tool:
- Define a tool:

type Args struct {
	Name string `json:"name"`
}

getQuote := tools.NewTool("get_quote",
	tools.WithDescription(
		"a function to get a quote from a person or character in Hamlet",
	),
	tools.WithArgSchema(Args{}),
	tools.WithCallback(func(jsondata string) (string, error) {
		var arg Args
		err := json.Unmarshal([]byte(jsondata), &arg)
		if err != nil {
			return "", err
		}
		return dao.GetQuoteFrom(arg.Name)
	}),
)
- Use the tool in a prompt:

res, err := anthropic.New(apiKey).Generator().
	Model(anthropic.GenModel_3_5_haiku_latest).
	System("You are a Shakespeare quote generator").
	Tools(getQuote).
	// Configure a specific tool to be used, or the setting for it
	Tool(tools.RequiredTool).
	Prompt(
		prompt.AsUser("Give me 3 quotes from different characters"),
	)
if err != nil {
	log.Fatalf("Prompt() error = %v", err)
}

// Evaluate with callback function
err = res.Eval()
if err != nil {
	log.Fatalf("Eval() error = %v", err)
}

// or evaluate yourself
tools, err := res.Tools()
if err != nil {
	log.Fatalf("Tools() error = %v", err)
}
for _, tool := range tools {
	log.Printf("Tool: %s", tool.Name)
	switch tool.Name {
	// ....
	}
}
Images are supported by Gemini, OpenAI and Anthropic.
PDFs are only supported by Gemini and Anthropic.
image := "/9j/4AAQSkZJRgABAQEBLAEsAAD//g......gM4OToWbsBg5mGu0veCcRZO6f0EjK5Jv5X/AP/Z"
data, err := base64.StdEncoding.DecodeString(image)
if err != nil {
t.Fatalf("could not decode image %v", err)
}
res, err := llm.
Prompt(
prompt.AsUserWithData(prompt.MimeImageJPEG, data),
prompt.AsUser("Describe the image to me"),
)
if err != nil {
t.Fatalf("Prompt() error = %v", err)
}
fmt.Println(res.AsText())
// The image contains the word "Hot!" in red text. The text is centered on a white background.
// The exclamation point is after the word. The image is a simple and straightforward
// depiction of the word "hot." <nil>
pdf, err := os.ReadFile("path/to/pdf")
if err != nil {
t.Fatalf("could open file, %v", err)
}
res, err := anthropic.New(apiKey).Generator().
Prompt(
prompt.AsUserWithData(prompt.MimeApplicationPDF, pdf),
prompt.AsUser("Describe to me what is in the PDF"),
)
if err != nil {
t.Fatalf("Prompt() error = %v", err)
}
fmt.Println(res.AsText())
// The image contains the word "Hot!" in red text. The text is centered on a white background.
// The exclamation point is after the word. The image is a simple and straightforward
// depiction of the word "hot." <nil>
Control reasoning by setting a budget for the reasoning tokens, and choose whether or not to return the reasoning data. The default thinking/reasoning behaviour differs depending on the model you are using.
res, err := anthropic.New(apiKey).Generator().
Model(anthropic.GenModel_4_0_sonnet_20250514).
MaxTokens(3000).
ThinkingBudget(2000). // the budget for reasoning tokens, set to 0 to disable reasoning if supported by the selected model
IncludeThinkingParts(true). // if available, includes the reasoning parts in the response (will be summaries for some models)
Prompt(
prompt.AsUser("What is 27 * 453?"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
Some providers have specific configuration that is not supported by the common interface. You can set these options manually in the Config map on the gen.Model struct.
model := gen.Model{
Provider: openai.Provider,
Name: openai.GenModel_gpt5_mini_latest.Name,
Config: map[string]interface{}{
"service_tier": openai.ServiceTierPriority,
},
}
// prompt..
The returned metadata will then contain the service_tier used.
A supporting library is provided for simple agentic tasks.
type GetQuoteArg struct {
StockId int `json:"stock_id" json-description:"the id of a stock for which quote to get"`
}
type Search struct {
Name string `json:"name" json-description:"the name of a stock being looked for"`
}
getQuote := tools.NewTool("get_quote",
tools.WithDescription("a function to get a stock quote based on stock id"),
tools.WithArgSchema(GetQuoteArg{}),
tools.WithCallback(func (jsondata string) (string, error) {
var arg GetQuoteArg
err := json.Unmarshal([]byte(jsondata), &arg)
if err != nil {
return "", err
}
return `{"stock_id": ` + strconv.Itoa(arg.StockId) + `,"price": 123.45}`, nil
}),
)
getStock := tools.NewTool("get_stock",
tools.WithDescription("a function to get a stock based on its name"),
tools.WithArgSchema(Search{}),
tools.WithCallback(func (jsondata string) (string, error) {
var arg Search
err := json.Unmarshal([]byte(jsondata), &arg)
if err != nil {
return "", err
}
return `{"stock_id": 98765}`, nil
}),
)
type Result struct {
StockId int `json:"stock_id"`
Price float64 `json:"price"`
}
llm := anthropic.New(apiKey).Generator()
llm = llm.SetTools(getQuote, getStock)
res, err := agent.Run[Result](5, llm, prompt.AsUser("Get me the price of Volvo B"))
if err != nil {
t.Fatalf("Prompt() error = %v", err)
}
fmt.Printf("==== Result after %d calls ====\n", res.Depth)
fmt.Printf("%+v\n", res.Result)
fmt.Printf("==== Conversation ====\n")
for _, p := range res.Promps {
fmt.Printf("%s: %s\n", p.Role, p.Text)
}
// ==== Result after 2 calls ====
// {StockId:98765 Price:123.45}
// ==== Conversation ====
// user: Get me the price of Volvo B
// assistant: tool function call: get_stock with argument: {"name":"Volvo B"}
// user: result: get_stock => {"stock_id": 98765}
// assistant: tool function call: get_quote with argument: {"stock_id":98765}
// user: result: get_quote => {"stock_id": 98765,"price": 123.45}
// assistant: tool function call: __return_result_tool__ with argument: {"price":123.45,"stock_id":98765}
Bellman integrates with most of the embedding models, as well as the LLMs, provided by the supported providers. There is also VoyageAI, voyageai.com, which deals almost exclusively with embeddings.
client, err := bellman.New(...)
res, err := client.Embed(embed.NewSingleRequest(
context.Background(),
vertexai.EmbedModel_text_005.WithType(embed.TypeDocument),
"The document to embed",
))
fmt.Println(res.SingleAsFloat32())
// [-0.06821047514677048 -0.00014664272021036595 0.011814368888735771 ....], nil
Or using the query type.
client, err := bellman.New(...)
res, err := client.Embed(embed.NewSingleRequest(
context.Background(),
vertexai.EmbedModel_text_005.WithType(embed.TypeQuery),
"The query to embed",
))
fmt.Println(res.SingleAsFloat32())
// [-0.06821047514677048 -0.00014664272021036595 0.011814368888735771 ....], nil
Bellman also supports context-aware embeddings. As of now, only with VoyageAI models.
res, err := client.Embed(embed.NewDocumentRequest(
context.Background(),
voyageai.EmbedModel_voyage_context_3.WithType(embed.TypeDocument),
[]string{"document_chunk_1", "document_chunk_2", "document_chunk_3", ...},
))
fmt.Println(res.AsFloat64())
// [[-0.06821047514677048 ...], [0.011814368888735771 ....], ...], nil
Some embedding models support specific types of input. The API allows you to define what type of text you are sending: for example, embed.TypeDocument for the initial embedding of documents and embed.TypeQuery for producing a vector that is to be compared against them.
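A document vector created with embed.TypeDocument can then be compared with a query vector created with embed.TypeQuery, typically via cosine similarity. The helper below is a generic sketch and not part of bellman; it only operates on the float32 slices returned by SingleAsFloat32() above and uses the standard math package:

// cosineSimilarity compares two equal-length embedding vectors.
// 1 means same direction, 0 means orthogonal (unrelated).
func cosineSimilarity(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// docVec, _ := docRes.SingleAsFloat32()
// queryVec, _ := queryRes.SingleAsFloat32()
// score := cosineSimilarity(docVec, queryVec)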
This project is licensed under the MIT License. See the LICENSE
file for details.
Alternative AI tools for bellman
Similar Open Source Tools


LightRAG
LightRAG is a PyTorch library designed for building and optimizing Retriever-Agent-Generator (RAG) pipelines. It follows principles of simplicity, quality, and optimization, offering developers maximum customizability with minimal abstraction. The library includes components for model interaction, output parsing, and structured data generation. LightRAG facilitates tasks like providing explanations and examples for concepts through a question-answering pipeline.

UniChat
UniChat is a pipeline tool for creating online and offline chat-bots in Unity. It leverages Unity.Sentis and text vector embedding technology to enable offline mode text content search based on vector databases. The tool includes a chain toolkit for embedding LLM and Agent in games, along with middleware components for Text to Speech, Speech to Text, and Sub-classifier functionalities. UniChat also offers a tool for invoking tools based on ReActAgent workflow, allowing users to create personalized chat scenarios and character cards. The tool provides a comprehensive solution for designing flexible conversations in games while maintaining developer's ideas.

instructor-go
Instructor Go is a library that simplifies working with structured outputs from large language models (LLMs). Built on top of `invopop/jsonschema` and utilizing `jsonschema` Go struct tags, it provides a user-friendly API for managing validation, retries, and streaming responses without changing code logic. The library supports LLM provider APIs such as OpenAI, Anthropic, Cohere, and Google, capturing and returning usage data in responses. Users can easily add metadata to struct fields using `jsonschema` tags to enhance model awareness and streamline workflows.

whetstone.chatgpt
Whetstone.ChatGPT is a simple light-weight library that wraps the Open AI API with support for dependency injection. It supports features like GPT 4, GPT 3.5 Turbo, chat completions, audio transcription and translation, vision completions, files, fine tunes, images, embeddings, moderations, and response streaming. The library provides a video walkthrough of a Blazor web app built on it and includes examples such as a command line bot. It offers quickstarts for dependency injection, chat completions, completions, file handling, fine tuning, image generation, and audio transcription.

mcp-go
MCP Go is a Go implementation of the Model Context Protocol (MCP), facilitating seamless integration between LLM applications and external data sources and tools. It handles complex protocol details and server management, allowing developers to focus on building tools. The tool is designed to be fast, simple, and complete, aiming to provide a high-level and easy-to-use interface for developing MCP servers. MCP Go is currently under active development, with core features working and advanced capabilities in progress.

LlmTornado
LLM Tornado is a .NET library designed to simplify the consumption of various large language models (LLMs) from providers like OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosted APIs. It acts as an aggregator, allowing users to easily switch between different LLM providers with just a change in argument. Users can perform tasks such as chatting with documents, voice calling with AI, orchestrating assistants, generating images, and more. The library exposes capabilities through vendor extensions, making it easy to integrate and use multiple LLM providers simultaneously.

OpenAI-DotNet
OpenAI-DotNet is a simple C# .NET client library for OpenAI to use through their RESTful API. It is independently developed and not an official library affiliated with OpenAI. Users need an OpenAI API account to utilize this library. The library targets .NET 6.0 and above, working across various platforms like console apps, winforms, wpf, asp.net, etc., and on Windows, Linux, and Mac. It provides functionalities for authentication, interacting with models, assistants, threads, chat, audio, images, files, fine-tuning, embeddings, and moderations.

modelfusion
ModelFusion is an abstraction layer for integrating AI models into JavaScript and TypeScript applications, unifying the API for common operations such as text streaming, object generation, and tool usage. It provides features to support production environments, including observability hooks, logging, and automatic retries. You can use ModelFusion to build AI applications, chatbots, and agents. ModelFusion is a non-commercial open source project that is community-driven. You can use it with any supported provider. ModelFusion supports a wide range of models including text generation, image generation, vision, text-to-speech, speech-to-text, and embedding models. ModelFusion infers TypeScript types wherever possible and validates model responses. ModelFusion provides an observer framework and logging support. ModelFusion ensures seamless operation through automatic retries, throttling, and error handling mechanisms. ModelFusion is fully tree-shakeable, can be used in serverless environments, and only uses a minimal set of dependencies.

com.openai.unity
com.openai.unity is an OpenAI package for Unity that allows users to interact with OpenAI's API through RESTful requests. It is independently developed and not an official library affiliated with OpenAI. Users can fine-tune models, create assistants, chat completions, and more. The package requires Unity 2021.3 LTS or higher and can be installed via Unity Package Manager or Git URL. Various features like authentication, Azure OpenAI integration, model management, thread creation, chat completions, audio processing, image generation, file management, fine-tuning, batch processing, embeddings, and content moderation are available.

gollm
gollm is a Go package designed to simplify interactions with Large Language Models (LLMs) for AI engineers and developers. It offers a unified API for multiple LLM providers, easy provider and model switching, flexible configuration options, advanced prompt engineering, prompt optimization, memory retention, structured output and validation, provider comparison tools, high-level AI functions, robust error handling and retries, and extensible architecture. The package enables users to create AI-powered golems for tasks like content creation workflows, complex reasoning tasks, structured data generation, model performance analysis, prompt optimization, and creating a mixture of agents.

litegraph
LiteGraph is a property graph database designed for knowledge and artificial intelligence applications. It supports graph relationships, tags, labels, metadata, data, and vectors. LiteGraph can be used in-process with LiteGraphClient or as a standalone RESTful server with LiteGraph.Server. The latest version includes major internal refactor, batch APIs, enumeration APIs, statistics APIs, database caching, vector search enhancements, and bug fixes. LiteGraph allows for simple embedding into applications without user configuration. Users can create tenants, graphs, nodes, edges, and perform operations like finding routes and exporting to GEXF file. It also provides features for working with object labels, tags, data, and vectors, enabling filtering and searching based on various criteria. LiteGraph offers REST API deployment with LiteGraph.Server and Docker support with a Docker image available on Docker Hub.

swarmgo
SwarmGo is a Go package designed to create AI agents capable of interacting, coordinating, and executing tasks. It focuses on lightweight agent coordination and execution, offering powerful primitives like Agents and handoffs. SwarmGo enables building scalable solutions with rich dynamics between tools and networks of agents, all while keeping the learning curve low. It supports features like memory management, streaming support, concurrent agent execution, LLM interface, and structured workflows for organizing and coordinating multiple agents.

letta
Letta is an open source framework for building stateful LLM applications. It allows users to build stateful agents with advanced reasoning capabilities and transparent long-term memory. The framework is white box and model-agnostic, enabling users to connect to various LLM API backends. Letta provides a graphical interface, the Letta ADE, for creating, deploying, interacting, and observing with agents. Users can access Letta via REST API, Python, Typescript SDKs, and the ADE. Letta supports persistence by storing agent data in a database, with PostgreSQL recommended for data migrations. Users can install Letta using Docker or pip, with Docker defaulting to PostgreSQL and pip defaulting to SQLite. Letta also offers a CLI tool for interacting with agents. The project is open source and welcomes contributions from the community.

ChatRex
ChatRex is a Multimodal Large Language Model (MLLM) designed to seamlessly integrate fine-grained object perception and robust language understanding. By adopting a decoupled architecture with a retrieval-based approach for object detection and leveraging high-resolution visual inputs, ChatRex addresses key challenges in perception tasks. It is powered by the Rexverse-2M dataset with diverse image-region-text annotations. ChatRex can be applied to various scenarios requiring fine-grained perception, such as object detection, grounded conversation, grounded image captioning, and region understanding.

agent-kit
AgentKit is a framework for creating and orchestrating AI Agents, enabling developers to build, test, and deploy reliable AI applications at scale. It allows for creating networked agents with separate tasks and instructions to solve specific tasks, as well as simple agents for tasks like writing content. The framework requires the Inngest TypeScript SDK as a dependency and provides documentation on agents, tools, network, state, and routing. Example projects showcase AgentKit in action, such as the Test Writing Network demo using Workflow Kit, Supabase, and OpenAI.
For similar tasks


ollama-ex
Ollama is a powerful tool for running large language models locally or on your own infrastructure. It provides a full implementation of the Ollama API, support for streaming requests, and tool use capability. Users can interact with Ollama in Elixir to generate completions, chat messages, and perform streaming requests. The tool also supports function calling on compatible models, allowing users to define tools with clear descriptions and arguments. Ollama is designed to facilitate natural language processing tasks and enhance user interactions with language models.

llm_agents
LLM Agents is a small library designed to build agents controlled by large language models. It aims to provide a better understanding of how such agents work in a concise manner. The library allows agents to be instructed by prompts, use custom-built components as tools, and run in a loop of Thought, Action, Observation. The agents leverage language models to generate Thought and Action, while tools like Python REPL, Google search, and Hacker News search provide Observations. The library requires setting up environment variables for OpenAI API and SERPAPI API keys. Users can create their own agents by importing the library and defining tools accordingly.

kagent
Kagent is a Kubernetes native framework for building AI agents, designed to be easy to understand and use. It provides a flexible and powerful way to build, deploy, and manage AI agents in Kubernetes. The framework consists of agents, tools, and model configurations defined as Kubernetes custom resources, making them easy to manage and modify. Kagent is extensible, flexible, observable, declarative, testable, and has core components like a controller, UI, engine, and CLI.

AgentFly
AgentFly is an extensible framework for building LLM agents with reinforcement learning. It supports multi-turn training by adapting traditional RL methods with token-level masking. It features a decorator-based interface for defining tools and reward functions, enabling seamless extension and ease of use. To support high-throughput training, it implemented asynchronous execution of tool calls and reward computations, and designed a centralized resource management system for scalable environment coordination. A suite of prebuilt tools and environments are provided.

vectorflow
VectorFlow is an open source, high throughput, fault tolerant vector embedding pipeline. It provides a simple API endpoint for ingesting large volumes of raw data, processing, and storing or returning the vectors quickly and reliably. The tool supports text-based files like TXT, PDF, HTML, and DOCX, and can be run locally with Kubernetes in production. VectorFlow offers functionalities like embedding documents, running chunking schemas, custom chunking, and integrating with vector databases like Pinecone, Qdrant, and Weaviate. It enforces a standardized schema for uploading data to a vector store and supports features like raw embeddings webhook, chunk validation webhook, S3 endpoint, and telemetry. The tool can be used with the Python client and provides detailed instructions for running and testing the functionalities.

paper-qa
PaperQA is a minimal package for question and answering from PDFs or text files, providing very good answers with in-text citations. It uses OpenAI Embeddings to embed and search documents, and includes a process of embedding docs, queries, searching for top passages, creating summaries, using an LLM to re-score and select relevant summaries, putting summaries into prompt, and generating answers. The tool can be used to answer specific questions related to scientific research by leveraging citations and relevant passages from documents.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.