
bellman
Golang lib for LLM APIs, ChatGPT, Gemini and Anthropic
Stars: 57

Bellman is a unified interface to interact with language and embedding models, supporting various vendors like VertexAI/Gemini, OpenAI, Anthropic, VoyageAI, and Ollama. It consists of a library for direct interaction with models and a service 'bellmand' for proxying requests with one API key. Bellman simplifies switching between models, vendors, and common tasks like chat, structured data, tools, and binary input. It addresses the lack of official SDKs for major players and differences in APIs, providing a single proxy for handling different models. The library offers clients for different vendors implementing common interfaces for generating and embedding text, enabling easy interchangeability between models.
README:
It tries to be a unified interface for interacting with LLMs and embedding models. In particular, it seeks to make it easier to switch between models and vendors, along with lowering the barrier to get started.
Bellman supports VertexAI/Gemini, OpenAI, Anthropic, VoyageAI and Ollama.
Bellman consists of two parts: the library and the service. The Go library enables you to interact with the different LLM vendors directly, while the service, bellmand, provides a proxy that lets you connect to all providers with one API key.
Bellman supports the things we expect from modern LLMs: chat, structured output, tools and binary input.
This project was built to address the lack of official SDKs/clients for the major players, along with the slight differences in their APIs. It also became clear, when we started to play around with different LLMs in our projects, that the differences, while slight, had implications, and each new model introduced became an overhead. There are other projects out there, like the Go version of LangChain, that deal with some of this. But having one proxy to handle all types of models made it a lot easier for us to iterate over problems, models and solutions.
bellmand is a simple web service that implements the bellman library and exposes it through an HTTP API. The easiest way to get started is to simply run it as a Docker service. To do so you will need:
- Docker installed
- API keys for Anthropic, OpenAI, VertexAI (Google Gemini) and/or VoyageAI
- Ollama installed, https://ollama.com/ (very cool project imho)
## Help / Man
docker run --rm -it modfin/bellman --help
## Example
docker run --rm -d modfin/bellman \
--prometheus-metrics-basic-auth="user:pass" \
--ollama-url=http://localhost:11434 \
--openai-key="$(cat ./credentials/openai-api-key.txt)" \
--anthropic-key="$(cat ./credentials/anthropic-api-key.txt)" \
--voyageai-key="$(cat ./credentials/voyageai-api-key.txt)" \
--google-credential="$(cat ./credentials/google-service-account.json)" \
--google-project=your-google-project \
--google-region=europe-north1 \
--api-key=qwerty
This will start the bellmand service that proxies requests to the model you define in the request.
To use the library directly, install it with go get:

go get github.com/modfin/bellman
The library provides clients for Anthropic, Ollama, OpenAI, VertexAI, VoyageAI and bellmand itself. All the clients implement the same interfaces, gen.Generator and embed.Embeder, and can therefore be used interchangeably.
client, err := anthropic.New(...)
client, err := ollama.New(...)
client, err := openai.New(...)
client, err := vertexai.New(...)
client, err := voyageai.New(...)
client, err := bellman.New(...)
The benefit of using the bellman client, when you are running bellmand, is that you can interchangeably use any model that you wish to interact with.
client, err := bellman.New(...)
llm := client.Generator()
res, err := llm.Model(openai.GenModel_gpt4o_mini).
Prompt(
prompt.AsUser("What company made you?"),
)
fmt.Println(res, err)
// OpenAI
res, err := llm.Model(vertexai.GenModel_gemini_1_5_flash).
Prompt(
prompt.AsUser("What company made you?"),
)
fmt.Println(res, err)
// Google
// or even a custom model that you created yourself (trained)
// or a new model that is not in the library yet
model := gen.Model{
Provider: vertexai.Provider,
Name: "gemini-2.0-flash-exp",
Config: map[string]interface{}{"region": "us-central1"},
}
res, err := llm.Model(model).
Prompt(
prompt.AsUser("What company made you?"),
)
fmt.Println(res, err)
// Google
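Since both the vendor model constants and hand-built gen.Model values can be passed to Model(), one convenient pattern is to loop over a list of candidate models with the same llm value from above. This is a minimal sketch, not from the README, and it assumes the GenModel_* constants are gen.Model values, which the calls above suggest:

models := []gen.Model{
	openai.GenModel_gpt4o_mini,
	vertexai.GenModel_gemini_1_5_flash,
}
for _, m := range models {
	// same Prompt/AsText calls as in the examples above
	res, err := llm.Model(m).
		Prompt(
			prompt.AsUser("What company made you?"),
		)
	if err != nil {
		log.Fatalf("Prompt() error = %v", err)
	}
	answer, err := res.AsText()
	fmt.Println(m.Name, answer, err)
}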
Just normal conversation mode
res, err := openai.New(apiKey).Generator().
Model(openai.GenModel_gpt4o_mini).
Prompt(
prompt.AsUser("What is the distance to the moon?"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
// The average distance from Earth to the Moon is approximately 384,400 kilometers
// (about 238,855 miles). This distance can vary slightly because the Moon's orbit
// is elliptical, ranging from about 363,300 km (225,623 miles) at its closest
// (perigee) to 405,500 km (251,966 miles) at its farthest (apogee). <nil>
Conversation mode with a system prompt
res, err := openai.New(apiKey).Generator().
Model(openai.GenModel_gpt4o_mini).
System("You are an expert movie quoter and like to finish people's sentences with a movie reference").
Prompt(
prompt.AsUser("Who are you going to call?"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
// Ghostbusters! <nil>
Setting things like temperature, max tokens, top p, and stop sequences
res, err := openai.New(apiKey).Generator().
Model(openai.GenModel_gpt4o_mini).
Temperature(0.5).
MaxTokens(100).
TopP(0.9). // should really not be used with temperature
StopAt(".", "!", "?").
Prompt(
prompt.AsUser("Write me a 2 paragraph text about gophers"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
// Gophers are small,
// burrowing rodents belonging to the family Geomyidae,
// primarily found in North America
For many models, you can specify a schema that you want the model's output to conform to. A supporting library that transforms your Go struct into a JSON schema is provided at github.com/modfin/bellman/schema.
There are a few different annotations that you can use on your Go structs to enrich the corresponding JSON schema.
Annotation | Description | Supported |
---|---|---|
json-description | A description of the field, overrides the default description value | * |
json-type | The type of the field, overrides the default type value | * |
json-enum | A list of possible values for the field. Can be used with: slices, string, number, integer, boolean | * |
json-maximum | The maximum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-minimum | The minimum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-exclusive-maximum | The exclusive maximum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-exclusive-minimum | The exclusive minimum value for the field. Can be used with: number, integer | VertexAI, OpenAI |
json-max-items | The maximum number of items in the array. Can be used with: slices | VertexAI, OpenAI |
json-min-items | The minimum number of items in the array. Can be used with: slices | VertexAI, OpenAI |
json-max-length | The maximum length of the string. Can be used with: string | OpenAI |
json-min-length | The minimum length of the string. Can be used with: string | OpenAI |
json-format | Format of a string, one of: date-time, time, date, duration, email, hostname, ipv4, ipv6, uuid. Can be used with: string | VertexAI, OpenAI |
json-pattern | Regex pattern of a string. Can be used with: string | OpenAI |
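As an illustration (not from the README), several of these tags could be combined on a single struct. The field names below are made up, and the exact value formats for the numeric tags are assumptions based on the json-description usage shown later in this document:

type Movie struct {
	Title    string  `json:"title" json-description:"the title of the movie"`
	Released string  `json:"released" json-format:"date" json-description:"the release date"`
	Rating   float64 `json:"rating" json-minimum:"0" json-maximum:"10" json-description:"score from 0 to 10"`
}

// schema.From(Movie{}) can then be passed to Output(...), as in the example below.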
type Quote struct {
Character string `json:"character"`
Quote string `json:"quote"`
}
type Response struct {
Quotes []Quote `json:"quotes"`
}
llm := vertexai.New(googleConfig).Generator()
res, err := llm.
Model(vertexai.GenModel_gemini_1_5_pro).
Output(schema.From(Response{})).
Prompt(
prompt.AsUser("give me 3 quotes from different characters in Hamlet"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText() // will return the json of the struct
fmt.Println(answer, err)
//{
// "quotes": [
// {
// "character": "Hamlet",
// "quote": "To be or not to be, that is the question."
// },
// {
// "character": "Polonius",
// "quote": "This above all: to thine own self be true."
// },
// {
// "character": "Queen Gertrude",
// "quote": "The lady doth protest too much, methinks."
// }
// ]
//} <nil>
var result Response
err = res.Unmarshal(&result) // just a shorthand to unmarshal the response into your struct
fmt.Println(result, err)
// {[
// {Hamlet To be or not to be, that is the question.}
// {Polonius This above all: to thine own self be true.}
// {Queen Gertrude The lady doth protest too much, methinks.}
// ]} <nil>
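Since the response is now unmarshalled into your own type, the rest is plain Go; for example, iterating over the quotes defined by the structs above:

for _, q := range result.Quotes {
	fmt.Printf("%s: %s\n", q.Character, q.Quote)
}
// Hamlet: To be or not to be, that is the question.
// Polonius: This above all: to thine own self be true.
// Queen Gertrude: The lady doth protest too much, methinks.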
The Bellman library allows you to define and use tools in your prompts. Here is an example of how to define and use a tool:
- Define a tool:

type Args struct {
	Name string `json:"name"`
}

getQuote := tools.NewTool("get_quote",
	tools.WithDescription(
		"a function to get a quote from a person or character in Hamlet",
	),
	tools.WithArgSchema(Args{}),
	tools.WithCallback(func(jsondata string) (string, error) {
		var arg Args
		err := json.Unmarshal([]byte(jsondata), &arg)
		if err != nil {
			return "", err
		}
		return dao.GetQuoteFrom(arg.Name)
	}),
)
- Use the tool in a prompt:

res, err := anthropic.New(apiKey).Generator().
	Model(anthropic.GenModel_3_5_haiku_latest).
	System("You are a Shakespeare quote generator").
	Tools(getQuote).
	// Configure a specific tool to be used, or the setting for it
	Tool(tools.RequiredTool).
	Prompt(
		prompt.AsUser("Give me 3 quotes from different characters"),
	)
if err != nil {
	log.Fatalf("Prompt() error = %v", err)
}

// Evaluate with callback function
err = res.Eval()
if err != nil {
	log.Fatalf("Eval() error = %v", err)
}

// or evaluate yourself
tools, err := res.Tools()
if err != nil {
	log.Fatalf("Tools() error = %v", err)
}
for _, tool := range tools {
	log.Printf("Tool: %s", tool.Name)
	switch tool.Name {
	// ....
	}
}
Images are supported by Gemini, OpenAI and Anthropic.
PDFs are only supported by Gemini and Anthropic.
image := "/9j/4AAQSkZJRgABAQEBLAEsAAD//g......gM4OToWbsBg5mGu0veCcRZO6f0EjK5Jv5X/AP/Z"
data, err := base64.StdEncoding.DecodeString(image)
if err != nil {
t.Fatalf("could not decode image %v", err)
}
res, err := llm.
Prompt(
prompt.AsUserWithData(prompt.MimeImageJPEG, data),
prompt.AsUser("Describe the image to me"),
)
if err != nil {
t.Fatalf("Prompt() error = %v", err)
}
fmt.Println(res.AsText())
// The image contains the word "Hot!" in red text. The text is centered on a white background.
// The exclamation point is after the word. The image is a simple and straightforward
// depiction of the word "hot." <nil>
pdf, err := os.ReadFile("path/to/pdf")
if err != nil {
t.Fatalf("could open file, %v", err)
}
res, err := anthropic.New(apiKey).Generator().
Prompt(
prompt.AsUserWithData(prompt.MimeApplicationPDF, pdf),
prompt.AsUser("Describe to me what is in the PDF"),
)
if err != nil {
t.Fatalf("Prompt() error = %v", err)
}
fmt.Println(res.AsText())
// The image contains the word "Hot!" in red text. The text is centered on a white background.
// The exclamation point is after the word. The image is a simple and straightforward
// depiction of the word "hot." <nil>
Control reasoning by setting a budget for the reasoning tokens, and choose whether or not to return the reasoning data. The default thinking/reasoning behaviour differs depending on the model you are using.
res, err := anthropic.New(apiKey).Generator().
Model(anthropic.GenModel_4_0_sonnet_20250514).
MaxTokens(3000).
ThinkingBudget(2000). // the budget for reasoning tokens, set to 0 to disable reasoning if supported by the selected model
IncludeThinkingParts(true). // if available, includes the reasoning parts in the response (will be summaries for some models)
Prompt(
prompt.AsUser("What is 27 * 453?"),
)
if err != nil {
log.Fatalf("Prompt() error = %v", err)
}
answer, err := res.AsText()
fmt.Println(answer, err)
Some providers have specific configuration that is not supported by the common interface. You can set these options manually in the Config map on the gen.Model struct.
model := gen.Model{
Provider: openai.Provider,
Name: openai.GenModel_gpt5_mini_latest.Name,
Config: map[string]interface{}{
"service_tier": openai.ServiceTierPriority,
},
}
// prompt..
The returned metadata will then contain the service_tier used.
A supporting library is provided for simple agentic tasks.
type GetQuoteArg struct {
StockId int `json:"stock_id" json-description:"the id of a stock for which quote to get"`
}
type Search struct {
Name string `json:"name" json-description:"the name of a stock being looked for"`
}
getQuote := tools.NewTool("get_quote",
tools.WithDescription("a function to get a stock quote based on stock id"),
tools.WithArgSchema(GetQuoteArg{}),
tools.WithCallback(func (jsondata string) (string, error) {
var arg GetQuoteArg
err := json.Unmarshal([]byte(jsondata), &arg)
if err != nil {
return "", err
}
return `{"stock_id": ` + strconv.Itoa(arg.StockId) + `,"price": 123.45}`, nil
}),
)
getStock := tools.NewTool("get_stock",
tools.WithDescription("a function to get a stock based on its name"),
tools.WithArgSchema(Search{}),
tools.WithCallback(func (jsondata string) (string, error) {
var arg Search
err := json.Unmarshal([]byte(jsondata), &arg)
if err != nil {
return "", err
}
return `{"stock_id": 98765}`, nil
}),
)
type Result struct {
StockId int `json:"stock_id"`
Price float64 `json:"price"`
}
llm := anthropic.New(apiKey).Generator()
llm = llm.SetTools(getQuote, getStock)
res, err := agent.Run[Result](5, llm, prompt.AsUser("Get me the price of Volvo B"))
if err != nil {
t.Fatalf("Prompt() error = %v", err)
}
fmt.Printf("==== Result after %d calls ====\n", res.Depth)
fmt.Printf("%+v\n", res.Result)
fmt.Printf("==== Conversation ====\n")
for _, p := range res.Promps {
fmt.Printf("%s: %s\n", p.Role, p.Text)
}
// ==== Result after 2 calls ====
// {StockId:98765 Price:123.45}
// ==== Conversation ====
// user: Get me the price of Volvo B
// assistant: tool function call: get_stock with argument: {"name":"Volvo B"}
// user: result: get_stock => {"stock_id": 98765}
// assistant: tool function call: get_quote with argument: {"stock_id":98765}
// user: result: get_quote => {"stock_id": 98765,"price": 123.45}
// assistant: tool function call: __return_result_tool__ with argument: {"price":123.45,"stock_id":98765}
Bellman integrates with most of the embedding models, as well as the LLMs, provided by the supported providers. There is also VoyageAI, voyageai.com, which deals almost exclusively with embeddings.
client, err := bellman.New(...)
res, err := client.Embed(embed.NewSingleRequest(
context.Background(),
vertexai.EmbedModel_text_005.WithType(embed.TypeDocument),
"The document to embed",
))
fmt.Println(res.SingleAsFloat32())
// [-0.06821047514677048 -0.00014664272021036595 0.011814368888735771 ....], nil
Or using the query type.
client, err := bellman.New(...)
res, err := client.Embed(embed.NewSingleRequest(
context.Background(),
vertexai.EmbedModel_text_005.WithType(embed.TypeQuery),
"The query to embed",
))
fmt.Println(res.SingleAsFloat32())
// [-0.06821047514677048 -0.00014664272021036595 0.011814368888735771 ....], nil
Bellman also supports context-aware embeddings. As of now, only with VoyageAI models.
res, err := client.Embed(embed.NewDocumentRequest(
context.Background(),
voyageai.EmbedModel_voyage_context_3.WithType(embed.TypeDocument),
[]string{"document_chunk_1", "document_chunk_2", "document_chunk_3", ...},
))
fmt.Println(res.AsFloat64())
// [[-0.06821047514677048 ...], [0.011814368888735771 ....], ...], nil
Some embedding models support specific types of input. The API allows you to define what type of text you are sending: for example, embed.TypeDocument for the initial embedding of documents and embed.TypeQuery for producing a vector that is to be compared against them.
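A document vector created with embed.TypeDocument can then be compared with a query vector created with embed.TypeQuery, typically via cosine similarity. The helper below is a generic sketch and not part of bellman; it only operates on the float32 slices returned by SingleAsFloat32() above and uses the standard math package:

// cosineSimilarity compares two equal-length embedding vectors.
// 1 means same direction, 0 means orthogonal (unrelated).
func cosineSimilarity(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// docVec, _ := docRes.SingleAsFloat32()
// queryVec, _ := queryRes.SingleAsFloat32()
// score := cosineSimilarity(docVec, queryVec)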
This project is licensed under the MIT License. See the LICENSE
file for details.
Alternative AI tools for bellman
Similar Open Source Tools


LightRAG
LightRAG is a PyTorch library designed for building and optimizing Retriever-Agent-Generator (RAG) pipelines. It follows principles of simplicity, quality, and optimization, offering developers maximum customizability with minimal abstraction. The library includes components for model interaction, output parsing, and structured data generation. LightRAG facilitates tasks like providing explanations and examples for concepts through a question-answering pipeline.

UniChat
UniChat is a pipeline tool for creating online and offline chat-bots in Unity. It leverages Unity.Sentis and text vector embedding technology to enable offline mode text content search based on vector databases. The tool includes a chain toolkit for embedding LLM and Agent in games, along with middleware components for Text to Speech, Speech to Text, and Sub-classifier functionalities. UniChat also offers a tool for invoking tools based on ReActAgent workflow, allowing users to create personalized chat scenarios and character cards. The tool provides a comprehensive solution for designing flexible conversations in games while maintaining developer's ideas.

instructor-go
Instructor Go is a library that simplifies working with structured outputs from large language models (LLMs). Built on top of `invopop/jsonschema` and utilizing `jsonschema` Go struct tags, it provides a user-friendly API for managing validation, retries, and streaming responses without changing code logic. The library supports LLM provider APIs such as OpenAI, Anthropic, Cohere, and Google, capturing and returning usage data in responses. Users can easily add metadata to struct fields using `jsonschema` tags to enhance model awareness and streamline workflows.

whetstone.chatgpt
Whetstone.ChatGPT is a simple light-weight library that wraps the Open AI API with support for dependency injection. It supports features like GPT 4, GPT 3.5 Turbo, chat completions, audio transcription and translation, vision completions, files, fine tunes, images, embeddings, moderations, and response streaming. The library provides a video walkthrough of a Blazor web app built on it and includes examples such as a command line bot. It offers quickstarts for dependency injection, chat completions, completions, file handling, fine tuning, image generation, and audio transcription.

mcp-go
MCP Go is a Go implementation of the Model Context Protocol (MCP), facilitating seamless integration between LLM applications and external data sources and tools. It handles complex protocol details and server management, allowing developers to focus on building tools. The tool is designed to be fast, simple, and complete, aiming to provide a high-level and easy-to-use interface for developing MCP servers. MCP Go is currently under active development, with core features working and advanced capabilities in progress.

LlmTornado
LLM Tornado is a .NET library designed to simplify the consumption of various large language models (LLMs) from providers like OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosted APIs. It acts as an aggregator, allowing users to easily switch between different LLM providers with just a change in argument. Users can perform tasks such as chatting with documents, voice calling with AI, orchestrating assistants, generating images, and more. The library exposes capabilities through vendor extensions, making it easy to integrate and use multiple LLM providers simultaneously.

OpenAI-DotNet
OpenAI-DotNet is a simple C# .NET client library for OpenAI to use through their RESTful API. It is independently developed and not an official library affiliated with OpenAI. Users need an OpenAI API account to utilize this library. The library targets .NET 6.0 and above, working across various platforms like console apps, winforms, wpf, asp.net, etc., and on Windows, Linux, and Mac. It provides functionalities for authentication, interacting with models, assistants, threads, chat, audio, images, files, fine-tuning, embeddings, and moderations.

modelfusion
ModelFusion is an abstraction layer for integrating AI models into JavaScript and TypeScript applications, unifying the API for common operations such as text streaming, object generation, and tool usage. It provides features to support production environments, including observability hooks, logging, and automatic retries. You can use ModelFusion to build AI applications, chatbots, and agents. ModelFusion is a non-commercial open source project that is community-driven. You can use it with any supported provider. ModelFusion supports a wide range of models including text generation, image generation, vision, text-to-speech, speech-to-text, and embedding models. ModelFusion infers TypeScript types wherever possible and validates model responses. ModelFusion provides an observer framework and logging support. ModelFusion ensures seamless operation through automatic retries, throttling, and error handling mechanisms. ModelFusion is fully tree-shakeable, can be used in serverless environments, and only uses a minimal set of dependencies.

com.openai.unity
com.openai.unity is an OpenAI package for Unity that allows users to interact with OpenAI's API through RESTful requests. It is independently developed and not an official library affiliated with OpenAI. Users can fine-tune models, create assistants, chat completions, and more. The package requires Unity 2021.3 LTS or higher and can be installed via Unity Package Manager or Git URL. Various features like authentication, Azure OpenAI integration, model management, thread creation, chat completions, audio processing, image generation, file management, fine-tuning, batch processing, embeddings, and content moderation are available.

gollm
gollm is a Go package designed to simplify interactions with Large Language Models (LLMs) for AI engineers and developers. It offers a unified API for multiple LLM providers, easy provider and model switching, flexible configuration options, advanced prompt engineering, prompt optimization, memory retention, structured output and validation, provider comparison tools, high-level AI functions, robust error handling and retries, and extensible architecture. The package enables users to create AI-powered golems for tasks like content creation workflows, complex reasoning tasks, structured data generation, model performance analysis, prompt optimization, and creating a mixture of agents.

litegraph
LiteGraph is a property graph database designed for knowledge and artificial intelligence applications. It supports graph relationships, tags, labels, metadata, data, and vectors. LiteGraph can be used in-process with LiteGraphClient or as a standalone RESTful server with LiteGraph.Server. The latest version includes major internal refactor, batch APIs, enumeration APIs, statistics APIs, database caching, vector search enhancements, and bug fixes. LiteGraph allows for simple embedding into applications without user configuration. Users can create tenants, graphs, nodes, edges, and perform operations like finding routes and exporting to GEXF file. It also provides features for working with object labels, tags, data, and vectors, enabling filtering and searching based on various criteria. LiteGraph offers REST API deployment with LiteGraph.Server and Docker support with a Docker image available on Docker Hub.

swarmgo
SwarmGo is a Go package designed to create AI agents capable of interacting, coordinating, and executing tasks. It focuses on lightweight agent coordination and execution, offering powerful primitives like Agents and handoffs. SwarmGo enables building scalable solutions with rich dynamics between tools and networks of agents, all while keeping the learning curve low. It supports features like memory management, streaming support, concurrent agent execution, LLM interface, and structured workflows for organizing and coordinating multiple agents.

letta
Letta is an open source framework for building stateful LLM applications. It allows users to build stateful agents with advanced reasoning capabilities and transparent long-term memory. The framework is white box and model-agnostic, enabling users to connect to various LLM API backends. Letta provides a graphical interface, the Letta ADE, for creating, deploying, interacting, and observing with agents. Users can access Letta via REST API, Python, Typescript SDKs, and the ADE. Letta supports persistence by storing agent data in a database, with PostgreSQL recommended for data migrations. Users can install Letta using Docker or pip, with Docker defaulting to PostgreSQL and pip defaulting to SQLite. Letta also offers a CLI tool for interacting with agents. The project is open source and welcomes contributions from the community.

ChatRex
ChatRex is a Multimodal Large Language Model (MLLM) designed to seamlessly integrate fine-grained object perception and robust language understanding. By adopting a decoupled architecture with a retrieval-based approach for object detection and leveraging high-resolution visual inputs, ChatRex addresses key challenges in perception tasks. It is powered by the Rexverse-2M dataset with diverse image-region-text annotations. ChatRex can be applied to various scenarios requiring fine-grained perception, such as object detection, grounded conversation, grounded image captioning, and region understanding.

agent-kit
AgentKit is a framework for creating and orchestrating AI Agents, enabling developers to build, test, and deploy reliable AI applications at scale. It allows for creating networked agents with separate tasks and instructions to solve specific tasks, as well as simple agents for tasks like writing content. The framework requires the Inngest TypeScript SDK as a dependency and provides documentation on agents, tools, network, state, and routing. Example projects showcase AgentKit in action, such as the Test Writing Network demo using Workflow Kit, Supabase, and OpenAI.
For similar tasks


ollama-ex
Ollama is a powerful tool for running large language models locally or on your own infrastructure. It provides a full implementation of the Ollama API, support for streaming requests, and tool use capability. Users can interact with Ollama in Elixir to generate completions, chat messages, and perform streaming requests. The tool also supports function calling on compatible models, allowing users to define tools with clear descriptions and arguments. Ollama is designed to facilitate natural language processing tasks and enhance user interactions with language models.

llm_agents
LLM Agents is a small library designed to build agents controlled by large language models. It aims to provide a better understanding of how such agents work in a concise manner. The library allows agents to be instructed by prompts, use custom-built components as tools, and run in a loop of Thought, Action, Observation. The agents leverage language models to generate Thought and Action, while tools like Python REPL, Google search, and Hacker News search provide Observations. The library requires setting up environment variables for OpenAI API and SERPAPI API keys. Users can create their own agents by importing the library and defining tools accordingly.

kagent
Kagent is a Kubernetes native framework for building AI agents, designed to be easy to understand and use. It provides a flexible and powerful way to build, deploy, and manage AI agents in Kubernetes. The framework consists of agents, tools, and model configurations defined as Kubernetes custom resources, making them easy to manage and modify. Kagent is extensible, flexible, observable, declarative, testable, and has core components like a controller, UI, engine, and CLI.

AgentFly
AgentFly is an extensible framework for building LLM agents with reinforcement learning. It supports multi-turn training by adapting traditional RL methods with token-level masking. It features a decorator-based interface for defining tools and reward functions, enabling seamless extension and ease of use. To support high-throughput training, it implemented asynchronous execution of tool calls and reward computations, and designed a centralized resource management system for scalable environment coordination. A suite of prebuilt tools and environments are provided.

vectorflow
VectorFlow is an open source, high throughput, fault tolerant vector embedding pipeline. It provides a simple API endpoint for ingesting large volumes of raw data, processing, and storing or returning the vectors quickly and reliably. The tool supports text-based files like TXT, PDF, HTML, and DOCX, and can be run locally with Kubernetes in production. VectorFlow offers functionalities like embedding documents, running chunking schemas, custom chunking, and integrating with vector databases like Pinecone, Qdrant, and Weaviate. It enforces a standardized schema for uploading data to a vector store and supports features like raw embeddings webhook, chunk validation webhook, S3 endpoint, and telemetry. The tool can be used with the Python client and provides detailed instructions for running and testing the functionalities.

paper-qa
PaperQA is a minimal package for question and answering from PDFs or text files, providing very good answers with in-text citations. It uses OpenAI Embeddings to embed and search documents, and includes a process of embedding docs, queries, searching for top passages, creating summaries, using an LLM to re-score and select relevant summaries, putting summaries into prompt, and generating answers. The tool can be used to answer specific questions related to scientific research by leveraging citations and relevant passages from documents.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.