
openai-scala-client
Scala client for OpenAI API and other major LLM providers
Stars: 206

This is a no-nonsense async Scala client for OpenAI API supporting all the available endpoints and params including streaming, chat completion, vision, and voice routines. It provides a single service called OpenAIService that supports various calls such as Models, Completions, Chat Completions, Edits, Images, Embeddings, Batches, Audio, Files, Fine-tunes, Moderations, Assistants, Threads, Thread Messages, Runs, Run Steps, Vector Stores, Vector Store Files, and Vector Store File Batches. The library aims to be self-contained with minimal dependencies and supports API-compatible providers like Azure OpenAI, Azure AI, Anthropic, Google Vertex AI, Groq, Grok, Fireworks AI, OctoAI, TogetherAI, Cerebras, Mistral, Deepseek, Ollama, FastChat, and more.
README:
This is a no-nonsense async Scala client for OpenAI API supporting all the available endpoints and params including streaming, the newest chat completion, vision, and voice routines (as defined here), provided in a single, convenient service called OpenAIService. The supported calls are:
- Models: listModels, and retrieveModel
- Completions: createCompletion
- Chat Completions: createChatCompletion, createChatFunCompletion (deprecated), and createChatToolCompletion
- Edits: createEdit (deprecated)
- Images: createImage, createImageEdit, and createImageVariation
- Embeddings: createEmbeddings
- Batches: createBatch, retrieveBatch, cancelBatch, and listBatches
- Audio: createAudioTranscription, createAudioTranslation, and createAudioSpeech
- Files: listFiles, uploadFile, deleteFile, retrieveFile, and retrieveFileContent
- Fine-tunes: createFineTune, listFineTunes, retrieveFineTune, cancelFineTune, listFineTuneEvents, listFineTuneCheckpoints, and deleteFineTuneModel
- Moderations: createModeration
- Assistants: createAssistant, listAssistants, retrieveAssistant, modifyAssistant, and deleteAssistant
- Threads: createThread, retrieveThread, modifyThread, and deleteThread
- Thread Messages: createThreadMessage, retrieveThreadMessage, modifyThreadMessage, listThreadMessages, retrieveThreadMessageFile, and listThreadMessageFiles
- Runs: createRun, createThreadAndRun, listRuns, retrieveRun, modifyRun, submitToolOutputs, and cancelRun
- Run Steps: listRunSteps, and retrieveRunStep
- Vector Stores: createVectorStore, listVectorStores, retrieveVectorStore, modifyVectorStore, and deleteVectorStore
- Vector Store Files: createVectorStoreFile, listVectorStoreFiles, retrieveVectorStoreFile, and deleteVectorStoreFile
- Vector Store File Batches: createVectorStoreFileBatch, retrieveVectorStoreFileBatch, cancelVectorStoreFileBatch, and listVectorStoreBatchFiles
Note that, to stay consistent with the OpenAI API naming, the service function names match the API endpoint titles/descriptions exactly, in camelCase.
Also, we aimed for the lib to be self-contained with the fewest dependencies possible; therefore, we ended up using only two libs at the top level: play-ahc-ws-standalone and play-ws-standalone-json. Additionally, if dependency injection is required, we use the scala-guice lib as well.
👉 No time to read a lengthy tutorial? Sure, we hear you! Check out the examples to see how to use the lib in practice.
In addition to the OpenAI API, this library also supports API-compatible providers (see examples) such as:
- Azure OpenAI - cloud-based, utilizes OpenAI models but with lower latency
- Azure AI - cloud-based, offers a vast selection of open-source models
- Anthropic - cloud-based, a major competitor to OpenAI, features proprietary/closed-source models such as Claude 3 (Haiku, Sonnet, and Opus). 🔥 New: now also available through Bedrock!
- Google Vertex AI - cloud-based, features proprietary/closed-source models such as Gemini 1.5 Pro and Flash
- Groq - cloud-based provider, known for its superfast inference with LPUs
- Grok - cloud-based provider from x.AI
- Fireworks AI - cloud-based provider
- OctoAI - cloud-based provider
- TogetherAI - cloud-based provider
- Cerebras - cloud-based provider, superfast (akin to Groq)
- Mistral - cloud-based, leading open-source LLM company
- Deepseek - cloud-based provider from China
- Ollama - runs locally, serves as an umbrella for open-source LLMs including LLaMA3, dbrx, and Command-R
- FastChat - runs locally, serves as an umbrella for open-source LLMs such as Vicuna, Alpaca, and FastChat-T5
👉 For background information read an article about the lib/client on Medium.
Also try out our Scala client for Pinecone vector database, or use both clients together! This demo project shows how to generate and store OpenAI embeddings (with the text-embedding-ada-002 model) into Pinecone and query them afterward. The OpenAI + Pinecone combo is commonly used for autonomous AI agents, such as babyAGI and AutoGPT.
✔️ Important: this is a "community-maintained" library and, as such, has no affiliation with the OpenAI company.
The currently supported Scala versions are 2.12, 2.13, and 3.
To install the library, add the following dependency to your build.sbt
"io.cequence" %% "openai-scala-client" % "1.1.2"
or to pom.xml (if you use Maven):
<dependency>
    <groupId>io.cequence</groupId>
    <artifactId>openai-scala-client_2.12</artifactId>
    <version>1.1.2</version>
</dependency>
If you want streaming support, use "io.cequence" %% "openai-scala-client-stream" % "1.1.2" instead.
Config
- Env. variables: OPENAI_SCALA_CLIENT_API_KEY and optionally also OPENAI_SCALA_CLIENT_ORG_ID (if you have one)
- File config (default): openai-scala-client.conf
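For illustration only, a custom config (loadable via ConfigFactory.load, as in the "Custom config" option below) could wire in the same env. variables. The key names here are hypothetical - check them against the bundled openai-scala-client.conf:
# my-openai.conf - illustrative sketch; the actual key names may differ
openai-scala-client {
    apiKey = ${OPENAI_SCALA_CLIENT_API_KEY}
    orgId = ${?OPENAI_SCALA_CLIENT_ORG_ID}   # optional HOCON substitution
}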
I. Obtaining OpenAIService
First you need to provide an implicit execution context as well as an Akka Materializer, e.g., as
implicit val ec = ExecutionContext.global
implicit val materializer = Materializer(ActorSystem())
Then you can obtain a service in one of the following ways.
- Default config (expects env. variable(s) to be set as defined in the Config section)
val service = OpenAIServiceFactory()
- Custom config
val config = ConfigFactory.load("path_to_my_custom_config")
val service = OpenAIServiceFactory(config)
- Without config
val service = OpenAIServiceFactory(
apiKey = "your_api_key",
orgId = Some("your_org_id") // if you have one
)
- For Azure with API Key
val service = OpenAIServiceFactory.forAzureWithApiKey(
resourceName = "your-resource-name",
deploymentId = "your-deployment-id", // usually model name such as "gpt-35-turbo"
apiVersion = "2023-05-15", // newest version
apiKey = "your_api_key"
)
- Minimal OpenAICoreService supporting listModels, createCompletion, createChatCompletion, and createEmbeddings calls - provided e.g. by a FastChat service running on port 8000
val service = OpenAICoreServiceFactory("http://localhost:8000/v1/")
- OpenAIChatCompletionService providing solely createChatCompletion:
- Azure AI - e.g. Cohere R+ model
val service = OpenAIChatCompletionServiceFactory.forAzureAI(
endpoint = sys.env("AZURE_AI_COHERE_R_PLUS_ENDPOINT"),
region = sys.env("AZURE_AI_COHERE_R_PLUS_REGION"),
accessToken = sys.env("AZURE_AI_COHERE_R_PLUS_ACCESS_KEY")
)
- Anthropic - requires the openai-scala-anthropic-client lib and ANTHROPIC_API_KEY
val service = AnthropicServiceFactory.asOpenAI() // or AnthropicServiceFactory.bedrockAsOpenAI
- Google Vertex AI - requires the openai-scala-google-vertexai-client lib and VERTEXAI_LOCATION + VERTEXAI_PROJECT_ID
val service = VertexAIServiceFactory.asOpenAI()
- Groq - requires GROQ_API_KEY
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.groq)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.groq)
- Grok - requires GROK_API_KEY
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.grok)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.grok)
- Fireworks AI - requires FIREWORKS_API_KEY
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.fireworks)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.fireworks)
- Octo AI - requires OCTOAI_TOKEN
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.octoML)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.octoML)
- TogetherAI - requires TOGETHERAI_API_KEY
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.togetherAI)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.togetherAI)
- Cerebras - requires CEREBRAS_API_KEY
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.cerebras)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.cerebras)
- Mistral - requires MISTRAL_API_KEY
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.mistral)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.mistral)
- Ollama
val service = OpenAIChatCompletionServiceFactory(
coreUrl = "http://localhost:11434/v1/"
)
or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(
coreUrl = "http://localhost:11434/v1/"
)
- Note that services with additional streaming support - createCompletionStreamed and createChatCompletionStreamed - are provided by OpenAIStreamedServiceExtra (requires the openai-scala-client-stream lib)
import io.cequence.openaiscala.service.StreamedServiceTypes.OpenAIStreamedService
import io.cequence.openaiscala.service.OpenAIStreamedServiceImplicits._
val service: OpenAIStreamedService = OpenAIServiceFactory.withStreaming()
similarly for a chat-completion service
import io.cequence.openaiscala.service.OpenAIStreamedServiceImplicits._
val service = OpenAIChatCompletionServiceFactory.withStreaming(
coreUrl = "https://api.fireworks.ai/inference/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
)
or only if streaming is required
val service: OpenAIChatCompletionStreamedServiceExtra =
OpenAIChatCompletionStreamedServiceFactory(
coreUrl = "https://api.fireworks.ai/inference/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
)
- Via dependency injection (requires the openai-scala-guice lib)
class MyClass @Inject() (openAIService: OpenAIService) {...}
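The openai-scala-guice lib may already ship its own modules, so treat the following only as a generic scala-guice sketch: the module name is made up, and the wiring simply mirrors the "Default config" option above.
import akka.actor.ActorSystem
import akka.stream.Materializer
import com.google.inject.AbstractModule
import net.codingwell.scalaguice.ScalaModule
import io.cequence.openaiscala.service.{OpenAIService, OpenAIServiceFactory}
import scala.concurrent.ExecutionContext

// hypothetical module - binds OpenAIService to an instance built from the default config
class OpenAIModule extends AbstractModule with ScalaModule {
  override def configure(): Unit = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    implicit val materializer: Materializer = Materializer(ActorSystem("openai-client"))
    bind[OpenAIService].toInstance(OpenAIServiceFactory())
  }
}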
II. Calling functions
Full documentation of each call with its respective inputs and settings is provided in OpenAIService. Since all the calls are async they return responses wrapped in Future.
There is a new project openai-scala-client-examples where you can find a lot of ready-to-use examples!
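For orientation, here is a minimal end-to-end sketch (assuming OPENAI_SCALA_CLIENT_API_KEY is set and the dependencies above are on the classpath); the model id is just an example and the settings import path is an assumption:
import akka.actor.ActorSystem
import akka.stream.Materializer
import io.cequence.openaiscala.domain.{ModelId, UserMessage}
import io.cequence.openaiscala.domain.settings.CreateChatCompletionSettings // path assumed
import io.cequence.openaiscala.service.OpenAIServiceFactory
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext}

object QuickStart extends App {
  implicit val ec: ExecutionContext = ExecutionContext.global
  implicit val materializer: Materializer = Materializer(ActorSystem())

  val service = OpenAIServiceFactory()

  val reply = service
    .createChatCompletion(
      messages = Seq(UserMessage("Say hello in one short sentence.")),
      settings = CreateChatCompletionSettings(model = ModelId.gpt_4o)
    )
    .map(_.choices.head.message.content)

  println(Await.result(reply, 1.minute))
  service.close // see the note at the end of this section
}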
- List models
service.listModels.map(models =>
models.foreach(println)
)
- Retrieve model
service.retrieveModel(ModelId.text_davinci_003).map(model =>
println(model.getOrElse("N/A"))
)
- Create completion
val text = """Extract the name and mailing address from this email:
|Dear Kelly,
|It was great to talk to you at the seminar. I thought Jane's talk was quite good.
|Thank you for the book. Here's my address 2111 Ash Lane, Crestview CA 92002
|Best,
|Maya
""".stripMargin
service.createCompletion(text).map(completion =>
println(completion.choices.head.text)
)
- Create completion with a custom setting
val text = """Extract the name and mailing address from this email:
|Dear Kelly,
|It was great to talk to you at the seminar. I thought Jane's talk was quite good.
|Thank you for the book. Here's my address 2111 Ash Lane, Crestview CA 92002
|Best,
|Maya
""".stripMargin
service.createCompletion(
text,
settings = CreateCompletionSettings(
model = ModelId.gpt_4o,
max_tokens = Some(1500),
temperature = Some(0.9),
presence_penalty = Some(0.2),
frequency_penalty = Some(0.2)
)
).map(completion =>
println(completion.choices.head.text)
)
- Create completion with streaming and a custom setting
val source = service.createCompletionStreamed(
prompt = "Write me a Shakespeare poem about two cats playing baseball in Russia using at least 2 pages",
settings = CreateCompletionSettings(
model = ModelId.text_davinci_003,
max_tokens = Some(1500),
temperature = Some(0.9),
presence_penalty = Some(0.2),
frequency_penalty = Some(0.2)
)
)
source.map(completion =>
println(completion.choices.head.text)
).runWith(Sink.ignore)
For this to work you need to use OpenAIServiceStreamedFactory from the openai-scala-client-stream lib.
- Create chat completion
val createChatCompletionSettings = CreateChatCompletionSettings(
model = ModelId.gpt_4o
)
val messages = Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("Who won the world series in 2020?"),
AssistantMessage("The Los Angeles Dodgers won the World Series in 2020."),
UserMessage("Where was it played?"),
)
service.createChatCompletion(
messages = messages,
settings = createChatCompletionSettings
).map { chatCompletion =>
println(chatCompletion.choices.head.message.content)
}
- Create chat completion for functions
val messages = Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("What's the weather like in San Francisco, Tokyo, and Paris?")
)
// as a param type we can use "number", "string", "boolean", "object", "array", and "null"
val tools = Seq(
FunctionSpec(
name = "get_current_weather",
description = Some("Get the current weather in a given location"),
parameters = Map(
"type" -> "object",
"properties" -> Map(
"location" -> Map(
"type" -> "string",
"description" -> "The city and state, e.g. San Francisco, CA"
),
"unit" -> Map(
"type" -> "string",
"enum" -> Seq("celsius", "fahrenheit")
)
),
"required" -> Seq("location")
)
)
)
// if we want to force the model to use the above function as a response
// we can do so by passing: responseToolChoice = Some("get_current_weather")
service.createChatToolCompletion(
messages = messages,
tools = tools,
responseToolChoice = None, // means "auto"
settings = CreateChatCompletionSettings(ModelId.gpt_3_5_turbo_1106)
).map { response =>
val chatFunCompletionMessage = response.choices.head.message
val toolCalls = chatFunCompletionMessage.tool_calls.collect {
case (id, x: FunctionCallSpec) => (id, x)
}
println(
"tool call ids : " + toolCalls.map(_._1).mkString(", ")
)
println(
"function/tool call names : " + toolCalls.map(_._2.name).mkString(", ")
)
println(
"function/tool call arguments : " + toolCalls.map(_._2.arguments).mkString(", ")
)
}
- Create chat completion with json output
val messages = Seq(
SystemMessage("Give me the most populous capital cities in JSON format."),
UserMessage("List only african countries")
)
val capitalsSchema = JsonSchema.Object(
properties = Map(
"countries" -> JsonSchema.Array(
items = JsonSchema.Object(
properties = Map(
"country" -> JsonSchema.String(
description = Some("The name of the country")
),
"capital" -> JsonSchema.String(
description = Some("The capital city of the country")
)
),
required = Seq("country", "capital")
)
)
),
required = Seq("countries")
)
val jsonSchemaDef = JsonSchemaDef(
name = "capitals_response",
strict = true,
structure = capitalsSchema
)
service
.createChatCompletion(
messages = messages,
settings = DefaultSettings.createJsonChatCompletion(jsonSchemaDef)
)
.map { response =>
val json = Json.parse(messageContent(response))
println(Json.prettyPrint(json))
}
- Count expected used tokens before calling createChatCompletion or createChatFunCompletion - this helps you select a proper model and reduce costs. This is an experimental feature and it may not work for all models. Requires the openai-scala-count-tokens lib.
An example how to count message tokens:
import io.cequence.openaiscala.service.OpenAICountTokensHelper
import io.cequence.openaiscala.domain.{AssistantMessage, BaseMessage, FunctionSpec, ModelId, SystemMessage, UserMessage}
class MyCompletionService extends OpenAICountTokensHelper {
def exec = {
val model = ModelId.gpt_4_turbo_2024_04_09
// messages to be sent to OpenAI
val messages: Seq[BaseMessage] = Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("Who won the world series in 2020?"),
AssistantMessage("The Los Angeles Dodgers won the World Series in 2020."),
UserMessage("Where was it played?"),
)
val tokenCount = countMessageTokens(model, messages)
}
}
An example how to count message tokens when a function is involved:
import io.cequence.openaiscala.service.OpenAICountTokensHelper
import io.cequence.openaiscala.domain.{BaseMessage, FunctionSpec, ModelId, SystemMessage, UserMessage}
class MyCompletionService extends OpenAICountTokensHelper {
def exec = {
val model = ModelId.gpt_4_turbo_2024_04_09
// messages to be sent to OpenAI
val messages: Seq[BaseMessage] =
Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("What's the weather like in San Francisco, Tokyo, and Paris?")
)
// function to be called
val function: FunctionSpec = FunctionSpec(
name = "getWeather",
parameters = Map(
"type" -> "object",
"properties" -> Map(
"location" -> Map(
"type" -> "string",
"description" -> "The city to get the weather for"
),
"unit" -> Map("type" -> "string", "enum" -> List("celsius", "fahrenheit"))
)
)
)
val tokenCount = countFunMessageTokens(model, messages, Seq(function), Some(function.name))
}
}
✔️ Important: After you are done using the service, you should close it by calling service.close. Otherwise, the underlying resources/threads won't be released.
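If you stay fully asynchronous (no Await), one simple pattern - sketched here with plain Future combinators and reusing the messages/settings from the "Create chat completion" example above - is to close the service once the last call has completed:
import scala.util.{Failure, Success}

service
  .createChatCompletion(
    messages = messages,
    settings = createChatCompletionSettings
  )
  .andThen {
    case Success(chatCompletion) => println(chatCompletion.choices.head.message.content)
    case Failure(error) => error.printStackTrace()
  }
  .andThen {
    case _ => service.close // release the underlying resources/threads regardless of the outcome
  }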
III. Using adapters
Adapters for OpenAI services (chat completion, core, or full) are provided by OpenAIServiceAdapters. The adapters are used to distribute the load between multiple services, retry on transient errors, route, or provide additional functionality. See examples for more details.
Note that the adapters can be arbitrarily combined/stacked.
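For instance, a round-robin pair of services can itself be wrapped with the retry adapter (both are shown individually below); imports are omitted here, just like in the surrounding snippets:
val adapters = OpenAIServiceAdapters.forFullService

implicit val retrySettings: RetrySettings = RetrySettings(maxRetries = 5).constantInterval(5.seconds)

// retry transient failures of the combined, round-robin-balanced service
val service = adapters.retry(
  adapters.roundRobin(
    OpenAIServiceFactory("your-api-key1"),
    OpenAIServiceFactory("your-api-key2")
  ),
  Some(println(_)) // simple logging
)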
- Round robin load distribution
val adapters = OpenAIServiceAdapters.forFullService
val service1 = OpenAIServiceFactory("your-api-key1")
val service2 = OpenAIServiceFactory("your-api-key2")
val service = adapters.roundRobin(service1, service2)
- Random order load distribution
val adapters = OpenAIServiceAdapters.forFullService
val service1 = OpenAIServiceFactory("your-api-key1")
val service2 = OpenAIServiceFactory("your-api-key2")
val service = adapters.randomOrder(service1, service2)
- Logging function calls
val adapters = OpenAIServiceAdapters.forFullService
val rawService = OpenAIServiceFactory()
val service = adapters.log(
rawService,
"openAIService",
logger.log
)
- Retry on transient errors (e.g. rate limit error)
val adapters = OpenAIServiceAdapters.forFullService
implicit val retrySettings: RetrySettings = RetrySettings(maxRetries = 10).constantInterval(10.seconds)
val service = adapters.retry(
OpenAIServiceFactory(),
Some(println(_)) // simple logging
)
- Retry on a specific function using RetryHelpers directly
class MyCompletionService @Inject() (
val actorSystem: ActorSystem,
implicit val ec: ExecutionContext,
implicit val scheduler: Scheduler
)(val apiKey: String)
extends RetryHelpers {
val service: OpenAIService = OpenAIServiceFactory(apiKey)
implicit val retrySettings: RetrySettings =
RetrySettings(interval = 10.seconds)
def ask(prompt: String): Future[String] =
for {
completion <- service
.createChatCompletion(
List(MessageSpec(ChatRole.User, prompt))
)
.retryOnFailure
} yield completion.choices.head.message.content
}
- Route chat completion calls based on models
val adapters = OpenAIServiceAdapters.forFullService
// OctoAI
val octoMLService = OpenAIChatCompletionServiceFactory(
coreUrl = "https://text.octoai.run/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("OCTOAI_TOKEN")}"))
)
// Anthropic
val anthropicService = AnthropicServiceFactory.asOpenAI()
// OpenAI
val openAIService = OpenAIServiceFactory()
val service: OpenAIService =
adapters.chatCompletionRouter(
// OpenAI service is default so no need to specify its models here
serviceModels = Map(
octoMLService -> Seq(NonOpenAIModelId.mixtral_8x22b_instruct),
anthropicService -> Seq(
NonOpenAIModelId.claude_2_1,
NonOpenAIModelId.claude_3_opus_20240229,
NonOpenAIModelId.claude_3_haiku_20240307
)
),
openAIService
)
- Chat-to-completion adapter
val adapters = OpenAIServiceAdapters.forCoreService
val service = adapters.chatToCompletion(
OpenAICoreServiceFactory(
coreUrl = "https://api.fireworks.ai/inference/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
)
)
FAQ
- Wen Scala 3? Feb 2023. You are right; we chose the shortest month to do so :) Done!
- I got a timeout exception. How can I change the timeout setting? You can do it either by passing the timeouts param to OpenAIServiceFactory or, if you use your own configuration file, you can simply add it there as:
openai-scala-client {
    timeouts {
        requestTimeoutSec = 200
        readTimeoutSec = 200
        connectTimeoutSec = 5
        pooledConnectionIdleTimeoutSec = 60
    }
}
- I got an exception like com.typesafe.config.ConfigException$UnresolvedSubstitution: openai-scala-client.conf @ jar:file:.../io/cequence/openai-scala-client_2.13/0.0.1/openai-scala-client_2.13-0.0.1.jar!/openai-scala-client.conf: 4: Could not resolve substitution to a value: ${OPENAI_SCALA_CLIENT_API_KEY}. What should I do?
Set the env. variable OPENAI_SCALA_CLIENT_API_KEY. If you don't have one, register here.
- It all looks cool. I want to chat with you about your research and development? Just shoot us an email at [email protected].
This library is available and published as open source under the terms of the MIT License.
This project is open-source and welcomes any contribution or feedback (here).
Development of this library has been supported by Cequence.io - "The future of contracting".
Created and maintained by Peter Banda.