ai21-python
AI21 Python SDK
Stars: 55
The AI21 Labs Python SDK is a comprehensive tool for interacting with the AI21 API. It provides functionalities for chat completions, conversational RAG, token counting, error handling, and support for various cloud providers like AWS, Azure, and Vertex. The SDK offers both synchronous and asynchronous usage, along with detailed examples and documentation. Users can quickly get started with the SDK to leverage AI21's powerful models for various natural language processing tasks.
README:
- Examples 🗂️
- Migration from v1.3.4 and below
- AI21 Official Documentation
- Installation 💿
- Usage - Chat Completions
- Conversational RAG (Beta)
- Older Models Support Usage
- More Models
- Token Counting
- Environment Variables
- Error Handling
- Cloud Providers ☁️
If you want a quick look at how to use the AI21 Python SDK and jump straight to business, check out the examples. Take a look at our models and see them in action! Several examples and demonstrations have been put together to show our models' functionality and capabilities.
Feel free to dive in, experiment, and adapt these examples to suit your needs. We believe they'll help you get up and running quickly.
In v2.0.0 we introduced a new SDK that is not backwards compatible with the previous version. This version allows for non-static client instances, explicitly defined parameters for each resource, typed response models, and more.
Migration Examples
from ai21 import AI21Client
client = AI21Client(api_key='my_api_key')
# or set api_key in environment variable - AI21_API_KEY and then
client = AI21Client()
We no longer support static methods for each resource. Instead, a client instance exposes a method for each resource, allowing for more flexibility and better control.
prompt = "some prompt"
- import ai21
- response = ai21.Completion.execute(model="j2-light", prompt=prompt, maxTokens=2)
+ from ai21 import AI21Client
+ client = AI21Client()
+ response = client.completion.create(model="j2-light", prompt=prompt, max_tokens=2)
This applies to all resources. You would now need to create a client instance and use it to call the resource method.
- response = ai21.Tokenization.execute(text=prompt)
- print(len(response)) # number of tokens
+ from ai21 import AI21Client
+ client = AI21Client()
+ token_count = client.count_tokens(text=prompt)
It is no longer possible to access the response object as a dictionary. Instead, read response fields as attributes on the returned object.
- import ai21
- response = ai21.Summarize.execute(source="some text", sourceType="TEXT")
- response["summary"]
+ from ai21 import AI21Client
+ from ai21.models import DocumentType
+ client = AI21Client()
+ response = client.summarize.create(source="some text", source_type=DocumentType.TEXT)
+ response.summary
- import ai21
- destination = ai21.BedrockDestination(model_id=ai21.BedrockModelID.J2_MID_V1)
- response = ai21.Completion.execute(prompt=prompt, maxTokens=1000, destination=destination)
+ from ai21 import AI21BedrockClient, BedrockModelID
+ client = AI21BedrockClient()
+ response = client.completion.create(prompt=prompt, max_tokens=1000, model_id=BedrockModelID.J2_MID_V1)
- import ai21
- destination = ai21.SageMakerDestination("j2-mid-test-endpoint")
- response = ai21.Completion.execute(prompt=prompt, maxTokens=1000, destination=destination)
+ from ai21 import AI21SageMakerClient
+ client = AI21SageMakerClient(endpoint_name="j2-mid-test-endpoint")
+ response = client.completion.create(prompt=prompt, max_tokens=1000)
The full documentation for the REST API can be found on docs.ai21.com.
pip install ai21
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
client = AI21Client(
    # defaults to os.environ.get('AI21_API_KEY')
    api_key='my_api_key',
)
system = "You're a support engineer in a SaaS company"
messages = [
    ChatMessage(content=system, role="system"),
    ChatMessage(content="Hello, I need help with a signup process.", role="user"),
]
chat_completions = client.chat.completions.create(
    messages=messages,
    model="jamba-1.5-mini",
)
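The returned object is a typed response rather than a dict. As a minimal sketch (the attribute path is an assumption based on the response model, matching the choices/delta layout used in the streaming examples below), the assistant's reply can be read like this:
print(chat_completions.choices[0].message.content)  # attribute path assumed from the chat completions response model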
You can use the AsyncAI21Client to make asynchronous requests. There is no difference between the sync and the async client in terms of usage.
import asyncio
from ai21 import AsyncAI21Client
from ai21.models.chat import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
    ChatMessage(content=system, role="system"),
    ChatMessage(content="Hello, I need help with a signup process.", role="user"),
]
client = AsyncAI21Client(
    # defaults to os.environ.get('AI21_API_KEY')
    api_key='my_api_key',
)
async def main():
    response = await client.chat.completions.create(
        messages=messages,
        model="jamba-1.5-mini",
    )
    print(response)
asyncio.run(main())
A more detailed example can be found here.
Examples
- j2-light
- j2-ultra
- j2-mid
- jamba-instruct
You can read more about the models here.
from ai21 import AI21Client
from ai21.models import RoleType
from ai21.models import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
    ChatMessage(text="Hello, I need help with a signup process.", role=RoleType.USER),
    ChatMessage(text="Hi Alice, I can help you with that. What seems to be the problem?", role=RoleType.ASSISTANT),
    ChatMessage(text="I am having trouble signing up for your product with my Google account.", role=RoleType.USER),
]
client = AI21Client()
chat_response = client.chat.create(
    system=system,
    messages=messages,
    model="j2-ultra",
)
For a more detailed example, see the chat examples.
from ai21 import AI21Client
client = AI21Client()
completion_response = client.completion.create(
    prompt="This is a test prompt",
    model="j2-mid",
)
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
    ChatMessage(content=system, role="system"),
    ChatMessage(content="Hello, I need help with a signup process.", role="user"),
    ChatMessage(content="Hi Alice, I can help you with that. What seems to be the problem?", role="assistant"),
    ChatMessage(content="I am having trouble signing up for your product with my Google account.", role="user"),
]
client = AI21Client()
response = client.chat.completions.create(
    messages=messages,
    model="jamba-instruct",
    max_tokens=100,
    temperature=0.7,
    top_p=1.0,
    stop=["\n"],
)
print(response)
Note that jamba-instruct supports async and streaming as well.
For a more detailed example, see the completion examples.
We currently support streaming for the Chat Completions API in Jamba.
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
messages = [ChatMessage(content="What is the meaning of life?", role="user")]
client = AI21Client()
response = client.chat.completions.create(
    messages=messages,
    model="jamba-instruct",
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content, end="")
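If you need the full completion as a single string, you can accumulate the deltas instead of printing them as they arrive. A minimal sketch (the guard assumes a chunk's delta content may be None or empty):
full_text = ""
for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:  # assumption: some chunks may carry a None/empty delta
        full_text += delta
print(full_text)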
import asyncio
from ai21 import AsyncAI21Client
from ai21.models.chat import ChatMessage
messages = [ChatMessage(content="What is the meaning of life?", role="user")]
client = AsyncAI21Client()
async def main():
    response = await client.chat.completions.create(
        messages=messages,
        model="jamba-1.5-mini",
        stream=True,
    )
    async for chunk in response:
        print(chunk.choices[0].delta.content, end="")
asyncio.run(main())
Like chat, but with the ability to retrieve information from your Studio library.
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
messages = [
    ChatMessage(content="Ask a question about your files", role="user"),
]
client = AI21Client()
client.library.files.create(
    file_path="path/to/file",
    path="path/to/file/in/library",
    labels=["my_file_label"],
)
chat_response = client.beta.conversational_rag.create(
    messages=messages,
    labels=["my_file_label"],
)
For a more detailed example, see the chat sync and async examples.
from ai21 import AI21Client
client = AI21Client()
file_id = client.library.files.create(
    file_path="path/to/file",
    path="path/to/file/in/library",
    labels=["label1", "label2"],
    public_url="www.example.com",
)
uploaded_file = client.library.files.get(file_id)
By using the count_tokens method, you can estimate the billing for a given request.
from ai21.tokenizers import get_tokenizer
tokenizer = get_tokenizer(name="jamba-tokenizer")
total_tokens = tokenizer.count_tokens(text="some text") # returns int
print(total_tokens)
from ai21.tokenizers import get_async_tokenizer
# your async function code
# ...
tokenizer = await get_async_tokenizer(name="jamba-tokenizer")
total_tokens = await tokenizer.count_tokens(text="some text") # returns int
print(total_tokens)
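As a rough illustration, you could estimate the token footprint of a chat prompt before sending it by summing the counts of each message's content. This is a sketch only; it ignores any per-message formatting overhead the API may add:
from ai21.models.chat import ChatMessage
from ai21.tokenizers import get_tokenizer
messages = [
    ChatMessage(content="You're a support engineer in a SaaS company", role="system"),
    ChatMessage(content="Hello, I need help with a signup process.", role="user"),
]
tokenizer = get_tokenizer(name="jamba-tokenizer")
# sums per-message content only; any structural tokens added by the API are not counted (assumption)
estimated = sum(tokenizer.count_tokens(text=m.content) for m in messages)
print(f"~{estimated} prompt tokens")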
Available tokenizers are:
- jamba-tokenizer
- j2-tokenizer
For more information on AI21 Tokenizers, see the documentation.
You can set several environment variables to configure the client.
We use the standard library logging module. To enable logging, set the AI21_LOG_LEVEL environment variable:
$ export AI21_LOG_LEVEL=debug
- AI21_API_KEY - Your API key. If not set, you must pass it to the client constructor.
- AI21_API_VERSION - The API version. Defaults to v1.
- AI21_API_HOST - The API host. Defaults to https://api.ai21.com/v1/.
- AI21_TIMEOUT_SEC - The timeout for API requests.
- AI21_NUM_RETRIES - The maximum number of retries for API requests. Defaults to 3 retries.
- AI21_AWS_REGION - The AWS region to use for AWS clients. Defaults to us-east-1.
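For example, you can configure the client entirely through the environment. A minimal sketch, assuming the client reads these variables when it is constructed:
import os
from ai21 import AI21Client
# illustrative values; set before the client is constructed (assumption: read at construction time)
os.environ["AI21_TIMEOUT_SEC"] = "30"
os.environ["AI21_NUM_RETRIES"] = "5"
client = AI21Client()  # picks up AI21_API_KEY from the environment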
from ai21 import errors as ai21_errors
from ai21 import AI21Client, AI21APIError
from ai21.models import ChatMessage
client = AI21Client()
system = "You're a support engineer in a SaaS company"
messages = [
    # Notice the given role does not exist and will be the reason for the raised error
    ChatMessage(text="Hello, I need help with a signup process.", role="Non-Existent-Role"),
]
try:
    chat_completion = client.chat.create(
        messages=messages,
        model="j2-ultra",
        system=system
    )
except ai21_errors.AI21ServerError as e:
    print("Server error - the server could not be reached")
    print(e.details)
except ai21_errors.TooManyRequestsError as e:
    print("A 429 status code was returned. Slow down on the requests")
except AI21APIError as e:
    print("A non-200 status code was returned. For more error types, see ai21.errors")
AI21 Library provides convenient ways to interact with two AWS clients for use with AWS Bedrock and AWS SageMaker.
pip install -U "ai21[AWS]"
This will make sure you have the required dependencies installed, including boto3 >= 1.28.82.
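If you want to confirm the dependency is in place, you can check the installed boto3 version:
import boto3
print(boto3.__version__)  # expect >= 1.28.82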
from ai21 import AI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
client = AI21BedrockClient(region='us-east-1') # region is optional, as you can use the env variable instead
messages = [
    ChatMessage(content="You are a helpful assistant", role="system"),
    ChatMessage(content="What is the meaning of life?", role="user")
]
response = client.chat.completions.create(
    messages=messages,
    model_id=BedrockModelID.JAMBA_1_5_LARGE,
)
from ai21 import AI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
    ChatMessage(content=system, role="system"),
    ChatMessage(content="Hello, I need help with a signup process.", role="user"),
    ChatMessage(content="Hi Alice, I can help you with that. What seems to be the problem?", role="assistant"),
    ChatMessage(content="I am having trouble signing up for your product with my Google account.", role="user"),
]
client = AI21BedrockClient()
response = client.chat.completions.create(
    messages=messages,
    model=BedrockModelID.JAMBA_1_5_LARGE,
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content, end="")
import asyncio
from ai21 import AsyncAI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
client = AsyncAI21BedrockClient(region='us-east-1') # region is optional, as you can use the env variable instead
messages = [
    ChatMessage(content="You are a helpful assistant", role="system"),
    ChatMessage(content="What is the meaning of life?", role="user")
]
async def main():
    response = await client.chat.completions.create(
        messages=messages,
        model_id=BedrockModelID.JAMBA_1_5_LARGE,
    )
asyncio.run(main())
import boto3
from ai21 import AI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
boto_session = boto3.Session(region_name="us-east-1")
client = AI21BedrockClient(session=boto_session)
messages = [
    ChatMessage(content="You are a helpful assistant", role="system"),
    ChatMessage(content="What is the meaning of life?", role="user")
]
response = client.chat.completions.create(
    messages=messages,
    model_id=BedrockModelID.JAMBA_1_5_LARGE,
)
import boto3
import asyncio
from ai21 import AsyncAI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
boto_session = boto3.Session(region_name="us-east-1")
client = AsyncAI21BedrockClient(session=boto_session)
messages = [
    ChatMessage(content="You are a helpful assistant", role="system"),
    ChatMessage(content="What is the meaning of life?", role="user")
]
async def main():
    response = await client.chat.completions.create(
        messages=messages,
        model_id=BedrockModelID.JAMBA_1_5_LARGE,
    )
asyncio.run(main())
from ai21 import AI21SageMakerClient
client = AI21SageMakerClient(endpoint_name="j2-endpoint-name")
response = client.summarize.create(
    source="Text to summarize",
    source_type="TEXT",
)
print(response.summary)
import asyncio
from ai21 import AsyncAI21SageMakerClient
client = AsyncAI21SageMakerClient(endpoint_name="j2-endpoint-name")
async def main():
    response = await client.summarize.create(
        source="Text to summarize",
        source_type="TEXT",
    )
    print(response.summary)
asyncio.run(main())
from ai21 import AI21SageMakerClient
import boto3
boto_session = boto3.Session(region_name="us-east-1")
client = AI21SageMakerClient(
    session=boto_session,
    endpoint_name="j2-endpoint-name",
)
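The session-based client is then used exactly like the default one, e.g. reusing the summarize call from above:
response = client.summarize.create(
    source="Text to summarize",
    source_type="TEXT",
)
print(response.summary)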
If you wish to interact with your Azure endpoint on Azure AI Studio, use the AI21AzureClient and AsyncAI21AzureClient clients.
The following models are supported on Azure:
jamba-instruct
from ai21 import AI21AzureClient
from ai21.models.chat import ChatMessage
client = AI21AzureClient(
    base_url="https://<YOUR-ENDPOINT>.inference.ai.azure.com",
    api_key="<your Azure api key>",
)
messages = [
    ChatMessage(content="You are a helpful assistant", role="system"),
    ChatMessage(content="What is the meaning of life?", role="user")
]
response = client.chat.completions.create(
    model="jamba-1.5-mini",
    messages=messages,
)
import asyncio
from ai21 import AsyncAI21AzureClient
from ai21.models.chat import ChatMessage
client = AsyncAI21AzureClient(
    base_url="https://<YOUR-ENDPOINT>.inference.ai.azure.com/v1/chat/completions",
    api_key="<your Azure api key>",
)
messages = [
    ChatMessage(content="You are a helpful assistant", role="system"),
    ChatMessage(content="What is the meaning of life?", role="user")
]
async def main():
    response = await client.chat.completions.create(
        model="jamba-instruct",
        messages=messages,
    )
asyncio.run(main())
If you wish to interact with your Vertex AI endpoint on GCP, use the AI21VertexClient and AsyncAI21VertexClient clients.
The following models are supported on Vertex:
jamba-1.5-mini
jamba-1.5-large
from ai21 import AI21VertexClient
from ai21.models.chat import ChatMessage
# You can also set the project_id, region, access_token and Google credentials in the constructor
client = AI21VertexClient()
messages = ChatMessage(content="What is the meaning of life?", role="user")
response = client.chat.completions.create(
    model="jamba-1.5-mini",
    messages=[messages],
)
import asyncio
from ai21 import AsyncAI21VertexClient
from ai21.models.chat import ChatMessage
# You can also set the project_id, region, access_token and Google credentials in the constructor
client = AsyncAI21VertexClient()
async def main():
    messages = ChatMessage(content="What is the meaning of life?", role="user")
    response = await client.chat.completions.create(
        model="jamba-1.5-mini",
        messages=[messages],
    )
asyncio.run(main())
Happy prompting! 🚀