magentic

Seamlessly integrate LLMs as Python functions

Stars: 2209

Easily integrate Large Language Models into your Python code. Simply use the `@prompt` and `@chatprompt` decorators to create functions that return structured output from the LLM. Mix LLM queries and function calling with regular Python code to create complex logic.

README:

Seamlessly integrate Large Language Models into Python code. Use the @prompt and @chatprompt decorators to create functions that return structured output from an LLM. Combine LLM queries and tool use with traditional Python code to build complex agentic systems.

Features

Structured Outputs using pydantic models and built-in python types.
Streaming of structured outputs and function calls, to use them while being generated.
LLM-Assisted Retries to improve LLM adherence to complex output schemas.
Observability using OpenTelemetry, with native Pydantic Logfire integration.
Type Annotations to work nicely with linters and IDEs.
Configuration options for multiple LLM providers including OpenAI, Anthropic, and Ollama.
Many more features: Chat Prompting, Parallel Function Calling, Vision, Formatting, Asyncio...

Installation

pip install magentic

or using uv

uv add magentic

Configure your OpenAI API key by setting the OPENAI_API_KEY environment variable. To configure a different LLM provider, see Configuration.

Usage

@prompt

The @prompt decorator allows you to define a template for a Large Language Model (LLM) prompt as a Python function. When this function is called, the arguments are inserted into the template, then this prompt is sent to an LLM which generates the function output.

from magentic import prompt


@prompt('Add more "dude"ness to: {phrase}')
def dudeify(phrase: str) -> str: ...  # No function body as this is never executed


dudeify("Hello, how are you?")
# "Hey, dude! What's up? How's it going, my man?"
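To make the template step concrete, here is a minimal sketch of how call arguments can be bound to a function's parameters and inserted into a template. It uses only the standard library; `render_prompt` is a hypothetical helper written for illustration and is not magentic's implementation, and no LLM is called.

```python
import inspect


def render_prompt(template: str, func, *args, **kwargs) -> str:
    """Bind the call arguments to func's parameters and fill the template."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # include defaults for parameters not passed
    return template.format(**bound.arguments)


def dudeify(phrase: str) -> str: ...


rendered = render_prompt(
    'Add more "dude"ness to: {phrase}', dudeify, "Hello, how are you?"
)
print(rendered)
# Add more "dude"ness to: Hello, how are you?
```

Binding through the signature means positional and keyword arguments both end up under their parameter names, so the template can refer to them by name.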

The @prompt decorator will respect the return type annotation of the decorated function. This can be any type supported by pydantic including a pydantic model.

from magentic import prompt
from pydantic import BaseModel


class Superhero(BaseModel):
    name: str
    age: int
    power: str
    enemies: list[str]


@prompt("Create a Superhero named {name}.")
def create_superhero(name: str) -> Superhero: ...


create_superhero("Garden Man")
# Superhero(name='Garden Man', age=30, power='Control over plants', enemies=['Pollution Man', 'Concrete Woman'])

See Structured Outputs for more.

@chatprompt

The @chatprompt decorator works just like @prompt but allows you to pass chat messages as a template rather than a single text prompt. This can be used to provide a system message or for few-shot prompting where you provide example responses to guide the model's output. Format fields denoted by curly braces {example} will be filled in all messages (except FunctionResultMessage).

from magentic import chatprompt, AssistantMessage, SystemMessage, UserMessage
from pydantic import BaseModel


class Quote(BaseModel):
    quote: str
    character: str


@chatprompt(
    SystemMessage("You are a movie buff."),
    UserMessage("What is your favorite quote from Harry Potter?"),
    AssistantMessage(
        Quote(
            quote="It does not do to dwell on dreams and forget to live.",
            character="Albus Dumbledore",
        )
    ),
    UserMessage("What is your favorite quote from {movie}?"),
)
def get_movie_quote(movie: str) -> Quote: ...


get_movie_quote("Iron Man")
# Quote(quote='I am Iron Man.', character='Tony Stark')

See Chat Prompting for more.

FunctionCall

An LLM can also decide to call functions. In this case the @prompt-decorated function returns a FunctionCall object which can be called to execute the function using the arguments provided by the LLM.

from typing import Literal

from magentic import prompt, FunctionCall


def search_twitter(query: str, category: Literal["latest", "people"]) -> str:
    """Searches Twitter for a query."""
    print(f"Searching Twitter for {query!r} in category {category!r}")
    return "<twitter results>"


def search_youtube(query: str, channel: str = "all") -> str:
    """Searches YouTube for a query."""
    print(f"Searching YouTube for {query!r} in channel {channel!r}")
    return "<youtube results>"


@prompt(
    "Use the appropriate search function to answer: {question}",
    functions=[search_twitter, search_youtube],
)
def perform_search(question: str) -> FunctionCall[str]: ...


output = perform_search("What is the latest news on LLMs?")
print(output)
# > FunctionCall(<function search_twitter at 0x10c367d00>, 'LLMs', 'latest')
output()
# > Searching Twitter for 'LLMs' in category 'latest'
# '<twitter results>'
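The idea of returning a call instead of a result can be sketched with a small deferred-call object. `DeferredCall` below is a hypothetical stand-in written for illustration, not the magentic `FunctionCall` class; it only shows why holding the function and its arguments lets you inspect them before anything runs.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class DeferredCall(Generic[T]):
    """Hypothetical stand-in: stores a function and its arguments until called."""

    def __init__(self, function: Callable[..., T], *args, **kwargs):
        self.function = function
        self.args = args
        self.kwargs = kwargs

    def __call__(self) -> T:
        return self.function(*self.args, **self.kwargs)


def search_twitter(query: str, category: str) -> str:
    return f"<twitter results for {query!r} in {category!r}>"


call = DeferredCall(search_twitter, "LLMs", category="latest")
# Nothing has executed yet, so the arguments can be checked or logged first
print(call.args, call.kwargs)
result = call()  # the function runs only here
```

Keeping execution as an explicit step gives the calling code a place to validate or reject arguments chosen by the model.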

See Function Calling for more.

@prompt_chain

Sometimes the LLM requires making one or more function calls to generate a final answer. The @prompt_chain decorator will resolve FunctionCall objects automatically and pass the output back to the LLM to continue until the final answer is reached.

In the following example, when describe_weather is called the LLM first calls the get_current_weather function, then uses the result of this to formulate its final answer which gets returned.

from magentic import prompt_chain


def get_current_weather(location, unit="fahrenheit"):
    """Get the current weather in a given location"""
    # Pretend to query an API
    return {"temperature": "72", "forecast": ["sunny", "windy"]}


@prompt_chain(
    "What's the weather like in {city}?",
    functions=[get_current_weather],
)
def describe_weather(city: str) -> str: ...


describe_weather("Boston")
# 'The current weather in Boston is 72°F and it is sunny and windy.'
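The resolve-and-continue loop described above can be sketched without an LLM. Everything here is an assumption made for illustration: `fake_model` stands in for the LLM, and `ToolRequest` and `run_chain` are hypothetical names rather than magentic APIs.

```python
from typing import Callable, Union


class ToolRequest:
    """Hypothetical model reply asking for a function to be run."""

    def __init__(self, function: Callable, **kwargs):
        self.function = function
        self.kwargs = kwargs


def get_current_weather(location: str) -> dict:
    # Pretend to query an API
    return {"temperature": "72", "forecast": ["sunny", "windy"]}


def fake_model(history: list) -> Union[ToolRequest, str]:
    """Stand-in for the LLM: requests the weather once, then answers."""
    tool_results = [item for item in history if isinstance(item, dict)]
    if not tool_results:
        return ToolRequest(get_current_weather, location="Boston")
    return f"It is {tool_results[-1]['temperature']} degrees in Boston."


def run_chain(question: str, max_calls: int = 5) -> str:
    history: list = [question]
    for _ in range(max_calls):
        reply = fake_model(history)
        if isinstance(reply, str):
            return reply  # final answer reached
        # Run the requested function and feed its result back to the model
        history.append(reply.function(**reply.kwargs))
    raise RuntimeError("No final answer within max_calls")


answer = run_chain("What's the weather like in Boston?")
print(answer)
# It is 72 degrees in Boston.
```

The cap on iterations is a design choice in this sketch: it prevents a model that keeps requesting functions from looping indefinitely.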

LLM-powered functions created using @prompt, @chatprompt and @prompt_chain can be supplied as functions to other @prompt/@prompt_chain decorators, just like regular python functions. This enables increasingly complex LLM-powered functionality, while allowing individual components to be tested and improved in isolation.

Streaming

The StreamedStr (and AsyncStreamedStr) class can be used to stream the output of the LLM. This allows you to process the text while it is being generated, rather than receiving the whole output at once.

from magentic import prompt, StreamedStr


@prompt("Tell me about {country}")
def describe_country(country: str) -> StreamedStr: ...


# Print the chunks while they are being received
for chunk in describe_country("Brazil"):
    print(chunk, end="")
# 'Brazil, officially known as the Federative Republic of Brazil, is ...'

Multiple StreamedStr can be created at the same time to stream LLM outputs concurrently. In the below example, generating the description for multiple countries takes approximately the same amount of time as for a single country.

from time import time

countries = ["Australia", "Brazil", "Chile"]


# Generate the descriptions one at a time
start_time = time()
for country in countries:
    # Converting `StreamedStr` to `str` blocks until the LLM output is fully generated
    description = str(describe_country(country))
    print(f"{time() - start_time:.2f}s : {country} - {len(description)} chars")

# 22.72s : Australia - 2130 chars
# 41.63s : Brazil - 1884 chars
# 74.31s : Chile - 2968 chars


# Generate the descriptions concurrently by creating the StreamedStrs at the same time
start_time = time()
streamed_strs = [describe_country(country) for country in countries]
for country, streamed_str in zip(countries, streamed_strs):
    description = str(streamed_str)
    print(f"{time() - start_time:.2f}s : {country} - {len(description)} chars")

# 22.79s : Australia - 2147 chars
# 23.64s : Brazil - 2202 chars
# 24.67s : Chile - 2186 chars

Object Streaming

Structured outputs can also be streamed from the LLM by using the return type annotation Iterable (or AsyncIterable). This allows each item to be processed while the next one is being generated.

from collections.abc import Iterable
from time import time

from magentic import prompt
from pydantic import BaseModel


class Superhero(BaseModel):
    name: str
    age: int
    power: str
    enemies: list[str]


@prompt("Create a Superhero team named {name}.")
def create_superhero_team(name: str) -> Iterable[Superhero]: ...


start_time = time()
for hero in create_superhero_team("The Food Dudes"):
    print(f"{time() - start_time:.2f}s : {hero}")

# 2.23s : name='Pizza Man' age=30 power='Can shoot pizza slices from his hands' enemies=['The Hungry Horde', 'The Junk Food Gang']
# 4.03s : name='Captain Carrot' age=35 power='Super strength and agility from eating carrots' enemies=['The Sugar Squad', 'The Greasy Gang']
# 6.05s : name='Ice Cream Girl' age=25 power='Can create ice cream out of thin air' enemies=['The Hot Sauce Squad', 'The Healthy Eaters']

See Streaming for more.

Asyncio

Asynchronous functions / coroutines can be used to concurrently query the LLM. This can greatly increase the overall speed of generation, and also allow other asynchronous code to run while waiting on LLM output. In the below example, the LLM generates a description for each US president while it is waiting on the next one in the list. Measuring the characters generated per second shows that this example achieves a 7x speedup over serial processing.

import asyncio
from time import time
from collections.abc import AsyncIterable

from magentic import prompt


@prompt("List ten presidents of the United States")
async def iter_presidents() -> AsyncIterable[str]: ...


@prompt("Tell me more about {topic}")
async def tell_me_more_about(topic: str) -> str: ...


# For each president listed, generate a description concurrently
start_time = time()
tasks = []
async for president in await iter_presidents():
    # Use asyncio.create_task to schedule the coroutine for execution before awaiting it
    # This way descriptions will start being generated while the list of presidents is still being generated
    task = asyncio.create_task(tell_me_more_about(president))
    tasks.append(task)

descriptions = await asyncio.gather(*tasks)

# Measure the characters per second
total_chars = sum(len(desc) for desc in descriptions)
time_elapsed = time() - start_time
print(total_chars, time_elapsed, total_chars / time_elapsed)
# 24575 28.70 856.07


# Measure the characters per second to describe a single president
start_time = time()
out = await tell_me_more_about("George Washington")
time_elapsed = time() - start_time
print(len(out), time_elapsed, len(out) / time_elapsed)
# 2206 18.72 117.78
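The example above uses top-level `await` and `async for`, which work in a notebook or an async REPL. In a plain script the same scheduling pattern needs to live inside a coroutine run with `asyncio.run`. The sketch below shows that pattern with stand-in coroutines (`iter_names` and `describe` are hypothetical and use `asyncio.sleep` in place of LLM calls).

```python
import asyncio
from collections.abc import AsyncIterator


async def iter_names() -> AsyncIterator[str]:
    """Stand-in for a streamed list: yields one item at a time."""
    for name in ["Washington", "Adams", "Jefferson"]:
        await asyncio.sleep(0.01)
        yield name


async def describe(name: str) -> str:
    """Stand-in for a slow LLM call."""
    await asyncio.sleep(0.05)
    return f"About {name}"


async def main() -> list[str]:
    tasks = []
    async for name in iter_names():
        # create_task schedules the coroutine right away, so it runs
        # while the loop is still waiting for the next name
        tasks.append(asyncio.create_task(describe(name)))
    # gather returns results in the order the tasks were passed in
    return await asyncio.gather(*tasks)


descriptions = asyncio.run(main())
print(descriptions)
# ['About Washington', 'About Adams', 'About Jefferson']
```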

See Asyncio for more.

Additional Features

The functions argument to @prompt can contain async/coroutine functions. When the corresponding FunctionCall objects are called the result must be awaited.
The Annotated type annotation can be used to provide descriptions and other metadata for function parameters. See the pydantic documentation on using Field to describe function arguments.
The @prompt and @prompt_chain decorators also accept a model argument. You can pass an instance of OpenaiChatModel to use GPT-4 or configure a different temperature. See Backend/LLM Configuration below.
Register other types to use as return type annotations in @prompt functions by following the example notebook for a Pandas DataFrame.

Backend/LLM Configuration

Magentic supports multiple LLM providers or "backends". This roughly refers to which Python package is used to interact with the LLM API. The following backends are supported.

OpenAI

The default backend. It uses the openai Python package and supports all features of magentic.

No additional installation is required. Just import the OpenaiChatModel class from magentic.

from magentic import OpenaiChatModel

model = OpenaiChatModel("gpt-4o")

Ollama via OpenAI

Ollama supports an OpenAI-compatible API, which allows you to use Ollama models via the OpenAI backend.

First, install ollama from ollama.com. Then, pull the model you want to use.

ollama pull llama3.2

Then, specify the model name and base_url when creating the OpenaiChatModel instance.

from magentic import OpenaiChatModel

model = OpenaiChatModel("llama3.2", base_url="http://localhost:11434/v1/")

Other OpenAI-compatible APIs

When using the openai backend, setting the MAGENTIC_OPENAI_BASE_URL environment variable or using OpenaiChatModel(..., base_url="http://localhost:8080") in code allows you to use magentic with any OpenAI-compatible API e.g. Azure OpenAI Service, LiteLLM OpenAI Proxy Server, LocalAI. Note that if the API does not support tool calls then you will not be able to create prompt-functions that return Python objects, but other features of magentic will still work.

To use Azure with the openai backend you will need to set the MAGENTIC_OPENAI_API_TYPE environment variable to "azure" or use OpenaiChatModel(..., api_type="azure"), and also set the environment variables needed by the openai package to access Azure. See https://github.com/openai/openai-python#microsoft-azure-openai

Anthropic

This uses the anthropic Python package and supports all features of magentic.

Install the magentic package with the anthropic extra, or install the anthropic package directly.

pip install "magentic[anthropic]"

Then import the AnthropicChatModel class.

from magentic.chat_model.anthropic_chat_model import AnthropicChatModel

model = AnthropicChatModel("claude-3-5-sonnet-latest")

LiteLLM

This uses the litellm Python package to enable querying LLMs from many different providers. Note: some models may not support all features of magentic e.g. function calling/structured output and streaming.

Install the magentic package with the litellm extra, or install the litellm package directly.

pip install "magentic[litellm]"

Then import the LitellmChatModel class.

from magentic.chat_model.litellm_chat_model import LitellmChatModel

model = LitellmChatModel("gpt-4o")

Mistral

This uses the openai Python package with some small modifications to make the API queries compatible with the Mistral API. It supports all features of magentic. However, tool calls (including structured outputs) are not streamed, so they are received all at once.

Note: a future version of magentic might switch to using the mistral Python package.

No additional installation is required. Just import the MistralChatModel class.

from magentic.chat_model.mistral_chat_model import MistralChatModel

model = MistralChatModel("mistral-large-latest")

Configure a Backend

The default ChatModel used by magentic (in @prompt, @chatprompt, etc.) can be configured in several ways. When a prompt-function or chatprompt-function is called, the ChatModel to use is chosen in this order of preference:

1. The ChatModel instance provided as the model argument to the magentic decorator
2. The current chat model context, created using `with MyChatModel:`
3. The global ChatModel created from environment variables and the default settings in src/magentic/settings.py

The following code snippet demonstrates this behavior:

from magentic import OpenaiChatModel, prompt
from magentic.chat_model.anthropic_chat_model import AnthropicChatModel


@prompt("Say hello")
def say_hello() -> str: ...


@prompt(
    "Say hello",
    model=AnthropicChatModel("claude-3-5-sonnet-latest"),
)
def say_hello_anthropic() -> str: ...


say_hello()  # Uses env vars or default settings

with OpenaiChatModel("gpt-4o-mini", temperature=1):
    say_hello()  # Uses openai with gpt-4o-mini and temperature=1 due to context manager
    say_hello_anthropic()  # Uses Anthropic claude-3-5-sonnet-latest because explicitly configured
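One way to picture this order of preference is a lookup that checks an explicit argument first, then a context variable, then a global default. The sketch below uses the standard library `contextvars` module; `use_model`, `resolve_model`, and `GLOBAL_DEFAULT` are hypothetical names for illustration, and this is not a description of how magentic is implemented internally.

```python
from contextvars import ContextVar
from typing import Optional

GLOBAL_DEFAULT = "model-from-env-or-defaults"  # placeholder name
_context_model: ContextVar[Optional[str]] = ContextVar("context_model", default=None)


class use_model:
    """Context manager that sets the model for the enclosed block."""

    def __init__(self, name: str):
        self.name = name

    def __enter__(self):
        self._token = _context_model.set(self.name)
        return self

    def __exit__(self, *exc_info):
        # Restore whatever was active before entering the block
        _context_model.reset(self._token)


def resolve_model(decorator_model: Optional[str] = None) -> str:
    # 1. explicit decorator argument, 2. active context, 3. global default
    return decorator_model or _context_model.get() or GLOBAL_DEFAULT


outside = resolve_model()
with use_model("context-model"):
    inside = resolve_model()
    explicit = resolve_model("decorator-model")
after = resolve_model()
print(outside, inside, explicit, after)
```

Resetting the context variable on exit is what makes the block scoped: calls after the `with` block fall back to the global default again.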

The following environment variables can be set.

Environment Variable	Description	Example
MAGENTIC_BACKEND	The package to use as the LLM backend	anthropic / openai / litellm
MAGENTIC_ANTHROPIC_MODEL	Anthropic model	claude-3-haiku-20240307
MAGENTIC_ANTHROPIC_API_KEY	Anthropic API key to be used by magentic	sk-...
MAGENTIC_ANTHROPIC_BASE_URL	Base URL for an Anthropic-compatible API	http://localhost:8080
MAGENTIC_ANTHROPIC_MAX_TOKENS	Max number of generated tokens	1024
MAGENTIC_ANTHROPIC_TEMPERATURE	Temperature	0.5
MAGENTIC_LITELLM_MODEL	LiteLLM model	claude-2
MAGENTIC_LITELLM_API_BASE	The base url to query	http://localhost:11434
MAGENTIC_LITELLM_MAX_TOKENS	LiteLLM max number of generated tokens	1024
MAGENTIC_LITELLM_TEMPERATURE	LiteLLM temperature	0.5
MAGENTIC_MISTRAL_MODEL	Mistral model	mistral-large-latest
MAGENTIC_MISTRAL_API_KEY	Mistral API key to be used by magentic	XEG...
MAGENTIC_MISTRAL_BASE_URL	Base URL for a Mistral-compatible API	http://localhost:8080
MAGENTIC_MISTRAL_MAX_TOKENS	Max number of generated tokens	1024
MAGENTIC_MISTRAL_SEED	Seed for deterministic sampling	42
MAGENTIC_MISTRAL_TEMPERATURE	Temperature	0.5
MAGENTIC_OPENAI_MODEL	OpenAI model	gpt-4
MAGENTIC_OPENAI_API_KEY	OpenAI API key to be used by magentic	sk-...
MAGENTIC_OPENAI_API_TYPE	Allowed options: "openai", "azure"	azure
MAGENTIC_OPENAI_BASE_URL	Base URL for an OpenAI-compatible API	http://localhost:8080
MAGENTIC_OPENAI_MAX_TOKENS	OpenAI max number of generated tokens	1024
MAGENTIC_OPENAI_SEED	Seed for deterministic sampling	42
MAGENTIC_OPENAI_TEMPERATURE	OpenAI temperature	0.5
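These variables are usually set in the shell or a deployment config, but they can also be set from Python. The sketch below only manipulates `os.environ` with example values from the table; it assumes the variables need to be in place before magentic reads its settings, and exactly when that happens is not stated here.

```python
import os

# Example values taken from the table above
os.environ["MAGENTIC_BACKEND"] = "openai"
os.environ["MAGENTIC_OPENAI_MODEL"] = "gpt-4"
# Environment values are always strings, including numeric settings
os.environ["MAGENTIC_OPENAI_TEMPERATURE"] = "0.5"

# Collect the magentic-related settings currently visible to this process
configured = {
    key: value for key, value in os.environ.items() if key.startswith("MAGENTIC_")
}
print(configured["MAGENTIC_BACKEND"])
# openai
```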

Type Checking

Many type checkers will raise warnings or errors for functions with the @prompt decorator due to the function having no body or return value. There are several ways to deal with these.

Disable the check globally for the type checker. For example in mypy by disabling error code empty-body.
```
# pyproject.toml
[tool.mypy]
disable_error_code = ["empty-body"]
```
Make the function body ... (this does not satisfy mypy) or raise.
```
@prompt("Choose a color")
def random_color() -> str: ...
```

Use the comment # type: ignore[empty-body] on each function. In this case you can add a docstring instead of the ... body.

@prompt("Choose a color")
def random_color() -> str:  # type: ignore[empty-body]
    """Returns a random color."""

For Tasks:

generate text answer questions translate languages write code debug code

For Jobs:

content writer chatbot developer ai researcher data scientist software engineer

Alternative AI tools for magentic

Similar Open Source Tools

syncode

SynCode is a novel framework for the grammar-guided generation of Large Language Models (LLMs) that ensures syntactically valid output with respect to defined Context-Free Grammar (CFG) rules. It supports general-purpose programming languages like Python, Go, SQL, JSON, and more, allowing users to define custom grammars using EBNF syntax. The tool compares favorably to other constrained decoders and offers features like fast grammar-guided generation, compatibility with HuggingFace Language Models, and the ability to work with various decoding strategies.

github

: 225

duckdb-airport-extension

The 'duckdb-airport-extension' is a tool that enables the use of Arrow Flight with DuckDB. It provides functions to list available Arrow Flights at a specific endpoint and to retrieve the contents of an Arrow Flight. The extension also supports creating secrets for authentication purposes. It includes features for serializing filters and optimizing projections to enhance data transmission efficiency. The tool is built on top of gRPC and the Arrow IPC format, offering high-performance data services for data processing and retrieval.

github

: 170

monacopilot

Monacopilot is a powerful and customizable AI auto-completion plugin for the Monaco Editor. It supports multiple AI providers such as Anthropic, OpenAI, Groq, and Google, providing real-time code completions with an efficient caching system. The plugin offers context-aware suggestions, customizable completion behavior, and framework agnostic features. Users can also customize the model support and trigger completions manually. Monacopilot is designed to enhance coding productivity by providing accurate and contextually appropriate completions in daily spoken language.

github

: 111

auto-playwright

Auto Playwright is a tool that allows users to run Playwright tests using AI. It eliminates the need for selectors by determining actions at runtime based on plain-text instructions. Users can automate complex scenarios, write tests concurrently with or before functionality development, and benefit from rapid test creation. The tool supports various Playwright actions and offers additional options for debugging and customization. It uses HTML sanitization to reduce costs and improve text quality when interacting with the OpenAI API.

github

: 298

datadreamer

DataDreamer is an advanced toolkit designed to facilitate the development of edge AI models by enabling synthetic data generation, knowledge extraction from pre-trained models, and creation of efficient and potent models. It eliminates the need for extensive datasets by generating synthetic datasets, leverages latent knowledge from pre-trained models, and focuses on creating compact models suitable for integration into any device and performance for specialized tasks. The toolkit offers features like prompt generation, image generation, dataset annotation, and tools for training small-scale neural networks for edge deployment. It provides hardware requirements, usage instructions, available models, and limitations to consider while using the library.

github

: 77

receipt-scanner

The receipt-scanner repository is an AI-Powered Receipt and Invoice Scanner for Laravel that allows users to easily extract structured receipt data from images, PDFs, and emails within their Laravel application using OpenAI. It provides a light wrapper around OpenAI Chat and Completion endpoints, supports various input formats, and integrates with Textract for OCR functionality. Users can install the package via composer, publish configuration files, and use it to extract data from plain text, PDFs, images, Word documents, and web content. The scanned receipt data is parsed into a DTO structure with main classes like Receipt, Merchant, and LineItem.

github

: 95

LLMDebugger

This repository contains the code and dataset for LDB, a novel debugging framework that enables Large Language Models (LLMs) to refine their generated programs by tracking the values of intermediate variables throughout the runtime execution. LDB segments programs into basic blocks, allowing LLMs to concentrate on simpler code units, verify correctness block by block, and pinpoint errors efficiently. The tool provides APIs for debugging and generating code with debugging messages, mimicking how human developers debug programs.

github

: 302

detoxify

Detoxify is a library that provides trained models and code to predict toxic comments on 3 Jigsaw challenges: Toxic comment classification, Unintended Bias in Toxic comments, Multilingual toxic comment classification. It includes models like 'original', 'unbiased', and 'multilingual' trained on different datasets to detect toxicity and minimize bias. The library aims to help in stopping harmful content online by interpreting visual content in context. Users can fine-tune the models on carefully constructed datasets for research purposes or to aid content moderators in flagging out harmful content quicker. The library is built to be user-friendly and straightforward to use.

github

: 980

extractor

Extractor is an AI-powered data extraction library for Laravel that leverages OpenAI's capabilities to effortlessly extract structured data from various sources, including images, PDFs, and emails. It features a convenient wrapper around OpenAI Chat and Completion endpoints, supports multiple input formats, includes a flexible Field Extractor for arbitrary data extraction, and integrates with Textract for OCR functionality. Extractor utilizes JSON Mode from the latest GPT-3.5 and GPT-4 models, providing accurate and efficient data extraction.

github

: 86

Autono

github

: 191

can-ai-code

Can AI Code is a self-evaluating interview tool for AI coding models. It includes interview questions written by humans and tests taken by AI, inference scripts for common API providers and CUDA-enabled quantization runtimes, a Docker-based sandbox environment for validating untrusted Python and NodeJS code, and the ability to evaluate the impact of prompting techniques and sampling parameters on large language model (LLM) coding performance. Users can also assess LLM coding performance degradation due to quantization. The tool provides test suites for evaluating LLM coding performance, a webapp for exploring results, and comparison scripts for evaluations. It supports multiple interviewers for API and CUDA runtimes, with detailed instructions on running the tool in different environments. The repository structure includes folders for interviews, prompts, parameters, evaluation scripts, comparison scripts, and more.

github

: 511

nano-graphrag

nano-GraphRAG is a simple, easy-to-hack implementation of GraphRAG that provides a smaller, faster, and cleaner version of the official implementation. It is about 800 lines of code, small yet scalable, asynchronous, and fully typed. The tool supports incremental insert, async methods, and various parameters for customization. Users can replace storage components and LLM functions as needed. It also allows for embedding function replacement and comes with pre-defined prompts for entity extraction and community reports. However, some features like covariates and global search implementation differ from the original GraphRAG. Future versions aim to address issues related to data source ID, community description truncation, and add new components.

github

: 2.6k

python-tgpt

Python-tgpt is a Python package that enables seamless interaction with over 45 free LLM providers without requiring an API key. It also provides image generation capabilities. The name _python-tgpt_ draws inspiration from its parent project tgpt, which operates on Golang. Through this Python adaptation, users can effortlessly engage with a number of free LLMs available, fostering a smoother AI interaction experience.

github

: 95

parsera

Parsera is a lightweight Python library designed for scraping websites using LLMs. It offers simplicity and efficiency by minimizing token usage, enhancing speed, and reducing costs. Users can easily set up and run the tool to extract specific elements from web pages, generating JSON output with relevant data. Additionally, Parsera supports integration with various chat models, such as Azure, expanding its functionality and customization options for web scraping tasks.

github

: 1.1k

For similar tasks

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

github

: 159

onnxruntime-genai

ONNX Runtime Generative AI is a library that provides the generative AI loop for ONNX models, including inference with ONNX Runtime, logits processing, search and sampling, and KV cache management. Users can call a high level `generate()` method, or run each iteration of the model in a loop. It supports greedy/beam search and TopP, TopK sampling to generate token sequences, has built in logits processing like repetition penalties, and allows for easy custom scoring.

github

: 442

jupyter-ai

Jupyter AI connects generative AI with Jupyter notebooks. It provides a user-friendly and powerful way to explore generative AI models in notebooks and improve your productivity in JupyterLab and the Jupyter Notebook. Specifically, Jupyter AI offers: * An `%%ai` magic that turns the Jupyter notebook into a reproducible generative AI playground. This works anywhere the IPython kernel runs (JupyterLab, Jupyter Notebook, Google Colab, Kaggle, VSCode, etc.). * A native chat UI in JupyterLab that enables you to work with generative AI as a conversational assistant. * Support for a wide range of generative model providers, including AI21, Anthropic, AWS, Cohere, Gemini, Hugging Face, NVIDIA, and OpenAI. * Local model support through GPT4All, enabling use of generative AI models on consumer grade machines with ease and privacy.

github

: 3.5k

khoj

Khoj is an open-source, personal AI assistant that extends your capabilities by creating always-available AI agents. You can share your notes and documents to extend your digital brain, and your AI agents have access to the internet, allowing you to incorporate real-time information. Khoj is accessible on Desktop, Emacs, Obsidian, Web, and Whatsapp, and you can share PDF, markdown, org-mode, notion files, and GitHub repositories. You'll get fast, accurate semantic search on top of your docs, and your agents can create deeply personal images and understand your speech. Khoj is self-hostable and always will be.

github

: 28.5k

langchain_dart

LangChain.dart is a Dart port of the popular LangChain Python framework created by Harrison Chase. LangChain provides a set of ready-to-use components for working with language models and a standard interface for chaining them together to formulate more advanced use cases (e.g. chatbots, Q&A with RAG, agents, summarization, extraction, etc.). The components can be grouped into a few core modules: * **Model I/O:** LangChain offers a unified API for interacting with various LLM providers (e.g. OpenAI, Google, Mistral, Ollama, etc.), allowing developers to switch between them with ease. Additionally, it provides tools for managing model inputs (prompt templates and example selectors) and parsing the resulting model outputs (output parsers). * **Retrieval:** assists in loading user data (via document loaders), transforming it (with text splitters), extracting its meaning (using embedding models), storing (in vector stores) and retrieving it (through retrievers) so that it can be used to ground the model's responses (i.e. Retrieval-Augmented Generation or RAG). * **Agents:** "bots" that leverage LLMs to make informed decisions about which available tools (such as web search, calculators, database lookup, etc.) to use to accomplish the designated task. The different components can be composed together using the LangChain Expression Language (LCEL).

github

: 497

danswer

Danswer is an open-source Gen-AI Chat and Unified Search tool that connects to your company's docs, apps, and people. It provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud. Since you own the deployment, your user data and chats are fully in your own control. Danswer is MIT licensed and designed to be modular and easily extensible. The system also comes fully ready for production usage with user authentication, role management (admin/basic users), chat persistence, and a UI for configuring Personas (AI Assistants) and their Prompts. Danswer also serves as a Unified Search across all common workplace tools such as Slack, Google Drive, Confluence, etc. By combining LLMs and team specific knowledge, Danswer becomes a subject matter expert for the team. Imagine ChatGPT if it had access to your team's unique knowledge! It enables questions such as "A customer wants feature X, is this already supported?" or "Where's the pull request for feature Y?"

github

: 10.5k

infinity

Infinity is an AI-native database designed for LLM applications, providing incredibly fast full-text and vector search capabilities. It supports a wide range of data types, including vectors, full-text, and structured data, and offers a fused search feature that combines multiple embeddings and full text. Infinity is easy to use, with an intuitive Python API and a single-binary architecture that simplifies deployment. It achieves high performance, with 0.1 milliseconds query latency on million-scale vector datasets and up to 15K QPS.

github

: 3.3k

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 855

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.3k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 30.6k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675