
openinference
OpenTelemetry Instrumentation for AI Observability
Stars: 587

OpenInference is a set of conventions and plugins that complement OpenTelemetry to enable tracing of AI applications. It provides a way to capture and analyze the performance and behavior of AI models, including their interactions with other components of the application. OpenInference is designed to be language-agnostic and can be used with any OpenTelemetry-compatible backend. It includes a set of instrumentations for popular machine learning SDKs and frameworks, making it easy to add tracing to your AI applications.
README:
OpenInference is a set of conventions and plugins that is complementary to OpenTelemetry, enabling tracing of AI applications. OpenInference is natively supported by arize-phoenix, but can be used with any OpenTelemetry-compatible backend.
The OpenInference specification is edited in Markdown files found in the spec directory. It is designed to provide insight into the invocation of LLMs and the surrounding application context, such as retrieval from vector stores and the use of external tools like search engines or APIs. The specification is transport- and file-format-agnostic and is intended to be used in conjunction with serialization formats such as JSON, Protobuf, and DataFrames.
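As a concrete illustration, here is a sketch of the flat key/value attributes an OpenInference LLM span might carry. The attribute names follow the OpenInference semantic conventions (e.g. `openinference.span.kind`, `llm.model_name`); the specific values are hypothetical.

```python
import json

# Illustrative sketch: attributes of a single OpenInference LLM span.
# Names follow the OpenInference semantic conventions; values are made up.
llm_span_attributes = {
    "openinference.span.kind": "LLM",  # other kinds include CHAIN, RETRIEVER, TOOL, AGENT
    "llm.model_name": "gpt-4",
    "llm.input_messages.0.message.role": "user",
    "llm.input_messages.0.message.content": "What is OpenTelemetry?",
    "llm.token_count.prompt": 12,
    "llm.token_count.completion": 48,
    "llm.token_count.total": 60,
    "output.value": "OpenTelemetry is an observability framework...",
}

# Because the attributes are flat primitives, the same span serializes
# cleanly to JSON, Protobuf, or a DataFrame row -- the format-agnosticism
# the specification aims for.
serialized = json.dumps(llm_span_attributes)
```

Note how nested structures (like a list of chat messages) are flattened into dotted, indexed keys, which is what keeps the conventions compatible with any OpenTelemetry attribute store.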
OpenInference provides a set of instrumentations for popular machine learning SDKs and frameworks in a variety of languages.
Python libraries:

Package | Description
---|---
openinference-semantic-conventions | Semantic conventions for tracing of LLM apps.
openinference-instrumentation | Reusable utilities, decorators, configurations, and helpers for instrumentation.
openinference-instrumentation-agno | OpenInference instrumentation for Agno agents.
openinference-instrumentation-openai | OpenInference instrumentation for the OpenAI SDK.
openinference-instrumentation-openai-agents | OpenInference instrumentation for the OpenAI Agents SDK.
openinference-instrumentation-llama-index | OpenInference instrumentation for LlamaIndex.
openinference-instrumentation-dspy | OpenInference instrumentation for DSPy.
openinference-instrumentation-bedrock | OpenInference instrumentation for AWS Bedrock.
openinference-instrumentation-langchain | OpenInference instrumentation for LangChain.
openinference-instrumentation-mcp | OpenInference instrumentation for MCP.
openinference-instrumentation-mistralai | OpenInference instrumentation for MistralAI.
openinference-instrumentation-portkey | OpenInference instrumentation for Portkey.
openinference-instrumentation-guardrails | OpenInference instrumentation for Guardrails.
openinference-instrumentation-vertexai | OpenInference instrumentation for VertexAI.
openinference-instrumentation-crewai | OpenInference instrumentation for CrewAI.
openinference-instrumentation-haystack | OpenInference instrumentation for Haystack.
openinference-instrumentation-litellm | OpenInference instrumentation for LiteLLM.
openinference-instrumentation-groq | OpenInference instrumentation for Groq.
openinference-instrumentation-instructor | OpenInference instrumentation for Instructor.
openinference-instrumentation-anthropic | OpenInference instrumentation for Anthropic.
openinference-instrumentation-beeai | OpenInference instrumentation for BeeAI.
openinference-instrumentation-google-genai | OpenInference instrumentation for Google GenAI.
openinference-instrumentation-google-adk | OpenInference instrumentation for Google ADK.
openinference-instrumentation-autogen-agentchat | OpenInference instrumentation for Microsoft Autogen AgentChat.
openinference-instrumentation-pydantic-ai | OpenInference instrumentation for PydanticAI.
openinference-instrumentation-smolagents | OpenInference instrumentation for smolagents.
OpenInference can also normalize data produced by other instrumentation libraries, via span processors that convert their spans to OpenInference conventions:
Package | Description
---|---
openinference-instrumentation-openlit | OpenInference span processor for OpenLIT traces.
openinference-instrumentation-openllmetry | OpenInference span processor for OpenLLMetry (Traceloop) traces.
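Conceptually, normalization amounts to rewriting attribute names from one convention to another. The following is a hedged, stdlib-only sketch of that idea, not the actual processors' code; the `gen_ai.*` source names follow the OpenTelemetry GenAI conventions used by libraries like OpenLIT and OpenLLMetry, and the mapping shown is a small illustrative subset.

```python
# Illustrative subset of a convention-to-convention attribute mapping:
# OpenTelemetry GenAI attribute names -> OpenInference attribute names.
ATTRIBUTE_MAP = {
    "gen_ai.request.model": "llm.model_name",
    "gen_ai.usage.input_tokens": "llm.token_count.prompt",
    "gen_ai.usage.output_tokens": "llm.token_count.completion",
}

def normalize(attributes: dict) -> dict:
    """Return a copy with known attribute names rewritten; unknown keys pass through."""
    return {ATTRIBUTE_MAP.get(key, key): value for key, value in attributes.items()}

# Hypothetical incoming span attributes in GenAI style:
span_attrs = {
    "gen_ai.request.model": "gpt-4",
    "gen_ai.usage.input_tokens": 12,
    "custom.attr": "kept-as-is",
}
normalized = normalize(span_attrs)
```

The real processors plug this kind of rewrite into the OpenTelemetry span-processing pipeline, so downstream backends see a single unified schema regardless of which library emitted the span.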
Python examples:

Name | Description | Complexity Level
---|---|---
Agno | Agno agent examples | Beginner
OpenAI SDK | OpenAI Python SDK, including chat completions and embeddings | Beginner
MistralAI SDK | MistralAI Python SDK | Beginner
VertexAI SDK | VertexAI Python SDK | Beginner
LlamaIndex | LlamaIndex query engines | Beginner
DSPy | DSPy primitives and custom RAG modules | Beginner
Boto3 Bedrock Client | Boto3 Bedrock client | Beginner
LangChain | LangChain primitives and simple chains | Beginner
LiteLLM | A lightweight LiteLLM framework | Beginner
LiteLLM Proxy | LiteLLM Proxy to log OpenAI, Azure, Vertex, and Bedrock calls | Beginner
Groq | Groq and AsyncGroq chat completions | Beginner
Anthropic | Anthropic Messages client | Beginner
BeeAI | Agentic instrumentation in the BeeAI framework | Beginner
LlamaIndex + Next.js Chatbot | A fully functional chatbot using Next.js and a LlamaIndex FastAPI backend | Intermediate
LangServe | A LangChain application deployed with LangServe using custom per-request metadata | Intermediate
DSPy | A DSPy RAG application using FastAPI, Weaviate, and Cohere | Intermediate
Haystack | A Haystack QA RAG application | Intermediate
OpenAI Agents | OpenAI Agents with handoffs | Intermediate
Autogen AgentChat | Microsoft Autogen Assistant Agent and Team Chat | Intermediate
PydanticAI | PydanticAI agent examples | Intermediate
JavaScript/TypeScript packages:

Package | Description
---|---
@arizeai/openinference-semantic-conventions | Semantic conventions for tracing of LLM apps.
@arizeai/openinference-core | Reusable utilities, configuration, and helpers for instrumentation.
@arizeai/openinference-instrumentation-bedrock | OpenInference instrumentation for AWS Bedrock.
@arizeai/openinference-instrumentation-bedrock-agent-runtime | OpenInference instrumentation for AWS Bedrock Agent Runtime.
@arizeai/openinference-instrumentation-beeai | OpenInference instrumentation for BeeAI.
@arizeai/openinference-instrumentation-langchain | OpenInference instrumentation for LangChain.js.
@arizeai/openinference-instrumentation-mcp | OpenInference instrumentation for MCP.
@arizeai/openinference-instrumentation-openai | OpenInference instrumentation for the OpenAI SDK.
@arizeai/openinference-vercel | OpenInference support for the Vercel AI SDK.
@arizeai/openinference-mastra | OpenInference support for Mastra.
JavaScript/TypeScript examples:

Name | Description | Complexity Level
---|---|---
OpenAI SDK | OpenAI Node.js client | Beginner
BeeAI framework - ReAct agent | Agentic ReActAgent instrumentation in the BeeAI framework | Beginner
BeeAI framework - ToolCalling agent | Agentic ToolCallingAgent instrumentation in the BeeAI framework | Beginner
BeeAI framework - LLM | How to run instrumentation for only the LLM module of the BeeAI framework | Beginner
LlamaIndex Express App | A fully functional LlamaIndex chatbot with a Next.js frontend and a LlamaIndex Express backend, instrumented using openinference-instrumentation-openai | Intermediate
LangChain OpenAI | A simple script to call OpenAI via LangChain, instrumented using openinference-instrumentation-langchain | Beginner
LangChain RAG Express App | A fully functional LangChain chatbot that uses RAG to answer user questions, with a Next.js frontend and a LangChain Express backend, instrumented using openinference-instrumentation-langchain | Intermediate
Next.js + OpenAI | A Next.js 13 project bootstrapped with create-next-app that uses OpenAI to generate text | Beginner
Java packages:

Package | Description
---|---
openinference-semantic-conventions | Semantic conventions for tracing of LLM apps.
openinference-instrumentation | Base instrumentation utilities.
openinference-instrumentation-langchain4j | OpenInference instrumentation for LangChain4j.
openinference-instrumentation-springAI | OpenInference instrumentation for Spring AI.
Java examples:

Name | Description | Complexity Level
---|---|---
LangChain4j Example | Simple example using LangChain4j with OpenAI | Beginner
Spring AI Example | Spring AI example with OpenAI and tool calling | Beginner
OpenInference supports the following destinations as span collectors.
- ✅ Arize-Phoenix
- ✅ Arize
- ✅ Any OTEL-compatible collector
Join our community to connect with thousands of machine learning practitioners and LLM observability enthusiasts!
- 🌍 Join our Slack community.
- 💡 Ask questions and provide feedback in the #phoenix-support channel.
- 🌟 Leave a star on our GitHub.
- 🐞 Report bugs with GitHub Issues.
- 𝕏 Follow us on X.
- 🗺️ Check out our roadmap to see where we're heading next.
Similar Open Source Tools


phoenix
Phoenix is a tool that provides MLOps and LLMOps insights at lightning speed with zero-config observability. It offers a notebook-first experience for monitoring models and LLM Applications by providing LLM Traces, LLM Evals, Embedding Analysis, RAG Analysis, and Structured Data Analysis. Users can trace through the execution of LLM Applications, evaluate generative models, explore embedding point-clouds, visualize generative application's search and retrieval process, and statistically analyze structured data. Phoenix is designed to help users troubleshoot problems related to retrieval, tool execution, relevance, toxicity, drift, and performance degradation.

TRACE
TRACE is a temporal grounding video model that utilizes causal event modeling to capture videos' inherent structure. It presents a task-interleaved video LLM model tailored for sequential encoding/decoding of timestamps, salient scores, and textual captions. The project includes various model checkpoints for different stages and fine-tuning on specific datasets. It provides evaluation codes for different tasks like VTG, MVBench, and VideoMME. The repository also offers annotation files and links to raw videos preparation projects. Users can train the model on different tasks and evaluate the performance based on metrics like CIDER, METEOR, SODA_c, F1, mAP, Hit@1, etc. TRACE has been enhanced with trace-retrieval and trace-uni models, showing improved performance on dense video captioning and general video understanding tasks.

hcaptcha-challenger
hCaptcha Challenger is a tool designed to gracefully face hCaptcha challenges using a multimodal large language model. It does not rely on Tampermonkey scripts or third-party anti-captcha services, instead implementing interfaces for 'AI vs AI' scenarios. The tool supports various challenge types such as image labeling, drag and drop, and advanced tasks like self-supervised challenges and Agentic Workflow. Users can access documentation in multiple languages and leverage resources for tasks like model training, dataset annotation, and model upgrading. The tool aims to enhance user experience in handling hCaptcha challenges with innovative AI capabilities.

llm-export
llm-export is a tool for exporting LLM models to ONNX and MNN formats. Its features include passing ONNX Runtime correctness tests, optimizing the original code to support dynamic shapes, reducing constant parts, optimizing ONNX models with OnnxSlim for better performance, and exporting LoRA weights to ONNX and MNN formats. Users clone the project locally, clone the desired LLM project, and use LLMExporter to export the model. The tool supports export options such as exporting the entire model as a single ONNX model, exporting model segments as multiple models, exporting the model vocabulary to a text file, exporting specific layers such as Embedding and lm_head, testing the model with queries, validating ONNX model consistency with ONNX Runtime, and converting ONNX models to MNN models. Users can specify export paths, skip optimization steps, and merge LoRA weights before exporting.

airbyte-connectors
This repository contains Airbyte connectors used in Faros and Faros Community Edition platforms as well as Airbyte Connector Development Kit (CDK) for JavaScript/TypeScript.

nntrainer
NNtrainer is a software framework for training neural network models on devices with limited resources. It enables on-device fine-tuning of neural networks using user data for personalization. NNtrainer supports various machine learning algorithms and provides examples for tasks such as few-shot learning, ResNet, VGG, and product rating. It is optimized for embedded devices and utilizes CBLAS and CUBLAS for accelerated calculations. NNtrainer is open source and released under the Apache License version 2.0.

oumi
Oumi is an open-source platform for building state-of-the-art foundation models, offering tools for data preparation, training, evaluation, and deployment. It supports training and fine-tuning models with various parameters, working with text and multimodal models, synthesizing and curating training data, deploying models efficiently, evaluating models comprehensively, and running on different platforms. Oumi provides a consistent API, reliability, and flexibility for research purposes.

web-builder
Web Builder is a low-code front-end framework based on Material for Angular, offering a rich component library for excellent digital innovation experience. It allows rapid construction of modern responsive UI, multi-theme, multi-language web pages through drag-and-drop visual configuration. The framework includes a beautiful admin theme, complete front-end solutions, and AI integration in the Pro version for optimizing copy, creating components, and generating pages with a single sentence.

LLM-Finetune
LLM-Finetune is a repository for fine-tuning language models for various NLP tasks such as text classification and named entity recognition. It provides instructions and scripts for training and inference using models like Qwen2-VL and GLM4. The repository also includes datasets for tasks like text classification, named entity recognition, and multimodal tasks. Users can easily prepare the environment, download datasets, train models, and perform inference using the provided scripts and notebooks. Additionally, the repository references SwanLab, an AI training record, analysis, and visualization tool.

aidea-server
AIdea Server is an open-source Golang-based server that integrates mainstream large language models and drawing models. It supports various functionalities including OpenAI's GPT-3.5 and GPT-4, Anthropic's Claude instant and Claude 2.1, Google's Gemini Pro, as well as Chinese models like Tongyi Qianwen, Wenxin Yiyuan, and more. It also supports open-source large models like Yi 34B, Llama2, and AquilaChat 7B. Additionally, it provides features for text-to-image, super-resolution, coloring black and white images, generating art fonts and QR codes, among others.

visionOS-examples
visionOS-examples is a repository containing accelerators for Spatial Computing. It includes examples such as Local Large Language Model, Chat Apple Vision Pro, WebSockets, Anchor To Head, Hand Tracking, Battery Life, Countdown, Plane Detection, Timer Vision, and PencilKit for visionOS. The repository showcases various functionalities and features for Apple Vision Pro, offering tools for developers to enhance their visionOS apps with capabilities like hand tracking, plane detection, and real-time cryptocurrency prices.

awesome-llm-webapps
This repository is a curated list of open-source, actively maintained web applications that leverage large language models (LLMs) for various use cases, including chatbots, natural language interfaces, assistants, and question answering systems. The projects are evaluated based on key criteria such as licensing, maintenance status, complexity, and features, to help users select the most suitable starting point for their LLM-based applications. The repository welcomes contributions and encourages users to submit projects that meet the criteria or suggest improvements to the existing list.

Awesome-LLM-Constrained-Decoding
Awesome-LLM-Constrained-Decoding is a curated list of papers, code, and resources related to constrained decoding of Large Language Models (LLMs). The repository aims to facilitate reliable, controllable, and efficient generation with LLMs by providing a comprehensive collection of materials in this domain.

VoiceBench
VoiceBench is a repository containing code and data for benchmarking LLM-Based Voice Assistants. It includes a leaderboard with rankings of various voice assistant models based on different evaluation metrics. The repository provides setup instructions, datasets, evaluation procedures, and a curated list of awesome voice assistants. Users can submit new voice assistant results through the issue tracker for updates on the ranking list.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.