
open-responses
The OpenResponses API empowers developers to leverage the capabilities of various LLM providers through a familiar interface: the OpenAI Responses API structure.

README:
Unlock enterprise-grade AI capabilities through a single, powerful API: simplify development, accelerate deployment, and maintain complete data control.
OpenResponses revolutionizes how developers build AI applications by providing a comprehensive, production-ready toolkit with essential enterprise features, all through an elegantly simplified API interface. Stop cobbling together disparate tools and start building what matters.
Run OpenResponses locally to access an OpenAI-compatible API that works seamlessly with multiple model providers and supports unlimited tool integrations. Deploy a complete AI infrastructure on your own hardware with full data sovereignty.
docker run -p 8080:8080 masaicai/open-responses:latest
Once the container is running, the official OpenAI SDK can talk to it directly:
import os
from openai import OpenAI

# Point the OpenAI client at the local OpenResponses endpoint
openai_client = OpenAI(base_url="http://localhost:8080/v1", api_key=os.getenv("OPENAI_API_KEY"), default_headers={'x-model-provider': 'openai'})

response = openai_client.responses.create(
    model="gpt-4o-mini",
    input="Write a poem on Masaic"
)
print(response.output_text)
The same endpoint also works with the OpenAI Agents SDK:
import os
from agents import Agent, OpenAIResponsesModel
from openai import AsyncOpenAI

# The Agents SDK's Responses model wrapper expects an async OpenAI client
client = AsyncOpenAI(base_url="http://localhost:8080/v1", api_key=os.getenv("OPENAI_API_KEY"), default_headers={'x-model-provider': 'openai'})

agent = Agent(
    name="Assistant",
    instructions="You are a humorous poet who can write funny poems of 4 lines.",
    model=OpenAIResponsesModel(model="gpt-4o-mini", openai_client=client)
)
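To actually execute the agent, a minimal sketch using the Agents SDK's Runner (Runner and run_sync come from the openai-agents package and are not shown in the snippet above):
from agents import Runner

# Run the agent synchronously and print its final answer
result = Runner.run_sync(agent, "Write a poem on Masaic")
print(result.final_output)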
You can also call the API directly over HTTP:
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai' \
--data '{
    "model": "gpt-4o",
    "stream": false,
    "input": [
        {
            "role": "user",
            "content": "Write a poem on Masaic"
        }
    ]
}'
For detailed implementation instructions, see our Quick Start Guide.
- Getting Started
- Core Capabilities
- Key Problems Solved
- Why Engineering Teams Should Choose OpenResponses
- API Reference
- Coming Soon
- Frequently Asked Questions
- Configuration
- Documentation
- Local Development
- Production Use
- Contributing
- License
Feature | Description | Benefit |
---|---|---|
Automated Tracing | Comprehensive request and response monitoring | Track performance and usage without additional code |
Integrated RAG | Contextual information retrieval | Enhance responses with relevant external data automatically |
Pre-built Tool Integrations | Web search, GitHub access, and more | Deploy advanced capabilities instantly |
Self-Hosted Architecture | Full control of deployment infrastructure | Maintain complete data sovereignty |
OpenAI-Compatible Interface | Drop-in replacement for existing OpenAI implementations | Minimal code changes for migration |
OpenResponses targets recurring obstacles in shipping production AI applications:
- Feature Gap: Most open-source AI models lack critical enterprise capabilities required for production environments
- Integration Complexity: Implementing supplementary features like retrieval augmentation and monitoring requires significant development overhead
- Resource Diversion: Engineering teams spend excessive time on infrastructure rather than core application logic
- Data Privacy: Organizations with sensitive data face compliance barriers when using cloud-hosted AI services
- Operational Control: Many applications require full control over the AI processing pipeline
Engineering teams choose OpenResponses because it delivers:
- Developer Productivity: Focus engineering efforts on application features rather than infrastructure
- Production Readiness: Enterprise capabilities come batteries-included, out of the box
- Compliance Confidence: Deploy with data privacy requirements fully addressed
- Simplified Architecture: Consolidate AI infrastructure through widely used OpenAI API Specifications
The API implements the following OpenAI-compatible endpoints:
Endpoint | Description |
---|---|
POST /v1/responses | Create a new model response |
GET /v1/responses/{responseId} | Retrieve a specific response |
DELETE /v1/responses/{responseId} | Delete a response |
GET /v1/responses/{responseId}/input_items | List input items for a response |
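For example, a previously created response can be fetched by ID; a quick sketch (resp_abc123 is a hypothetical placeholder ID, and the headers follow the same conventions as the POST examples below):
curl --location 'http://localhost:8080/v1/responses/resp_abc123' \
--header 'Authorization: Bearer OPENAI_API_KEY' \
--header 'x-model-provider: openai'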
Replace the placeholder API keys with your own values.
Streaming request served by Groq:
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer GROQ_API_KEY' \
--data '{
    "model": "llama-3.2-3b-preview",
    "stream": true,
    "input": [
        {
            "role": "user",
            "content": "Write a poem on OpenResponses"
        }
    ]
}'
Non-streaming request routed to Anthropic Claude:
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ANTHROPIC_API_KEY' \
--header 'x-model-provider: claude' \
--data '{
    "model": "claude-3-5-sonnet-20241022",
    "stream": false,
    "input": [
        {
            "role": "user",
            "content": "Write a poem on OpenResponses"
        }
    ]
}'
Request that enables a built-in tool (Brave web search):
curl --location 'http://localhost:8080/v1/responses' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "model": "your-model",
    "stream": false,
    "tools": [
        {
            "type": "brave_web_search"
        }
    ],
    "input": [
        {
            "role": "user",
            "content": "What are the latest developments in AI?"
        }
    ]
}'
We're continuously evolving OpenResponses with powerful new features to elevate your AI applications even further. Stay tuned!
- Can I use my own provider API keys? Yes! OpenResponses acts as a pass-through to the provider APIs using your own keys.
- Does the gateway add latency? Our benchmarks show minimal overhead compared to direct API calls.
- How are provider errors reported? OpenResponses standardizes error responses across providers:
{
    "type": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please try again in 30 seconds.",
    "param": null,
    "code": "rate_limit"
}
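Because the shape is uniform, client-side handling does not need provider-specific branches. A minimal Python sketch, assuming the top-level error fields shown above:
import requests

resp = requests.post(
    "http://localhost:8080/v1/responses",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer OPENAI_API_KEY",
        "x-model-provider": "openai",
    },
    json={"model": "gpt-4o-mini", "input": [{"role": "user", "content": "Hello"}]},
)
if not resp.ok:
    err = resp.json()
    # Same fields regardless of which provider produced the error
    if err.get("type") == "rate_limit_exceeded":
        print("Rate limited:", err.get("message"))
    else:
        print("Error", err.get("code"), "-", err.get("message"))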
The application supports the following environment variables:
Variable | Description | Default |
---|---|---|
MCP_SERVER_CONFIG_FILE_PATH | Path to MCP server configuration | - |
MASAIC_MAX_TOOL_CALLS | Maximum number of allowed tool calls | 10 |
MASAIC_MAX_STREAMING_TIMEOUT | Maximum streaming timeout in ms | 60000 |
SPRING_PROFILES_ACTIVE | Set to otel to enable OpenTelemetry exports | - |
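These variables can be passed directly to the container at startup; for example (the values here are illustrative, not recommendations):
docker run -p 8080:8080 \
  -e MASAIC_MAX_TOOL_CALLS=25 \
  -e MASAIC_MAX_STREAMING_TIMEOUT=120000 \
  -e SPRING_PROFILES_ACTIVE=otel \
  masaicai/open-responses:latest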
For more details on granular configuration options, refer to the documentation.
Explore our comprehensive documentation to learn more about OpenResponses features, configuration options, and integration methods.
Follow these instructions to set up the project locally for development:
- Java JDK 21+
- Gradle (optional, as project includes Gradle Wrapper)
- Docker (optional, for containerized setup)
- Clone the repository
git clone https://github.com/masaic-ai-platform/open-responses.git
cd open-responses
- Build the project
Use the Gradle Wrapper included in the project:
./gradlew build
- Configure Environment Variables
Create or update the application.properties file under src/main/resources with the necessary configuration:
server.port: 8080
Set any additional configuration required by your project.
- Run the server
To start the server in development mode:
./gradlew bootRun
Build and run the application using Docker:
./gradlew build
docker build -t openresponses .
docker run -p 8080:8080 -d openresponses
Run the tests with:
./gradlew test
Alpha Release Disclaimer: This project is currently in alpha stage. The API and features are subject to breaking changes as we continue to evolve and improve the platform. While we strive to maintain stability, please be aware that updates may require modifications to your integration code.
Contributions are welcome! Please feel free to submit a Pull Request.
"Alone we can do so little; together we can do so much." โ Helen Keller
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Made with ❤️ by the Masaic AI Team
Similar Open Source Tools

quantalogic
QuantaLogic is a ReAct framework for building advanced AI agents that seamlessly integrates large language models with a robust tool system. It aims to bridge the gap between advanced AI models and practical implementation in business processes by enabling agents to understand, reason about, and execute complex tasks through natural language interaction. The framework includes features such as ReAct Framework, Universal LLM Support, Secure Tool System, Real-time Monitoring, Memory Management, and Enterprise Ready components.

arxiv-mcp-server
The ArXiv MCP Server acts as a bridge between AI assistants and arXiv's research repository, enabling AI models to search for and access papers programmatically through the Model Context Protocol (MCP). It offers features like paper search, access, listing, local storage, and research prompts. Users can install it via Smithery or manually for Claude Desktop. The server provides tools for paper search, download, listing, and reading, along with specialized prompts for paper analysis. Configuration can be done through environment variables, and testing is supported with a test suite. The tool is released under the MIT License and is developed by the Pearl Labs Team.

lumen
Lumen is a command-line tool that leverages AI to enhance your git workflow. It assists in generating commit messages, understanding changes, interactive searching, and analyzing impacts without the need for an API key. With smart commit messages, git history insights, interactive search, change analysis, and rich markdown output, Lumen offers a seamless and flexible experience for users across various git workflows.

ai-gateway
LangDB AI Gateway is an open-source enterprise AI gateway built in Rust. It provides a unified interface to all LLMs using the OpenAI API format, focusing on high performance, enterprise readiness, and data control. The gateway offers features like comprehensive usage analytics, cost tracking, rate limiting, data ownership, and detailed logging. It supports various LLM providers and provides OpenAI-compatible endpoints for chat completions, model listing, embeddings generation, and image generation. Users can configure advanced settings, such as rate limiting, cost control, dynamic model routing, and observability with OpenTelemetry tracing. The gateway can be run with Docker Compose and integrated with MCP tools for server communication.

LLMVoX
LLMVoX is a lightweight 30M-parameter, LLM-agnostic, autoregressive streaming Text-to-Speech (TTS) system designed to convert text outputs from Large Language Models into high-fidelity streaming speech with low latency. It achieves significantly lower Word Error Rate compared to speech-enabled LLMs while operating at comparable latency and speech quality. Key features include being lightweight & fast with only 30M parameters, LLM-agnostic for easy integration with existing models, multi-queue streaming for continuous speech generation, and multilingual support for easy adaptation to new languages.

pilottai
PilottAI is a Python framework for building autonomous multi-agent systems with advanced orchestration capabilities. It provides enterprise-ready features for building scalable AI applications. The framework includes hierarchical agent systems, production-ready features like asynchronous processing and fault tolerance, advanced memory management with semantic storage, and integrations with multiple LLM providers and custom tools. PilottAI offers specialized agents for various tasks such as customer service, document processing, email handling, knowledge acquisition, marketing, research analysis, sales, social media, and web search. The framework also provides documentation, example use cases, and advanced features like memory management, load balancing, and fault tolerance.

one
ONE is a modern web and AI agent development toolkit that empowers developers to build AI-powered applications with high performance, beautiful UI, AI integration, responsive design, type safety, and great developer experience. It is perfect for building modern web applications, from simple landing pages to complex AI-powered platforms.

LightRAG
LightRAG is a repository hosting the code for LightRAG, a system that supports seamless integration of custom knowledge graphs, Oracle Database 23ai, Neo4J for storage, and multiple file types. It includes features like entity deletion, batch insert, incremental insert, and graph visualization. LightRAG provides an API server implementation for RESTful API access to RAG operations, allowing users to interact with it through HTTP requests. The repository also includes evaluation scripts, code for reproducing results, and a comprehensive code structure.

Scrapegraph-ai
ScrapeGraphAI is a Python library that uses Large Language Models (LLMs) and direct graph logic to create web scraping pipelines for websites, documents, and XML files. It allows users to extract specific information from web pages by providing a prompt describing the desired data. ScrapeGraphAI supports various LLM backends, including Ollama, OpenAI, and Gemini, enabling users to choose the most suitable model for their needs. The library provides a user-friendly interface through its `SmartScraper` class, which simplifies the process of building and executing scraping pipelines. ScrapeGraphAI is open-source and available on GitHub, with extensive documentation and examples to guide users. It is particularly useful for researchers and data scientists who need to extract structured data from web pages for analysis and exploration.

paperless-gpt
paperless-gpt is a tool designed to generate accurate and meaningful document titles and tags for paperless-ngx using Large Language Models (LLMs). It supports multiple LLM providers, including OpenAI and Ollama. With paperless-gpt, you can streamline your document management by automatically suggesting appropriate titles and tags based on the content of your scanned documents. The tool offers features like multiple LLM support, customizable prompts, easy integration with paperless-ngx, user-friendly interface for reviewing and applying suggestions, dockerized deployment, automatic document processing, and an experimental OCR feature.