optillm
Optimizing inference proxy for LLMs
Stars: 955
optillm is an OpenAI API compatible optimizing inference proxy implementing state-of-the-art techniques to enhance accuracy and performance of LLMs, focusing on reasoning over coding, logical, and mathematical queries. By leveraging additional compute at inference time, it surpasses frontier models across diverse tasks.
README:
optillm is an OpenAI API compatible optimizing inference proxy which implements several state-of-the-art techniques that can improve the accuracy and performance of LLMs. The current focus is on implementing techniques that improve reasoning over coding, logical and mathematical queries. It is possible to beat the frontier models using these techniques across diverse tasks by doing additional compute at inference time.
Clone the repository with git
and use pip install
to setup the dependencies.
git clone https://github.com/codelion/optillm.git
cd optillm
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Set up the OPENAI_API_KEY
environment variable (for OpenAI)
or the AZURE_OPENAI_API_KEY
, AZURE_API_VERSION
and AZURE_API_BASE
environment variables (for Azure OpenAI)
or the AZURE_API_VERSION
and AZURE_API_BASE
environment variables and login using az login
for Azure OpenAI with managed identity (see here).
You can then run the optillm proxy as follows.
python optillm.py
2024-09-06 07:57:14,191 - INFO - Starting server with approach: auto
2024-09-06 07:57:14,191 - INFO - Server configuration: {'approach': 'auto', 'mcts_simulations': 2, 'mcts_exploration': 0.2, 'mcts_depth': 1, 'best_of_n': 3, 'model': 'gpt-4o-mini', 'rstar_max_depth': 3, 'rstar_num_rollouts': 5, 'rstar_c': 1.4, 'base_url': ''}
* Serving Flask app 'optillm'
* Debug mode: off
2024-09-06 07:57:14,212 - INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:8000
* Running on http://192.168.10.48:8000
2024-09-06 07:57:14,212 - INFO - Press CTRL+C to quit
- Set the
OPENAI_API_KEY
env variable to a placeholder value- e.g.
export OPENAI_API_KEY="no_key"
- e.g.
- Run
./llama-server -c 4096 -m path_to_model
to start the server with the specified model and a context length of 4096 tokens - Run
python3 optillm.py --base_url base_url
to start the proxy- e.g. for llama.cpp, run
python3 optillm.py --base_url http://localhost:8080/v1
- e.g. for llama.cpp, run
[!WARNING] Note that llama-server currently does not support sampling multiple responses from a model, which limits the available approaches to the following:
cot_reflection
,leap
,plansearch
,rstar
,rto
,self_consistency
,re2
, andz3
. In order to use other approaches, consider using an alternative compatible server such as ollama.
[!NOTE] You'll later need to specify a model name in the OpenAI client configuration. Since llama-server was started with a single model, you can choose any name you want.
Once the proxy is running, you can use it as a drop in replacement for an OpenAI client by setting the base_url
as http://localhost:8000/v1
.
import os
from openai import OpenAI
OPENAI_KEY = os.environ.get("OPENAI_API_KEY")
OPENAI_BASE_URL = "http://localhost:8000/v1"
client = OpenAI(api_key=OPENAI_KEY, base_url=OPENAI_BASE_URL)
response = client.chat.completions.create(
model="moa-gpt-4o",
messages=[
{
"role": "user",
"content": "Write a Python program to build an RL model to recite text from any position that the user provides, using only numpy."
}
],
temperature=0.2
)
print(response)
The code above applies to both OpenAI and Azure OpenAI, just remember to populate the OPENAI_API_KEY
env variable with the proper key.
There are multiple ways to control the optimization techniques, they are applied in the follow order of preference:
- You can control the technique you use for optimization by prepending the slug to the model name
{slug}-model-name
. E.g. in the above code we are usingmoa
or mixture of agents as the optimization approach. In the proxy logs you will see the following showing themoa
is been used with the base model asgpt-4o-mini
.
2024-09-06 08:35:32,597 - INFO - Using approach moa, with gpt-4o-mini
2024-09-06 08:35:35,358 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-09-06 08:35:39,553 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-09-06 08:35:44,795 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-09-06 08:35:44,797 - INFO - 127.0.0.1 - - [06/Sep/2024 08:35:44] "POST /v1/chat/completions HTTP/1.1" 200 -
- Or, you can pass the slug in the
optillm_approach
field in theextra_body
.
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{ "role": "user","content": "" }],
temperature=0.2,
extra_body={"optillm_approach": "bon|moa|mcts"}
)
- Or, you can just mention the approach in either your
system
oruser
prompt, within<optillm_approach> </optillm_approach>
tags.
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{ "role": "user","content": "<optillm_approach>re2</optillm_approach> How many r's are there in strawberry?" }],
temperature=0.2
)
[!TIP] You can also combine different techniques either by using symbols
&
and|
. When you use&
the techniques are processed in the order from left to right in a pipeline with response from previous stage used as request to the next. While, with|
we run all the requests in parallel and generate multiple responses that are returned as a list.
Please note that the convention described above works only when the optillm server has been started with inference approach set to auto
. Otherwise, the model
attribute in the client request must be set with the model name only.
We now suport all LLM providers (by wrapping around the LiteLLM sdk). E.g. you can use the Gemini Flash model with moa
by setting passing the api key in the environment variable os.environ['GEMINI_API_KEY']
and then calling the model moa-gemini/gemini-1.5-flash-002
. In the output you will then see that LiteLLM is being used to call the base model.
9:43:21 - LiteLLM:INFO: utils.py:2952 -
LiteLLM completion() model= gemini-1.5-flash-002; provider = gemini
2024-09-29 19:43:21,011 - INFO -
LiteLLM completion() model= gemini-1.5-flash-002; provider = gemini
2024-09-29 19:43:21,481 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-002:generateContent?key=[redacted] "HTTP/1.1 200 OK"
19:43:21 - LiteLLM:INFO: utils.py:988 - Wrapper: Completed Call, calling success_handler
2024-09-29 19:43:21,483 - INFO - Wrapper: Completed Call, calling success_handler
19:43:21 - LiteLLM:INFO: utils.py:2952 -
LiteLLM completion() model= gemini-1.5-flash-002; provider = gemini
[!TIP] optillm is a transparent proxy and will work with any LLM API or provider that has an OpenAI API compatible chat completions endpoint, and in turn, optillm also exposes the same OpenAI API compatible chat completions endpoint. This should allow you to integrate it into any existing tools or frameworks easily. If the LLM you want to use doesn't have an OpenAI API compatible endpoint (like Google or Anthropic) you can use LiteLLM proxy server that supports most LLMs.
The following sequence diagram illustrates how the request and responses go through optillm.
In the diagram:
-
A
is an existing tool (like oobabooga), framework (like patchwork) or your own code where you want to use the results from optillm. You can use it directly using any OpenAI client sdk. -
B
is the optillm service (running directly or in a docker container) that will send requests to thebase_url
. -
C
is any service providing an OpenAI API compatible chat completions endpoint.
Approach | Slug | Description |
---|---|---|
CoT with Reflection | cot_reflection |
Implements chain-of-thought reasoning with <thinking>, <reflection> and <output> sections |
PlanSearch | plansearch |
Implements a search algorithm over candidate plans for solving a problem in natural language |
ReRead | re2 |
Implements rereading to improve reasoning by processing queries twice |
Self-Consistency | self_consistency |
Implements an advanced self-consistency method |
Z3 Solver | z3 |
Utilizes the Z3 theorem prover for logical reasoning |
R* Algorithm | rstar |
Implements the R* algorithm for problem-solving |
LEAP | leap |
Learns task-specific principles from few shot examples |
Round Trip Optimization | rto |
Optimizes responses through a round-trip process |
Best of N Sampling | bon |
Generates multiple responses and selects the best one |
Mixture of Agents | moa |
Combines responses from multiple critiques |
Monte Carlo Tree Search | mcts |
Uses MCTS for decision-making in chat responses |
PV Game | pvg |
Applies a prover-verifier game approach at inference time |
CoT Decoding | N/A for proxy | Implements chain-of-thought decoding to elicit reasoning without explicit prompting |
Plugin | Slug | Description |
---|---|---|
Memory | memory |
Implements a short term memory layer, enables you to use unbounded context length with any LLM |
Privacy | privacy |
Anonymize PII data in request and deanonymize it back to original value in response |
Read URLs | readurls |
Reads all URLs found in the request, fetches the content at the URL and adds it to the context |
Execute Code | executecode |
Enables use of code interpreter to execute python code in requests and LLM generated responses |
optillm supports various command-line arguments and environment variables for configuration.
Parameter | Description | Default Value |
---|---|---|
--approach |
Inference approach to use | "auto" |
--simulations |
Number of MCTS simulations | 2 |
--exploration |
Exploration weight for MCTS | 0.2 |
--depth |
Simulation depth for MCTS | 1 |
--best-of-n |
Number of samples for best_of_n approach | 3 |
--model |
OpenAI model to use | "gpt-4o-mini" |
--base-url |
Base URL for OpenAI compatible endpoint | "" |
--rstar-max-depth |
Maximum depth for rStar algorithm | 3 |
--rstar-num-rollouts |
Number of rollouts for rStar algorithm | 5 |
--rstar-c |
Exploration constant for rStar algorithm | 1.4 |
--n |
Number of final responses to be returned | 1 |
--return-full-response |
Return the full response including the CoT with tags | False |
--port |
Specify the port to run the proxy | 8000 |
--optillm-api-key |
Optional API key for client authentication to optillm | "" |
When using Docker, these can be set as environment variables prefixed with OPTILLM_
.
optillm can optionally be built and run using Docker and the provided Dockerfile.
-
Make sure you have Docker and Docker Compose installed on your system.
-
Either update the environment variables in the docker-compose.yaml file or create a
.env
file in the project root directory and add any environment variables you want to set. For example, to set the OpenAI API key, add the following line to the.env
file:OPENAI_API_KEY=your_openai_api_key_here
-
Run the following command to start optillm:
docker compose up -d
This will build the Docker image if it doesn't exist and start the optillm service.
-
optillm will be available at
http://localhost:8000
.
When using Docker, you can set these parameters as environment variables. For example, to set the approach and model, you would use:
OPTILLM_APPROACH=mcts
OPTILLM_MODEL=gpt-4
To secure the optillm proxy with an API key, set the OPTILLM_API_KEY
environment variable:
OPTILLM_API_KEY=your_secret_api_key
When the API key is set, clients must include it in their requests using the Authorization
header:
Authorization: Bearer your_secret_api_key
Model | Accuracy |
---|---|
readlurls&memory-gpt-4o-mini | 65.66 |
gpt-4o-mini | 50.0 |
Gemini Flash 1.5 | 66.5 |
Gemini Pro 1.5 | 72.9 |
Model | pass@1 | pass@5 | pass@10 |
---|---|---|---|
plansearch-gpt-4o-mini | 44.03 | 59.31 | 63.5 |
gpt-4o-mini | 43.9 | 50.61 | 53.25 |
claude-3.5-sonnet | 51.3 | ||
gpt-4o-2024-05-13 | 45.2 | ||
gpt-4-turbo-2024-04-09 | 44.2 |
Since optillm is a drop-in replacement for OpenAI API you can easily integrate it with existing tools and frameworks using the OpenAI client. We used optillm with patchwork which is an open-source framework that automates development gruntwork like PR reviews, bug fixing, security patching using workflows called patchflows. We saw huge performance gains across all the supported patchflows as shown below when using the mixutre of agents approach (moa).
- Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation
- Writing in the Margins: Better Inference Pattern for Long Context Retrieval
- Chain-of-Thought Reasoning Without Prompting
- Re-Reading Improves Reasoning in Large Language Models
- In-Context Principle Learning from Mistakes
- Planning In Natural Language Improves LLM Search For Code Generation
- Self-Consistency Improves Chain of Thought Reasoning in Language Models
- Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
- Mixture-of-Agents Enhances Large Language Model Capabilities
- Prover-Verifier Games improve legibility of LLM outputs
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning
- Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
- Patched MOA: optimizing inference for diverse software development tasks
- Patched RTC: evaluating LLMs for diverse software development tasks
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for optillm
Similar Open Source Tools
optillm
optillm is an OpenAI API compatible optimizing inference proxy implementing state-of-the-art techniques to enhance accuracy and performance of LLMs, focusing on reasoning over coding, logical, and mathematical queries. By leveraging additional compute at inference time, it surpasses frontier models across diverse tasks.
promptfoo
Promptfoo is a tool for testing and evaluating LLM output quality. With promptfoo, you can build reliable prompts, models, and RAGs with benchmarks specific to your use-case, speed up evaluations with caching, concurrency, and live reloading, score outputs automatically by defining metrics, use as a CLI, library, or in CI/CD, and use OpenAI, Anthropic, Azure, Google, HuggingFace, open-source models like Llama, or integrate custom API providers for any LLM API.
trustgraph
TrustGraph is a tool that deploys private GraphRAG pipelines to build a RDF style knowledge graph from data, enabling accurate and secure `RAG` requests compatible with cloud LLMs and open-source SLMs. It showcases the reliability and efficiencies of GraphRAG algorithms, capturing contextual language flags missed in conventional RAG approaches. The tool offers features like PDF decoding, text chunking, inference of various LMs, RDF-aligned Knowledge Graph extraction, and more. TrustGraph is designed to be modular, supporting multiple Language Models and environments, with a plug'n'play architecture for easy customization.
llm-structured-output-benchmarks
Benchmark various LLM Structured Output frameworks like Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, LMFormatEnforcer, etc on tasks like multi-label classification, named entity recognition, synthetic data generation. The tool provides benchmark results, methodology, instructions to run the benchmark, add new data, and add a new framework. It also includes a roadmap for framework-related tasks, contribution guidelines, citation information, and feedback request.
Large-Language-Models-play-StarCraftII
Large Language Models Play StarCraft II is a project that explores the capabilities of large language models (LLMs) in playing the game StarCraft II. The project introduces TextStarCraft II, a textual environment for the game, and a Chain of Summarization method for analyzing game information and making strategic decisions. Through experiments, the project demonstrates that LLM agents can defeat the built-in AI at a challenging difficulty level. The project provides benchmarks and a summarization approach to enhance strategic planning and interpretability in StarCraft II gameplay.
Construction-Hazard-Detection
Construction-Hazard-Detection is an AI-driven tool focused on improving safety at construction sites by utilizing the YOLOv8 model for object detection. The system identifies potential hazards like overhead heavy loads and steel pipes, providing real-time analysis and warnings. Users can configure the system via a YAML file and run it using Docker. The primary dataset used for training is the Construction Site Safety Image Dataset enriched with additional annotations. The system logs are accessible within the Docker container for debugging, and notifications are sent through the LINE messaging API when hazards are detected.
thepipe
The Pipe is a multimodal-first tool for feeding files and web pages into vision-language models such as GPT-4V. It is best for LLM and RAG applications that require a deep understanding of tricky data sources. The Pipe is available as a hosted API at thepi.pe, or it can be set up locally.
pr-pilot
PR Pilot is an AI-powered tool designed to assist users in their daily workflow by delegating routine work to AI with confidence and predictability. It integrates seamlessly with popular development tools and allows users to interact with it through a Command-Line Interface, Python SDK, REST API, and Smart Workflows. Users can automate tasks such as generating PR titles and descriptions, summarizing and posting issues, and formatting README files. The tool aims to save time and enhance productivity by providing AI-powered solutions for common development tasks.
GenAIComps
GenAIComps is an initiative aimed at building enterprise-grade Generative AI applications using a microservice architecture. It simplifies the scaling and deployment process for production, abstracting away infrastructure complexities. GenAIComps provides a suite of containerized microservices that can be assembled into a mega-service tailored for real-world Enterprise AI applications. The modular approach of microservices allows for independent development, deployment, and scaling of individual components, promoting modularity, flexibility, and scalability. The mega-service orchestrates multiple microservices to deliver comprehensive solutions, encapsulating complex business logic and workflow orchestration. The gateway serves as the interface for users to access the mega-service, providing customized access based on user requirements.
stable-diffusion-webui
Stable Diffusion WebUI Docker Image allows users to run Automatic1111 WebUI in a docker container locally or in the cloud. The images do not bundle models or third-party configurations, requiring users to use a provisioning script for container configuration. It supports NVIDIA CUDA, AMD ROCm, and CPU platforms, with additional environment variables for customization and pre-configured templates for Vast.ai and Runpod.io. The service is password protected by default, with options for version pinning, startup flags, and service management using supervisorctl.
AutoGPTQ
AutoGPTQ is an easy-to-use LLM quantization package with user-friendly APIs, based on GPTQ algorithm (weight-only quantization). It provides a simple and efficient way to quantize large language models (LLMs) to reduce their size and computational cost while maintaining their performance. AutoGPTQ supports a wide range of LLM models, including GPT-2, GPT-J, OPT, and BLOOM. It also supports various evaluation tasks, such as language modeling, sequence classification, and text summarization. With AutoGPTQ, users can easily quantize their LLM models and deploy them on resource-constrained devices, such as mobile phones and embedded systems.
can-ai-code
Can AI Code is a self-evaluating interview tool for AI coding models. It includes interview questions written by humans and tests taken by AI, inference scripts for common API providers and CUDA-enabled quantization runtimes, a Docker-based sandbox environment for validating untrusted Python and NodeJS code, and the ability to evaluate the impact of prompting techniques and sampling parameters on large language model (LLM) coding performance. Users can also assess LLM coding performance degradation due to quantization. The tool provides test suites for evaluating LLM coding performance, a webapp for exploring results, and comparison scripts for evaluations. It supports multiple interviewers for API and CUDA runtimes, with detailed instructions on running the tool in different environments. The repository structure includes folders for interviews, prompts, parameters, evaluation scripts, comparison scripts, and more.
ABQ-LLM
ABQ-LLM is a novel arbitrary bit quantization scheme that achieves excellent performance under various quantization settings while enabling efficient arbitrary bit computation at the inference level. The algorithm supports precise weight-only quantization and weight-activation quantization. It provides pre-trained model weights and a set of out-of-the-box quantization operators for arbitrary bit model inference in modern architectures.
BodhiApp
Bodhi App runs Open Source Large Language Models locally, exposing LLM inference capabilities as OpenAI API compatible REST APIs. It leverages llama.cpp for GGUF format models and huggingface.co ecosystem for model downloads. Users can run fine-tuned models for chat completions, create custom aliases, and convert Huggingface models to GGUF format. The CLI offers commands for environment configuration, model management, pulling files, serving API, and more.
llm2sh
llm2sh is a command-line utility that leverages Large Language Models (LLMs) to translate plain-language requests into shell commands. It provides a convenient way to interact with your system using natural language. The tool supports multiple LLMs for command generation, offers a customizable configuration file, YOLO mode for running commands without confirmation, and is easily extensible with new LLMs and system prompts. Users can set up API keys for OpenAI, Claude, Groq, and Cerebras to use the tool effectively. llm2sh does not store user data or command history, and it does not record or send telemetry by itself, but the LLM APIs may collect and store requests and responses for their purposes.
comfyui
ComfyUI is a highly-configurable, cloud-first AI-Dock container that allows users to run ComfyUI without bundled models or third-party configurations. Users can configure the container using provisioning scripts. The Docker image supports NVIDIA CUDA, AMD ROCm, and CPU platforms, with version tags for different configurations. Additional environment variables and Python environments are provided for customization. ComfyUI service runs on port 8188 and can be managed using supervisorctl. The tool also includes an API wrapper service and pre-configured templates for Vast.ai. The author may receive compensation for services linked in the documentation.
For similar tasks
optillm
optillm is an OpenAI API compatible optimizing inference proxy implementing state-of-the-art techniques to enhance accuracy and performance of LLMs, focusing on reasoning over coding, logical, and mathematical queries. By leveraging additional compute at inference time, it surpasses frontier models across diverse tasks.
Awesome-Model-Merging-Methods-Theories-Applications
A comprehensive repository focusing on 'Model Merging in LLMs, MLLMs, and Beyond', providing an exhaustive overview of model merging methods, theories, applications, and future research directions. The repository covers various advanced methods, applications in foundation models, different machine learning subfields, and tasks like pre-merging methods, architecture transformation, weight alignment, basic merging methods, and more.
llm-structured-output
This repository contains a library for constraining LLM generation to structured output, enforcing a JSON schema for precise data types and property names. It includes an acceptor/state machine framework, JSON acceptor, and JSON schema acceptor for guiding decoding in LLMs. The library provides reference implementations using Apple's MLX library and examples for function calling tasks. The tool aims to improve LLM output quality by ensuring adherence to a schema, reducing unnecessary output, and enhancing performance through pre-emptive decoding. Evaluations show performance benchmarks and comparisons with and without schema constraints.
Step-DPO
Step-DPO is a method for enhancing long-chain reasoning ability of LLMs with a data construction pipeline creating a high-quality dataset. It significantly improves performance on math and GSM8K tasks with minimal data and training steps. The tool fine-tunes pre-trained models like Qwen2-7B-Instruct with Step-DPO, achieving superior results compared to other models. It provides scripts for training, evaluation, and deployment, along with examples and acknowledgements.
ChatPDF
ChatPDF is a knowledge question and answer retrieval tool based on local LLM. It supports various open-source LLM models like ChatGLM3-6b, Chinese-LLaMA-Alpaca-2, Baichuan, YI, and multiple file formats including PDF, docx, markdown, txt. The tool optimizes RAG accuracy, Chinese chunk segmentation, embedding using text2vec's sentence embedding, retrieval matching with rank_BM25, and introduces reranker module for reranking candidate sets. It also enhances candidate chunk extension context, supports custom RAG models, and provides a Gradio-based RAG conversation page for seamless dialogue.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.