rl
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
Stars: 2458
TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. It provides PyTorch and **python-first**, low and high level abstractions for RL that are intended to be **efficient**, **modular**, **documented** and properly **tested**. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them or write new ones with little effort.
README:
TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch.
- Python-first: Designed with Python as the primary language for ease of use and flexibility
- Efficient: Optimized for performance to support demanding RL research applications
- Modular, customizable, extensible: Highly modular architecture allows for easy swapping, transformation, or creation of new components
- Documented: Thorough documentation ensures that users can quickly understand and utilize the library
- Tested: Rigorously tested to ensure reliability and stability
- Reusable functionals: Provides a set of highly reusable functions for cost functions, returns, and data processing
- Aligns with PyTorch ecosystem: Follows the structure and conventions of popular PyTorch libraries (e.g., dataset pillar, transforms, models, data utilities)
- Minimal dependencies: Only requires Python standard library, NumPy, and PyTorch; optional dependencies for common environment libraries (e.g., OpenAI Gym) and datasets (D4RL, OpenX...)
Read the full paper for a more curated description of the library.
Check our Getting Started tutorials to quickly ramp up with the basic features of the library!
The TorchRL documentation can be found here. It contains tutorials and the API reference.
TorchRL also provides a RL knowledge base to help you debug your code, or simply learn the basics of RL. Check it out here.
We have some introductory videos for you to get to know the library better, check them out:
TorchRL being domain-agnostic, you can use it across many different fields. Here are a few examples:
- ACEGEN: Reinforcement Learning of Generative Chemical Agents for Drug Discovery
- BenchMARL: Benchmarking Multi-Agent Reinforcement Learning
- BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO
- OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control
- RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark
- Robohive: A unified framework for robot learning
RL algorithms are very heterogeneous, and it can be hard to recycle a codebase
across settings (e.g. from online to offline, from state-based to pixel-based
learning).
TorchRL solves this problem through TensorDict, a convenient data structure(1) that can be used to streamline one's RL codebase.
With this tool, one can write a complete PPO training script in less than 100 lines of code!
import torch
from tensordict.nn import TensorDictModule
from tensordict.nn.distributions import NormalParamExtractor
from torch import nn

from torchrl.collectors import SyncDataCollector
from torchrl.data.replay_buffers import TensorDictReplayBuffer, \
    LazyTensorStorage, SamplerWithoutReplacement
from torchrl.envs.libs.gym import GymEnv
from torchrl.modules import ProbabilisticActor, ValueOperator, TanhNormal
from torchrl.objectives import ClipPPOLoss
from torchrl.objectives.value import GAE

env = GymEnv("Pendulum-v1")
model = TensorDictModule(
    nn.Sequential(
        nn.Linear(3, 128), nn.Tanh(),
        nn.Linear(128, 128), nn.Tanh(),
        nn.Linear(128, 128), nn.Tanh(),
        nn.Linear(128, 2),
        NormalParamExtractor()
    ),
    in_keys=["observation"],
    out_keys=["loc", "scale"]
)
critic = ValueOperator(
    nn.Sequential(
        nn.Linear(3, 128), nn.Tanh(),
        nn.Linear(128, 128), nn.Tanh(),
        nn.Linear(128, 128), nn.Tanh(),
        nn.Linear(128, 1),
    ),
    in_keys=["observation"],
)
actor = ProbabilisticActor(
    model,
    in_keys=["loc", "scale"],
    distribution_class=TanhNormal,
    distribution_kwargs={"low": -1.0, "high": 1.0},
    return_log_prob=True
)
buffer = TensorDictReplayBuffer(
    storage=LazyTensorStorage(1000),
    sampler=SamplerWithoutReplacement(),
    batch_size=50,
)
collector = SyncDataCollector(
    env,
    actor,
    frames_per_batch=1000,
    total_frames=1_000_000,
)
loss_fn = ClipPPOLoss(actor, critic)
adv_fn = GAE(value_network=critic, average_gae=True, gamma=0.99, lmbda=0.95)
optim = torch.optim.Adam(loss_fn.parameters(), lr=2e-4)

for data in collector:  # collect data
    for epoch in range(10):
        adv_fn(data)  # compute advantage
        buffer.extend(data)
        for sample in buffer:  # consume data
            loss_vals = loss_fn(sample)
            loss_val = sum(
                value for key, value in loss_vals.items() if
                key.startswith("loss")
            )
            loss_val.backward()
            optim.step()
            optim.zero_grad()
    print(f"avg reward: {data['next', 'reward'].mean().item(): 4.4f}")
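The script above delegates the policy update to ClipPPOLoss. As a rough illustration of the clipped surrogate objective that PPO is built on, here is a plain-Python sketch for a single sample. The function name `clipped_surrogate` and the `clip_epsilon=0.2` default are assumptions made for this example; this is not TorchRL's implementation, which operates on batched tensors.

```python
import math


def clipped_surrogate(log_prob_new, log_prob_old, advantage, clip_epsilon=0.2):
    """Per-sample PPO clipped objective (a quantity to maximize).

    Hypothetical helper for illustration only; clip_epsilon=0.2 is an
    assumed, commonly used value.
    """
    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = math.exp(log_prob_new - log_prob_old)
    # Keep the ratio inside [1 - eps, 1 + eps].
    clipped_ratio = max(1.0 - clip_epsilon, min(ratio, 1.0 + clip_epsilon))
    # Take the pessimistic (smaller) of the two candidate objectives.
    return min(ratio * advantage, clipped_ratio * advantage)


# Positive advantage: the ratio exp(0.5) exceeds 1.2, so the gain is capped.
capped_gain = clipped_surrogate(log_prob_new=-0.5, log_prob_old=-1.0, advantage=2.0)
# Negative advantage: the unclipped term is smaller, so it is kept.
uncapped_gain = clipped_surrogate(log_prob_new=-0.5, log_prob_old=-1.0, advantage=-1.0)
```

The loss minimized in practice is the negative of this quantity averaged over the batch, usually combined with value and entropy terms.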
Here is an example of how the environment API relies on tensordict to carry data from one function to another during a rollout execution:
TensorDict makes it easy to re-use pieces of code across environments, models and algorithms.
For instance, here's how to code a rollout in TorchRL:
- obs, done = env.reset()
+ tensordict = env.reset()
policy = SafeModule(
    model,
    in_keys=["observation_pixels", "observation_vector"],
    out_keys=["action"],
)
out = []
for i in range(n_steps):
-     action, log_prob = policy(obs)
-     next_obs, reward, done, info = env.step(action)
-     out.append((obs, next_obs, action, log_prob, reward, done))
-     obs = next_obs
+     tensordict = policy(tensordict)
+     tensordict = env.step(tensordict)
+     out.append(tensordict)
+     tensordict = step_mdp(tensordict)  # renames next_observation_* keys to observation_*
- obs, next_obs, action, log_prob, reward, done = [torch.stack(vals, 0) for vals in zip(*out)]
+ out = torch.stack(out, 0)  # TensorDict supports multiple tensor operations
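To make the role of step_mdp in the rollout above more concrete, here is a minimal sketch using plain dictionaries instead of TensorDict. It assumes the post-step data lives under a nested "next" entry (the layout suggested by `data['next', 'reward']` in the PPO script); `step_mdp_sketch` is a hypothetical name, and the real function offers more options than shown here.

```python
def step_mdp_sketch(transition):
    """Build the input of the next step from the output of an env step.

    `transition` is a plain dict whose "next" entry holds the post-step
    observation, reward and done flag (an assumed layout for this sketch).
    """
    next_data = transition["next"]
    # Promote the post-step entries to the root. The reward and the previous
    # action are dropped because the policy does not need them to act.
    return {key: value for key, value in next_data.items() if key != "reward"}


transition = {
    "observation": [0.0, 1.0],
    "action": [0.3],
    "next": {"observation": [0.1, 0.9], "reward": 1.0, "done": False},
}
carry = step_mdp_sketch(transition)
```

The full transition is what gets appended to the rollout, while only the promoted entries are carried into the next call to the policy.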
Using this, TorchRL abstracts away the input / output signatures of the modules, env, collectors, replay buffers and losses of the library, allowing all primitives to be easily recycled across settings.
Here's another example of an off-policy training loop in TorchRL (assuming that a data collector, a replay buffer, a loss and an optimizer have been instantiated):
- for i, (obs, next_obs, action, hidden_state, reward, done) in enumerate(collector):
+ for i, tensordict in enumerate(collector):
-     replay_buffer.add((obs, next_obs, action, hidden_state, reward, done))
+     replay_buffer.add(tensordict)
    for j in range(num_optim_steps):
-         obs, next_obs, action, hidden_state, reward, done = replay_buffer.sample(batch_size)
-         loss = loss_fn(obs, next_obs, action, hidden_state, reward, done)
+         tensordict = replay_buffer.sample(batch_size)
+         loss = loss_fn(tensordict)
        loss.backward()
        optim.step()
        optim.zero_grad()
This training loop can be re-used across algorithms as it makes a minimal number of assumptions about the structure of the data.
TensorDict supports many tensor operations on its device and shape (the shape of a TensorDict, or its batch size, is given by the first N dimensions shared by all the tensors it contains, for an arbitrary N):
# stack and cat
tensordict = torch.stack(list_of_tensordicts, 0)
tensordict = torch.cat(list_of_tensordicts, 0)
# reshape
tensordict = tensordict.view(-1)
tensordict = tensordict.permute(0, 2, 1)
tensordict = tensordict.unsqueeze(-1)
tensordict = tensordict.squeeze(-1)
# indexing
tensordict = tensordict[:2]
tensordict[:, 2] = sub_tensordict
# device and memory location
tensordict.cuda()
tensordict.to("cuda:1")
tensordict.share_memory_()
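The notion of batch size described above can be illustrated without any tensor library. The sketch below computes the longest run of leading dimensions shared by a set of shapes; `common_leading_shape` is a hypothetical helper, and it is assumed here that any prefix of its result would also be an admissible batch size.

```python
def common_leading_shape(shapes):
    """Longest prefix of dimensions shared by every shape in `shapes`."""
    prefix = []
    # zip stops at the shortest shape, so the prefix never exceeds it.
    for dims in zip(*shapes):
        if len(set(dims)) != 1:
            break
        prefix.append(dims[0])
    return prefix


# 4 environments, 20 steps, with per-entry feature dimensions after that.
shapes = {"observation": (4, 20, 3), "action": (4, 20, 1), "reward": (4, 20)}
batch_size = common_leading_shape(shapes.values())
```

Operations such as stacking, indexing or reshaping then act on these shared leading dimensions and leave the per-entry feature dimensions untouched.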
TensorDict comes with a dedicated tensordict.nn module that contains everything you might need to write your model with it.
And it is functorch and torch.compile compatible!
transformer_model = nn.Transformer(nhead=16, num_encoder_layers=12)
+ td_module = SafeModule(transformer_model, in_keys=["src", "tgt"], out_keys=["out"])
src = torch.rand((10, 32, 512))
tgt = torch.rand((20, 32, 512))
+ tensordict = TensorDict({"src": src, "tgt": tgt}, batch_size=[20, 32])
- out = transformer_model(src, tgt)
+ td_module(tensordict)
+ out = tensordict["out"]
The TensorDictSequential class makes it possible to chain sequences of nn.Module instances in a highly modular way.
For instance, here is an implementation of a transformer using the encoder and decoder blocks:
encoder_module = TransformerEncoder(...)
encoder = TensorDictModule(encoder_module, in_keys=["src", "src_mask"], out_keys=["memory"])
decoder_module = TransformerDecoder(...)
decoder = TensorDictModule(decoder_module, in_keys=["tgt", "memory"], out_keys=["output"])
transformer = TensorDictSequential(encoder, decoder)
assert transformer.in_keys == ["src", "src_mask", "tgt"]
assert transformer.out_keys == ["memory", "output"]
TensorDictSequential makes it possible to isolate subgraphs by querying a set of desired input / output keys:
transformer.select_subsequence(out_keys=["memory"]) # returns the encoder
transformer.select_subsequence(in_keys=["tgt", "memory"]) # returns the decoder
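The way a key-based sequence derives its own inputs and outputs can be sketched with plain dictionaries and functions. `KeyedModule` and `KeyedSequential` below are hypothetical stand-ins written for this illustration, not the tensordict.nn classes, and the toy lambdas replace the actual encoder and decoder.

```python
class KeyedModule:
    """Wrap a function so that it reads and writes named entries of a dict."""

    def __init__(self, fn, in_keys, out_keys):
        self.fn, self.in_keys, self.out_keys = fn, list(in_keys), list(out_keys)

    def __call__(self, data):
        outputs = self.fn(*(data[key] for key in self.in_keys))
        if len(self.out_keys) == 1:
            outputs = (outputs,)
        data.update(zip(self.out_keys, outputs))
        return data


class KeyedSequential:
    """Chain keyed modules and infer the keys of the whole chain."""

    def __init__(self, *modules):
        self.modules = modules
        self.in_keys, self.out_keys = [], []
        for module in modules:
            # A key is an external input only if no earlier module produced it.
            for key in module.in_keys:
                if key not in self.out_keys and key not in self.in_keys:
                    self.in_keys.append(key)
            for key in module.out_keys:
                if key not in self.out_keys:
                    self.out_keys.append(key)

    def __call__(self, data):
        for module in self.modules:
            data = module(data)
        return data


encoder = KeyedModule(lambda src, mask: src * mask, ["src", "src_mask"], ["memory"])
decoder = KeyedModule(lambda tgt, memory: tgt + memory, ["tgt", "memory"], ["output"])
transformer = KeyedSequential(encoder, decoder)
result = transformer({"src": 2.0, "src_mask": 1.0, "tgt": 3.0})
```

Because "memory" is produced by the encoder, it is not listed among the inputs of the chain, which matches the assertions shown above for the real classes.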
Check TensorDict tutorials to learn more!
- A common interface for environments which supports common libraries (OpenAI gym, deepmind control lab, etc.)(1) and state-less execution (e.g. Model-based environments). The batched environments containers allow parallel execution(2). A common PyTorch-first tensor-specification class is also provided. TorchRL's environments API is simple but stringent and specific. Check the documentation and tutorial to learn more!
env_make = lambda: GymEnv("Pendulum-v1", from_pixels=True)
env_parallel = ParallelEnv(4, env_make)  # creates 4 envs in parallel
tensordict = env_parallel.rollout(max_steps=20, policy=None)  # random rollout (no policy given)
assert tensordict.shape == [4, 20]  # 4 envs, 20 steps rollout
env_parallel.action_spec.is_in(tensordict["action"])  # spec check returns True
- multiprocess and distributed data collectors(2) that work synchronously or asynchronously. Through the use of TensorDict, TorchRL's training loops are made very similar to regular training loops in supervised learning (although the "dataloader" -- read data collector -- is modified on-the-fly):
env_make = lambda: GymEnv("Pendulum-v1", from_pixels=True)
collector = MultiaSyncDataCollector(
    [env_make, env_make],
    policy=policy,
    devices=["cuda:0", "cuda:0"],
    total_frames=10000,
    frames_per_batch=50,
    ...
)
for i, tensordict_data in enumerate(collector):
    loss = loss_module(tensordict_data)
    loss.backward()
    optim.step()
    optim.zero_grad()
    collector.update_policy_weights_()
Check our distributed collector examples to learn more about ultra-fast data collection with TorchRL.
- efficient(2) and generic(1) replay buffers with modularized storage:
storage = LazyMemmapStorage(  # memory-mapped (physical) storage
    cfg.buffer_size,
    scratch_dir="/tmp/"
)
buffer = TensorDictPrioritizedReplayBuffer(
    alpha=0.7,
    beta=0.5,
    collate_fn=lambda x: x,
    pin_memory=device != torch.device("cpu"),
    prefetch=10,  # multi-threaded sampling
    storage=storage
)
Replay buffers are also offered as wrappers around common datasets for offline RL:
from torchrl.data.replay_buffers import SamplerWithoutReplacement
from torchrl.data.datasets.d4rl import D4RLExperienceReplay
data = D4RLExperienceReplay(
    "maze2d-open-v0",
    split_trajs=True,
    batch_size=128,
    sampler=SamplerWithoutReplacement(drop_last=True),
)
for sample in data:  # or alternatively sample = data.sample()
    fun(sample)
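The alpha and beta arguments passed to the prioritized buffer above are the usual exponents of proportional prioritized experience replay. The following plain-Python sketch shows what they typically control, under the assumption that priorities are positive numbers; `sample_prioritized` is a hypothetical helper and does not reflect how TorchRL implements sampling internally.

```python
import random


def sample_prioritized(priorities, batch_size, alpha=0.7, beta=0.5, rng=random):
    """Return sampled indices and their normalized importance weights.

    Illustrative sketch: alpha sharpens or flattens the sampling
    distribution, beta controls how strongly the bias is corrected.
    """
    scaled = [priority ** alpha for priority in priorities]
    total = sum(scaled)
    probs = [value / total for value in scaled]
    indices = rng.choices(range(len(priorities)), weights=probs, k=batch_size)
    # Importance weights compensate for the non-uniform sampling.
    n = len(priorities)
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_weight = max(weights)
    return indices, [weight / max_weight for weight in weights]


indices, weights = sample_prioritized(
    [0.1, 1.0, 5.0, 0.5], batch_size=3, rng=random.Random(0)
)
```

With alpha set to 0 every item is equally likely, and with beta set to 0 all importance weights equal 1, which recovers a uniform buffer.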
- cross-library environment transforms(1), executed on device and in a vectorized fashion(2), which process and prepare the data coming out of the environments to be used by the agent:
env_make = lambda: GymEnv("Pendulum-v1", from_pixels=True)
env_base = ParallelEnv(4, env_make, device="cuda:0")  # creates 4 envs in parallel
env = TransformedEnv(
    env_base,
    Compose(
        ToTensorImage(),
        ObservationNorm(loc=0.5, scale=1.0)),  # executes the transforms once and on device
)
tensordict = env.reset()
assert tensordict.device == torch.device("cuda:0")
Other transforms include: reward scaling (RewardScaling), shape operations (concatenation of tensors, unsqueezing etc.), concatenation of successive operations (CatFrames), resizing (Resize) and many more.
Unlike other libraries, the transforms are stacked as a list (and not wrapped in each other), which makes it easy to add and remove them at will:
env.insert_transform(0, NoopResetEnv())  # inserts the NoopResetEnv transform at the index 0
Nevertheless, transforms can access and execute operations on the parent environment:
transform = env.transform[1]  # gathers the second transform of the list
parent_env = transform.parent  # returns the base environment of the second transform, i.e. the base env + the first transform
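The benefit of keeping transforms in a flat list rather than nesting wrappers can be sketched in a few lines of plain Python. `TransformPipeline`, `scale_reward` and `clip_reward` are hypothetical names used only for this illustration; they are not part of TorchRL.

```python
class TransformPipeline:
    """Transforms stored in a flat list and applied in order."""

    def __init__(self, *transforms):
        self.transforms = list(transforms)

    def insert(self, index, transform):
        # Adding a step is a list operation, no re-wrapping needed.
        self.transforms.insert(index, transform)

    def __call__(self, value):
        for transform in self.transforms:
            value = transform(value)
        return value


def scale_reward(reward):
    return 0.1 * reward


def clip_reward(reward):
    return max(-1.0, min(1.0, reward))


pipeline = TransformPipeline(scale_reward)
pipeline.insert(0, clip_reward)  # clipping now runs before scaling
processed = pipeline(25.0)
```

With nested wrappers, inserting a step at the front would require unwrapping and rebuilding the chain, whereas here it is a single list insertion.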
- various tools for distributed learning (e.g. memory mapped tensors)(2);
- various architectures and models (e.g. actor-critic)(1):
# create an nn.Module
common_module = ConvNet(
    bias_last_layer=True,
    depth=None,
    num_cells=[32, 64, 64],
    kernel_sizes=[8, 4, 3],
    strides=[4, 2, 1],
)
# Wrap it in a SafeModule, indicating what key to read in and where to
# write out the output
common_module = SafeModule(
    common_module,
    in_keys=["pixels"],
    out_keys=["hidden"],
)
# Wrap the policy module in NormalParamsWrapper, such that the output
# tensor is split in loc and scale, and scale is mapped onto a positive space
policy_module = SafeModule(
    NormalParamsWrapper(
        MLP(num_cells=[64, 64], out_features=32, activation=nn.ELU)
    ),
    in_keys=["hidden"],
    out_keys=["loc", "scale"],
)
# Use a SafeProbabilisticTensorDictSequential to combine the SafeModule with a
# SafeProbabilisticModule, indicating how to build the
# torch.distribution.Distribution object and what to do with it
policy_module = SafeProbabilisticTensorDictSequential(  # stochastic policy
    policy_module,
    SafeProbabilisticModule(
        in_keys=["loc", "scale"],
        out_keys="action",
        distribution_class=TanhNormal,
    ),
)
value_module = MLP(
    num_cells=[64, 64],
    out_features=1,
    activation=nn.ELU,
)
# Wrap the policy and value function in a common module
actor_value = ActorValueOperator(common_module, policy_module, value_module)
# standalone policy from this
standalone_policy = actor_value.get_policy_operator()
- exploration wrappers and modules to easily swap between exploration and exploitation(1):
policy_explore = EGreedyWrapper(policy)
with set_exploration_type(ExplorationType.RANDOM):
    tensordict = policy_explore(tensordict)  # will use eps-greedy
with set_exploration_type(ExplorationType.DETERMINISTIC):
    tensordict = policy_explore(tensordict)  # will not use eps-greedy
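For readers unfamiliar with epsilon-greedy exploration, the switch between the two modes shown above can be sketched as follows. `egreedy_action` and its `explore` flag are hypothetical and stand in for the wrapper and the exploration-type context manager; the sketch works on a plain list of action values.

```python
import random


def egreedy_action(q_values, eps, explore, rng=random):
    """Pick an action index from a list of action values.

    With `explore` on, a uniformly random action is taken with
    probability `eps`; otherwise the greedy action is returned.
    """
    greedy = max(range(len(q_values)), key=q_values.__getitem__)
    if explore and rng.random() < eps:
        return rng.randrange(len(q_values))
    return greedy


q_values = [0.1, 0.7, 0.2]
# Exploration switched off: always the highest-valued action.
deterministic = egreedy_action(q_values, eps=0.5, explore=False)
# eps=1.0 with exploration on: the action is always drawn at random.
always_random = egreedy_action(q_values, eps=1.0, explore=True, rng=random.Random(0))
```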
- A series of efficient loss modules and highly vectorized functional return and advantage computation.
from torchrl.objectives import DQNLoss
loss_module = DQNLoss(value_network=value_network, gamma=0.99)
tensordict = replay_buffer.sample(batch_size)
loss = loss_module(tensordict)

from torchrl.objectives.value.functional import vec_td_lambda_return_estimate
advantage = vec_td_lambda_return_estimate(gamma, lmbda, next_state_value, reward, done, terminated)
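As a reference point for what a TD(lambda) return estimate computes, here is a non-vectorized sketch over a single trajectory stored in Python lists. `td_lambda_returns` is a hypothetical helper; it uses one `terminated` flag per step and ignores the done/terminated distinction made by the library function above, so it should be read as a simplified illustration rather than an equivalent.

```python
def td_lambda_returns(rewards, next_values, terminated, gamma=0.99, lmbda=0.95):
    """Backward recursion for lambda-returns on one trajectory.

    G_t = r_t + gamma * (1 - terminated_t)
              * ((1 - lmbda) * V(s_{t+1}) + lmbda * G_{t+1})
    """
    returns = [0.0] * len(rewards)
    next_return = next_values[-1]  # bootstrap from the last value estimate
    for t in reversed(range(len(rewards))):
        not_terminal = 0.0 if terminated[t] else 1.0
        # Blend the one-step bootstrap with the longer-horizon return.
        blended = (1.0 - lmbda) * next_values[t] + lmbda * next_return
        returns[t] = rewards[t] + gamma * not_terminal * blended
        next_return = returns[t]
    return returns


# lmbda=1 and gamma=1 with zero values gives plain reward-to-go.
monte_carlo = td_lambda_returns(
    [1.0, 1.0, 1.0], [0.0, 0.0, 0.0], [False, False, False], gamma=1.0, lmbda=1.0
)
# lmbda=0 gives the one-step TD target r + gamma * V(s').
one_step = td_lambda_returns(
    [1.0, 1.0], [10.0, 20.0], [False, False], gamma=0.5, lmbda=0.0
)
```

A vectorized version removes the Python loop over time steps, which is where most of the speed-up of the functional API would come from.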
- a generic trainer class(1) that executes the aforementioned training loop. Through a hooking mechanism, it also supports any logging or data transformation operation at any given time.
- various recipes to build models that correspond to the environment being deployed.
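The hooking mechanism mentioned for the trainer class can be pictured with a small plain-Python loop. Everything in this sketch is hypothetical, including the `HookedLoop` class and the stage names "pre_optim" and "post_optim"; it only illustrates the idea of registering callbacks for data transformation and logging.

```python
from collections import defaultdict


class HookedLoop:
    """Minimal training loop with named hook stages."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, stage, fn):
        self._hooks[stage].append(fn)

    def _run(self, stage, payload):
        for fn in self._hooks[stage]:
            out = fn(payload)
            # A hook may transform the payload or just observe it.
            if out is not None:
                payload = out
        return payload

    def train(self, batches, optim_step):
        for batch in batches:
            batch = self._run("pre_optim", batch)  # data transformation
            loss = optim_step(batch)
            self._run("post_optim", {"loss": loss})  # logging


logged = []
loop = HookedLoop()
loop.register("pre_optim", lambda batch: [x / 10 for x in batch])
loop.register("post_optim", lambda info: logged.append(info["loss"]))
loop.train([[10, 20], [30]], optim_step=sum)
```

Keeping logging and preprocessing in hooks leaves the loop itself unchanged when such operations are added or removed.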
If you feel a feature is missing from the library, please submit an issue! If you would like to contribute to new features, check our call for contributions and our contribution page.
A series of State-of-the-Art implementations are provided for illustrative purposes:
Algorithm | Compile Support** | Tensordict-free API | Modular Losses | Continuous and Discrete |
DQN | 1.9x | + | NA | + (through ActionDiscretizer transform) |
DDPG | 1.87x | + | + | - (continuous only) |
IQL | 3.22x | + | + | + |
CQL | 2.68x | + | + | + |
TD3 | 2.27x | + | + | - (continuous only) |
TD3+BC | untested | + | + | - (continuous only) |
A2C | 2.67x | + | - | + |
PPO | 2.42x | + | - | + |
SAC | 2.62x | + | - | + |
REDQ | 2.28x | + | - | - (continuous only) |
Dreamer v1 | untested | + | + (different classes) | - (continuous only) |
Decision Transformers | untested | + | NA | - (continuous only) |
CrossQ | untested | + | + | - (continuous only) |
Gail | untested | + | NA | + |
Impala | untested | + | - | + |
IQL (MARL) | untested | + | + | + |
DDPG (MARL) | untested | + | + | - (continuous only) |
PPO (MARL) | untested | + | - | + |
QMIX-VDN (MARL) | untested | + | NA | + |
SAC (MARL) | untested | + | - | + |
RLHF | NA | + | NA | NA |
** The number indicates expected speed-up compared to eager mode when executed on CPU. Numbers may vary depending on architecture and device.
and many more to come!
Code examples displaying toy code snippets and training scripts are also available.
Check the examples directory for more details about handling the various configuration settings.
We also provide tutorials and demos that give a sense of what the library can do.
If you're using TorchRL, please refer to this BibTeX entry to cite this work:
@misc{bou2023torchrl,
title={TorchRL: A data-driven decision-making library for PyTorch},
author={Albert Bou and Matteo Bettini and Sebastian Dittert and Vikash Kumar and Shagun Sodhani and Xiaomeng Yang and Gianni De Fabritiis and Vincent Moens},
year={2023},
eprint={2306.00577},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Create a conda environment where the packages will be installed.
conda create --name torch_rl python=3.9
conda activate torch_rl
PyTorch
Depending on the use of functorch that you want to make, you may want to install the latest (nightly) PyTorch release or the latest stable version of PyTorch.
See here for a detailed list of commands, including pip3 or other special installation instructions.
Torchrl
You can install the latest stable release by using
pip3 install torchrl
This should work on linux, Windows 10 and OsX (Intel or Silicon chips). On certain Windows machines (Windows 11), one should install the library locally (see below).
The nightly build can be installed via
pip3 install torchrl-nightly
which we currently only ship for Linux and OsX (Intel) machines. Importantly, the nightly builds require the nightly builds of PyTorch too.
To install extra dependencies, call
pip3 install "torchrl[atari,dm_control,gym_continuous,rendering,tests,utils,marl,open_spiel,checkpointing]"
or a subset of these.
One may also desire to install the library locally. Three main reasons can motivate this:
- the nightly/stable release isn't available for one's platform (e.g., Windows 11, nightlies for Apple Silicon etc.);
- contributing to the code;
- installing torchrl with a previous version of PyTorch (any version >= 2.0). Note that this should also be doable via a regular install followed by a downgrade to a previous PyTorch version, but the C++ binaries will not be available, so some features will not work, such as prioritized replay buffers and the like.
To install the library locally, start by cloning the repo:
git clone https://github.com/pytorch/rl
and don't forget to check out the branch or tag you want to use for the build:
git checkout v0.4.0
Go to the directory where you have cloned the torchrl repo and install it (after installing ninja):
cd /path/to/torchrl/
pip3 install ninja -U
python setup.py develop
One can also build the wheels to distribute to co-workers using
python setup.py bdist_wheel
Your wheels will be stored in ./dist/torchrl<name>.whl
and can be installed via
pip install torchrl<name>.whl
Warning: Unfortunately, pip3 install -e . does not currently work. Contributions to help fix this are welcome!
On M1 machines, this should work out-of-the-box with the nightly build of PyTorch.
If the generation of this artifact on macOS M1 doesn't work correctly, or if the message
(mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))
appears during execution, then try
ARCHFLAGS="-arch arm64" python setup.py develop
To run a quick sanity check, leave that directory (e.g. by executing cd ~/) and try to import the library.
python -c "import torchrl"
This should not return any warning or error.
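If a scripted check is preferred, the same sanity test can be written in Python. The sketch below is only a suggestion: `is_importable` is a hypothetical helper, and the second module name is the C++ extension mentioned in the troubleshooting notes, which is assumed here to be discoverable like any other submodule.

```python
import importlib.util


def is_importable(name):
    """Return True if `name` can be located by the import system."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package of a dotted name cannot be imported.
        return False


# Report the package and its compiled extension separately.
status = {name: is_importable(name) for name in ("torchrl", "torchrl._torchrl")}
```

A result where the package is found but the extension is not would point to the C++ binaries not having been built or installed.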
Optional dependencies
The following libraries can be installed depending on the usage one wants to make of torchrl:
# diverse
pip3 install tqdm tensorboard "hydra-core>=1.1" hydra-submitit-launcher
# rendering
pip3 install "moviepy<2.0.0"
# deepmind control suite
pip3 install dm_control
# gym, atari games
pip3 install "gym[atari]" "gym[accept-rom-license]" pygame
# tests
pip3 install pytest pyyaml pytest-instafail
# tensorboard
pip3 install tensorboard
# wandb
pip3 install wandb
Troubleshooting
If a ModuleNotFoundError: No module named 'torchrl._torchrl' error occurs (or a warning indicating that the C++ binaries could not be loaded), it means that the C++ extensions were not installed or not found.
- One common reason might be that you are trying to import torchrl from within the git repo location. The following code snippet should return an error if torchrl has not been installed in develop mode:
  cd ~/path/to/rl/repo
  python -c 'from torchrl.envs.libs.gym import GymEnv'
  If this is the case, consider executing torchrl from another location.
- If you're not importing torchrl from within its repo location, it could be caused by a problem during the local installation. Check the log after running python setup.py develop. One common cause is a g++/C++ version discrepancy and/or a problem with the ninja library.
- If the problem persists, feel free to open an issue on the topic in the repo, we'll do our best to help!
- On MacOs, we recommend installing XCode first. With Apple Silicon M1 chips, make sure you are using the arm64-built python (e.g. here). Running the following lines of code
  wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
  python collect_env.py
  should display
  OS: macOS *** (arm64)
  and not
  OS: macOS **** (x86_64)
Versioning issues can cause error messages of the type undefined symbol and such. For these, refer to the versioning issues document for a complete explanation and proposed workarounds.
If you spot a bug in the library, please raise an issue in this repo.
If you have a more generic question regarding RL in PyTorch, post it on the PyTorch forum.
Internal collaborations to torchrl are welcome! Feel free to fork, submit issues and PRs. You can check out the detailed contribution guide here. As mentioned above, a list of open contributions can be found here.
Contributors are recommended to install pre-commit hooks (using pre-commit install). pre-commit will check for linting related issues when the code is committed locally. You can disable this check by appending -n to your commit command: git commit -m <commit message> -n
This library is released as a PyTorch beta feature. BC-breaking changes are likely to happen but they will be introduced with a deprecation warranty after a few release cycles.
TorchRL is licensed under the MIT License. See LICENSE for details.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for rl
Similar Open Source Tools
rl
TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. It provides pytorch and **python-first** , low and high level abstractions for RL that are intended to be **efficient** , **modular** , **documented** and properly **tested**. The code is aimed at supporting research in RL. Most of it is written in python in a highly modular way, such that researchers can easily swap components, transform them or write new ones with little effort.
continuous-eval
Open-Source Evaluation for LLM Applications. `continuous-eval` is an open-source package created for granular and holistic evaluation of GenAI application pipelines. It offers modularized evaluation, a comprehensive metric library covering various LLM use cases, the ability to leverage user feedback in evaluation, and synthetic dataset generation for testing pipelines. Users can define their own metrics by extending the Metric class. The tool allows running evaluation on a pipeline defined with modules and corresponding metrics. Additionally, it provides synthetic data generation capabilities to create user interaction data for evaluation or training purposes.
litdata
LitData is a tool designed for blazingly fast, distributed streaming of training data from any cloud storage. It allows users to transform and optimize data in cloud storage environments efficiently and intuitively, supporting various data types like images, text, video, audio, geo-spatial, and multimodal data. LitData integrates smoothly with frameworks such as LitGPT and PyTorch, enabling seamless streaming of data to multiple machines. Key features include multi-GPU/multi-node support, easy data mixing, pause & resume functionality, support for profiling, memory footprint reduction, cache size configuration, and on-prem optimizations. The tool also provides benchmarks for measuring streaming speed and conversion efficiency, along with runnable templates for different data types. LitData enables infinite cloud data processing by utilizing the Lightning.ai platform to scale data processing with optimized machines.
zeta
Zeta is a tool designed to build state-of-the-art AI models faster by providing modular, high-performance, and scalable building blocks. It addresses the common issues faced while working with neural nets, such as chaotic codebases, lack of modularity, and low performance modules. Zeta emphasizes usability, modularity, and performance, and is currently used in hundreds of models across various GitHub repositories. It enables users to prototype, train, optimize, and deploy the latest SOTA neural nets into production. The tool offers various modules like FlashAttention, SwiGLUStacked, RelativePositionBias, FeedForward, BitLinear, PalmE, Unet, VisionEmbeddings, niva, FusedDenseGELUDense, FusedDropoutLayerNorm, MambaBlock, Film, hyper_optimize, DPO, and ZetaCloud for different tasks in AI model development.
LightRAG
LightRAG is a PyTorch library designed for building and optimizing Retriever-Agent-Generator (RAG) pipelines. It follows principles of simplicity, quality, and optimization, offering developers maximum customizability with minimal abstraction. The library includes components for model interaction, output parsing, and structured data generation. LightRAG facilitates tasks like providing explanations and examples for concepts through a question-answering pipeline.
clarifai-python-grpc
This is the official Clarifai gRPC Python client for interacting with their recognition API. Clarifai offers a platform for data scientists, developers, researchers, and enterprises to utilize artificial intelligence for image, video, and text analysis through computer vision and natural language processing. The client allows users to authenticate, predict concepts in images, and access various functionalities provided by the Clarifai API. It follows a versioning scheme that aligns with the backend API updates and includes specific instructions for installation and troubleshooting. Users can explore the Clarifai demo, sign up for an account, and refer to the documentation for detailed information.
ivy
Ivy is an open-source machine learning framework that enables users to convert code between different ML frameworks and write framework-agnostic code. It allows users to transpile code from one framework to another, making it easy to use building blocks from different frameworks in a single project. Ivy also serves as a flexible framework that breaks free from framework limitations, allowing users to publish code that is interoperable with various frameworks and future frameworks. Users can define trainable modules and layers using Ivy's stateful API, making it easy to build and train models across different backends.
ivy
Ivy is an open-source machine learning framework that enables you to: * ๐ **Convert code into any framework** : Use and build on top of any model, library, or device by converting any code from one framework to another using `ivy.transpile`. * โ๏ธ **Write framework-agnostic code** : Write your code once in `ivy` and then choose the most appropriate ML framework as the backend to leverage all the benefits and tools. Join our growing community ๐ to connect with people using Ivy. **Let's** unify.ai **together ๐ฆพ**
GraphRAG-SDK
Build fast and accurate GenAI applications with GraphRAG SDK, a specialized toolkit for building Graph Retrieval-Augmented Generation (GraphRAG) systems. It integrates knowledge graphs, ontology management, and state-of-the-art LLMs to deliver accurate, efficient, and customizable RAG workflows. The SDK simplifies the development process by automating ontology creation, knowledge graph agent creation, and query handling, enabling users to interact and query their knowledge graphs effectively. It supports multi-agent systems and orchestrates agents specialized in different domains. The SDK is optimized for FalkorDB, ensuring high performance and scalability for large-scale applications. By leveraging knowledge graphs, it enables semantic relationships and ontology-driven queries that go beyond standard vector similarity, enhancing retrieval-augmented generation capabilities.
SemanticKernel.Assistants
This repository contains an assistant proposal for the Semantic Kernel, allowing the usage of assistants without relying on OpenAI Assistant APIs. It runs locally planners and plugins for the assistants, providing scenarios like Assistant with Semantic Kernel plugins, Multi-Assistant conversation, and AutoGen conversation. The Semantic Kernel is a lightweight SDK enabling integration of AI Large Language Models with conventional programming languages, offering functions like semantic functions, native functions, and embeddings-based memory. Users can bring their own model for the assistants and host them locally. The repository includes installation instructions, usage examples, and information on creating new conversation threads with the assistant.
xlstm
xLSTM is a new Recurrent Neural Network architecture based on ideas of the original LSTM. Through Exponential Gating with appropriate normalization and stabilization techniques and a new Matrix Memory it overcomes the limitations of the original LSTM and shows promising performance on Language Modeling when compared to Transformers or State Space Models. The package is based on PyTorch and was tested for versions >=1.8. For the CUDA version of xLSTM, you need Compute Capability >= 8.0. The xLSTM tool provides two main components: xLSTMBlockStack for non-language applications or integrating in other architectures, and xLSTMLMModel for language modeling or other token-based applications.
litserve
LitServe is a high-throughput serving engine for deploying AI models at scale. It generates an API endpoint for a model, handles batching, streaming, autoscaling across CPU/GPUs, and more. Built for enterprise scale, it supports every framework like PyTorch, JAX, Tensorflow, and more. LitServe is designed to let users focus on model performance, not the serving boilerplate. It is like PyTorch Lightning for model serving but with broader framework support and scalability.
microchain
Microchain is a function calling-based LLM agents tool with no bloat. It allows users to define LLM and templates, use various functions like Sum and Product, and create LLM agents for specific tasks. The tool provides a simple and efficient way to interact with OpenAI models and create conversational agents for various applications.
Arcade-Learning-Environment
The Arcade Learning Environment (ALE) is a simple framework that allows researchers and hobbyists to develop AI agents for Atari 2600 games. It is built on top of the Atari 2600 emulator Stella and separates the details of emulation from agent design. The ALE currently supports three different interfaces: C++, Python, and OpenAI Gym.
wandb
Weights & Biases (W&B) is a platform that helps users build better machine learning models faster by tracking and visualizing all components of the machine learning pipeline, from datasets to production models. It offers tools for tracking, debugging, evaluating, and monitoring machine learning applications. W&B provides integrations with popular frameworks such as PyTorch, TensorFlow/Keras, Hugging Face Transformers, PyTorch Lightning, XGBoost, and scikit-learn. Users can log metrics, visualize performance, and compare experiments using W&B. The platform can be hosted in the cloud or on private infrastructure, which lets it fit a range of deployment needs.
gritlm
The 'gritlm' repository provides all materials for the paper Generative Representational Instruction Tuning. It includes code for inference, training, evaluation, and known issues related to the GritLM model. The repository also offers models for embedding and generation tasks, along with instructions on how to train and evaluate the models. Additionally, it contains visualizations, acknowledgements, and a citation for referencing the work.
For similar tasks
rl
TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. It provides PyTorch- and **Python-first**, low- and high-level abstractions for RL that are intended to be **efficient**, **modular**, **documented** and properly **tested**. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them or write new ones with little effort.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
agentcloud
AgentCloud is an open-source platform that enables companies to build and deploy private LLM chat apps, empowering teams to securely interact with their data. It comprises three main components: Agent Backend, Webapp, and Vector Proxy. To run this project locally, clone the repository, install Docker, and start the services. The project is licensed under the GNU Affero General Public License, version 3 only. Contributions and feedback are welcome from the community.
oss-fuzz-gen
This framework generates fuzz targets for real-world `C`/`C++` projects with various Large Language Models (LLMs) and benchmarks them via the `OSS-Fuzz` platform. It successfully leverages LLMs to generate valid fuzz targets (targets that yield a non-zero coverage increase) for 160 C/C++ projects. The maximum line coverage increase is 29% over the existing human-written targets.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks so that operators can focus on more complicated and time-consuming tasks, and it can also identify security harms such as misuse (e.g., malware generation, jailbreaking) and privacy harms (e.g., identity theft). The goal is to give researchers a baseline of how well their model and entire inference pipeline are doing against different harm categories, and to let them compare that baseline to future iterations of their model. This gives them empirical data on how well their model is doing today and lets them detect any degradation of performance in future iterations.
Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (containing a demo web application, Power BI reports, Synapse resources, AML Notebooks, etc.) that can be deployed in a customer's subscription using the CAPE tool within a few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.