habitat-lab

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Stars: 2237

Visit

Habitat-Lab is a modular high-level library for end-to-end development in embodied AI. It is designed to train agents to perform a wide variety of embodied AI tasks in indoor environments, as well as develop agents that can interact with humans in performing these tasks.

README:

Habitat-Lab

Towards this goal, Habitat-Lab is designed to support the following features:

Flexible task definitions: allowing users to train agents in a wide variety of single and multi-agent tasks (e.g. navigation, rearrangement, instruction following, question answering, human following), as well as define novel tasks.
Diverse embodied agents: configuring and instantiating a diverse set of embodied agents, including commercial robots and humanoids, specifying their sensors and capabilities.
Training and evaluating agents: providing algorithms for single and multi-agent training (via imitation or reinforcement learning, or no learning at all as in SensePlanAct pipelines), as well as tools to benchmark their performance on the defined tasks using standard metrics.
Human in the loop interaction: providing a framework for humans to interact with the simulator, enabling to collect embodied data or interact with trained agents.

Habitat-Lab uses Habitat-Sim as the core simulator. For documentation refer here.

Habitat-Lab

Citing Habitat

If you use the Habitat platform in your research, please cite the Habitat 1.0, Habitat 2.0, and Habitat 3.0 papers:

@misc{puig2023habitat3,
      title  = {Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots},
      author = {Xavi Puig and Eric Undersander and Andrew Szot and Mikael Dallaire Cote and Ruslan Partsey and Jimmy Yang and Ruta Desai and Alexander William Clegg and Michal Hlavac and Tiffany Min and Theo Gervet and Vladimír Vondruš and Vincent-Pierre Berges and John Turner and Oleksandr Maksymets and Zsolt Kira and Mrinal Kalakrishnan and Jitendra Malik and Devendra Singh Chaplot and Unnat Jain and Dhruv Batra and Akshara Rai and Roozbeh Mottaghi},
      year={2023},
      archivePrefix={arXiv},
}

@inproceedings{szot2021habitat,
  title     =     {Habitat 2.0: Training Home Assistants to Rearrange their Habitat},
  author    =     {Andrew Szot and Alex Clegg and Eric Undersander and Erik Wijmans and Yili Zhao and John Turner and Noah Maestre and Mustafa Mukadam and Devendra Chaplot and Oleksandr Maksymets and Aaron Gokaslan and Vladimir Vondrus and Sameer Dharur and Franziska Meier and Wojciech Galuba and Angel Chang and Zsolt Kira and Vladlen Koltun and Jitendra Malik and Manolis Savva and Dhruv Batra},
  booktitle =     {Advances in Neural Information Processing Systems (NeurIPS)},
  year      =     {2021}
}

@inproceedings{habitat19iccv,
  title     =     {Habitat: {A} {P}latform for {E}mbodied {AI} {R}esearch},
  author    =     {Manolis Savva and Abhishek Kadian and Oleksandr Maksymets and Yili Zhao and Erik Wijmans and Bhavana Jain and Julian Straub and Jia Liu and Vladlen Koltun and Jitendra Malik and Devi Parikh and Dhruv Batra},
  booktitle =     {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year      =     {2019}
}

Installation

Preparing conda env

Assuming you have conda installed, let's prepare a conda env:

# We require python>=3.9 and cmake>=3.14
conda create -n habitat python=3.9 cmake=3.14.0
conda activate habitat

conda install habitat-sim
- To install habitat-sim with bullet physics
```
conda install habitat-sim withbullet -c conda-forge -c aihabitat
```
  Note, for newer features added after the most recent release, you may need to install aihabitat-nightly. See Habitat-Sim's installation instructions for more details.

pip install habitat-lab stable version.

git clone --branch stable https://github.com/facebookresearch/habitat-lab.git
cd habitat-lab
pip install -e habitat-lab  # install habitat_lab

Install habitat-baselines.

The command above will install only core of Habitat-Lab. To include habitat_baselines along with all additional requirements, use the command below after installing habitat-lab:
```
pip install -e habitat-baselines  # install habitat_baselines
```

Testing

Let's download some 3D assets using Habitat-Sim's python data download utility:
- Download (testing) 3D scenes:
```
python -m habitat_sim.utils.datasets_download --uids habitat_test_scenes --data-path data/
```
  Note that these testing scenes do not provide semantic annotations.
- Download point-goal navigation episodes for the test scenes:
```
python -m habitat_sim.utils.datasets_download --uids habitat_test_pointnav_dataset --data-path data/
```
Non-interactive testing: Test the Pick task: Run the example pick task script
```
python examples/example.py
```
which uses habitat-lab/habitat/config/benchmark/rearrange/skills/pick.yaml for configuration of task and agent. The script roughly does this:
```
import gym
import habitat.gym

# Load embodied AI task (RearrangePick) and a pre-specified virtual robot
env = gym.make("HabitatRenderPick-v0")
observations = env.reset()

terminal = False

# Step through environment with random actions
while not terminal:
    observations, reward, terminal, info = env.step(env.action_space.sample())
```
To modify some of the configurations of the environment, you can also use the habitat.gym.make_gym_from_config method that allows you to create a habitat environment using a configuration.
```
config = habitat.get_config(
  "benchmark/rearrange/skills/pick.yaml",
  overrides=["habitat.environment.max_episode_steps=20"]
)
env = habitat.gym.make_gym_from_config(config)
```
If you want to know more about what the different configuration keys overrides do, you can use this reference.

See examples/register_new_sensors_and_measures.py for an example of how to extend habitat-lab from outside the source code.
Interactive testing: Using you keyboard and mouse to control a Fetch robot in a ReplicaCAD environment:
```
# Pygame for interactive visualization, pybullet for inverse kinematics
pip install pygame==2.0.1 pybullet==3.0.4

# Interactive play script
python examples/interactive_play.py --never-end
```
Use I/J/K/L keys to move the robot base forward/left/backward/right and W/A/S/D to move the arm end-effector forward/left/backward/right and E/Q to move the arm up/down. The arm can be difficult to control via end-effector control. More details in documentation. Try to move the base and the arm to touch the red bowl on the table. Have fun!

Note: Interactive testing currently fails on Ubuntu 20.04 with an error: X Error of failed request: BadAccess (attempt to access private resource denied). We are working on fixing this, and will update instructions once we have a fix. The script works without errors on MacOS.

Debugging an environment issue

Our vectorized environments are very fast, but they are not very verbose. When using VectorEnv some errors may be silenced, resulting in process hanging or multiprocessing errors that are hard to interpret. We recommend setting the environment variable HABITAT_ENV_DEBUG to 1 when debugging (export HABITAT_ENV_DEBUG=1) as this will use the slower, but more verbose ThreadedVectorEnv class. Do not forget to reset HABITAT_ENV_DEBUG (unset HABITAT_ENV_DEBUG) when you are done debugging since VectorEnv is much faster than ThreadedVectorEnv.

Documentation

Browse the online Habitat-Lab documentation and the extensive tutorial on how to train your agents with Habitat. For Habitat 2.0, use this quickstart guide.

Docker Setup

We provide docker containers for Habitat, updated approximately once per year for the Habitat Challenge. This works on machines with an NVIDIA GPU and requires users to install nvidia-docker. To setup the habitat stack using docker follow the below steps:

Pull the habitat docker image: docker pull fairembodied/habitat-challenge:testing_2022_habitat_base_docker
Start an interactive bash session inside the habitat docker: docker run --runtime=nvidia -it fairembodied/habitat-challenge:testing_2022_habitat_base_docker
Activate the habitat conda environment: conda init; source ~/.bashrc; source activate habitat
Run the testing scripts as above: cd habitat-lab; python examples/example.py. This should print out an output like:
```
Agent acting inside environment.
Episode finished after 200 steps.
```

Questions?

Can't find the answer to your question? Look up for common issues or try asking the developers and community on our Discussions forum.

Datasets

Common task and episode datasets used with Habitat-Lab.

Baselines

Habitat-Lab includes reinforcement learning (via PPO) baselines. For running PPO training on sample data and more details refer habitat_baselines/README.md.

ROS-X-Habitat

ROS-X-Habitat (https://github.com/ericchen321/ros_x_habitat) is a framework that bridges the AI Habitat platform (Habitat Lab + Habitat Sim) with other robotics resources via ROS. ROS-X-Habitat places emphasis on 1) leveraging Habitat Sim v2's physics-based simulation capability and 2) allowing roboticists to access simulation assets from ROS. The work has also been made public as a paper.

Note that ROS-X-Habitat was developed, and is maintained by the Lab for Computational Intelligence at UBC; it has not yet been officially supported by the Habitat Lab team. Please refer to the framework's repository for docs and discussions.

License

Habitat-Lab is MIT licensed. See the LICENSE file for details.

The trained models and the task datasets are considered data derived from the correspondent scene datasets.

Matterport3D based task datasets and trained models are distributed with Matterport3D Terms of Use and under CC BY-NC-SA 3.0 US license.
Gibson based task datasets, the code for generating such datasets, and trained models are distributed with Gibson Terms of Use and under CC BY-NC-SA 3.0 US license.

For Tasks:

Click tags to check more tools for each tasks

navigation object manipulation human interaction

For Jobs:

robotics engineer ai researcher game developer computer vision engineer machine learning engineer

Alternative AI tools for habitat-lab

Similar Open Source Tools

habitat-lab

github

: 2.2k

habitat-sim

Habitat-Sim is a high-performance physics-enabled 3D simulator with support for 3D scans of indoor/outdoor spaces, CAD models of spaces and piecewise-rigid objects, configurable sensors, robots described via URDF, and rigid-body mechanics. It prioritizes simulation speed over the breadth of simulation capabilities, achieving several thousand frames per second (FPS) running single-threaded and over 10,000 FPS multi-process on a single GPU when rendering a scene from the Matterport3D dataset. Habitat-Sim simulates a Fetch robot interacting in ReplicaCAD scenes at over 8,000 steps per second (SPS), where each ‘step’ involves rendering 1 RGBD observation (128×128 pixels) and rigid-body dynamics for 1/30sec.

github

: 2.6k

labo

LABO is a time series forecasting and analysis framework that integrates pre-trained and fine-tuned LLMs with multi-domain agent-based systems. It allows users to create and tune agents easily for various scenarios, such as stock market trend prediction and web public opinion analysis. LABO requires a specific runtime environment setup, including system requirements, Python environment, dependency installations, and configurations. Users can fine-tune their own models using LABO's Low-Rank Adaptation (LoRA) for computational efficiency and continuous model updates. Additionally, LABO provides a Python library for building model training pipelines and customizing agents for specific tasks.

github

: 160

OpenDevin

OpenDevin is an open-source project aiming to replicate Devin, an autonomous AI software engineer capable of executing complex engineering tasks and collaborating actively with users on software development projects. The project aspires to enhance and innovate upon Devin through the power of the open-source community. Users can contribute to the project by developing core functionalities, frontend interface, or sandboxing solutions, participating in research and evaluation of LLMs in software engineering, and providing feedback and testing on the OpenDevin toolset.

github

: 30.2k

aimo-progress-prize

This repository contains the training and inference code needed to replicate the winning solution to the AI Mathematical Olympiad - Progress Prize 1. It consists of fine-tuning DeepSeekMath-Base 7B, high-quality training datasets, a self-consistency decoding algorithm, and carefully chosen validation sets. The training methodology involves Chain of Thought (CoT) and Tool Integrated Reasoning (TIR) training stages. Two datasets, NuminaMath-CoT and NuminaMath-TIR, were used to fine-tune the models. The models were trained using open-source libraries like TRL, PyTorch, vLLM, and DeepSpeed. Post-training quantization to 8-bit precision was done to improve performance on Kaggle's T4 GPUs. The project structure includes scripts for training, quantization, and inference, along with necessary installation instructions and hardware/software specifications.

github

: 61

RLAIF-V

RLAIF-V is a novel framework that aligns MLLMs in a fully open-source paradigm for super GPT-4V trustworthiness. It maximally exploits open-source feedback from high-quality feedback data and online feedback learning algorithm. Notable features include achieving super GPT-4V trustworthiness in both generative and discriminative tasks, using high-quality generalizable feedback data to reduce hallucination of different MLLMs, and exhibiting better learning efficiency and higher performance through iterative alignment.

github

: 85

InstructGraph

InstructGraph is a framework designed to enhance large language models (LLMs) for graph-centric tasks by utilizing graph instruction tuning and preference alignment. The tool collects and decomposes 29 standard graph datasets into four groups, enabling LLMs to better understand and generate graph data. It introduces a structured format verbalizer to transform graph data into a code-like format, facilitating code understanding and generation. Additionally, it addresses hallucination problems in graph reasoning and generation through direct preference optimization (DPO). The tool aims to bridge the gap between textual LLMs and graph data, offering a comprehensive solution for graph-related tasks.

github

: 53

flashinfer

FlashInfer is a library for Language Languages Models that provides high-performance implementation of LLM GPU kernels such as FlashAttention, PageAttention and LoRA. FlashInfer focus on LLM serving and inference, and delivers state-the-art performance across diverse scenarios.

github

: 2.5k

axoned

Axone is a public dPoS layer 1 designed for connecting, sharing, and monetizing resources in the AI stack. It is an open network for collaborative AI workflow management compatible with any data, model, or infrastructure, allowing sharing of data, algorithms, storage, compute, APIs, both on-chain and off-chain. The 'axoned' node of the AXONE network is built on Cosmos SDK & Tendermint consensus, enabling companies & individuals to define on-chain rules, share off-chain resources, and create new applications. Validators secure the network by maintaining uptime and staking $AXONE for rewards. The blockchain supports various platforms and follows Semantic Versioning 2.0.0. A docker image is available for quick start, with documentation on querying networks, creating wallets, starting nodes, and joining networks. Development involves Go and Cosmos SDK, with smart contracts deployed on the AXONE blockchain. The project provides a Makefile for building, installing, linting, and testing. Community involvement is encouraged through Discord, open issues, and pull requests.

github

: 168

contracts

AXONE Smart Contracts repository hosts Smart Contracts for the AXONE network, compatible with any Cosmos blockchains using the CosmWasm framework. It includes storage, sovereignty, and resource management oriented Smart Contracts. Each contract has different functionalities and maturity stages, with detailed tech documentation and emojis indicating maturity levels. The repository provides tools for building, testing, deploying, and interacting with Smart Contracts, along with guidelines for contributing and community engagement.

github

: 121

agentdojo

AgentDojo is a dynamic environment designed to evaluate prompt injection attacks and defenses for large language models (LLM) agents. It provides a benchmark script to run different suites and tasks with specified LLM models, defenses, and attacks. The tool is under active development, and users can inspect the results through dedicated documentation pages and the Invariant Benchmark Registry.

github

: 112

MInference

MInference is a tool designed to accelerate pre-filling for long-context Language Models (LLMs) by leveraging dynamic sparse attention. It achieves up to a 10x speedup for pre-filling on an A100 while maintaining accuracy. The tool supports various decoding LLMs, including LLaMA-style models and Phi models, and provides custom kernels for attention computation. MInference is useful for researchers and developers working with large-scale language models who aim to improve efficiency without compromising accuracy.

github

: 853

doku

OpenLIT is an OpenTelemetry-native GenAI and LLM Application Observability tool. It's designed to make the integration process of observability into GenAI projects as easy as pie – literally, with just a single line of code. Whether you're working with popular LLM Libraries such as OpenAI and HuggingFace or leveraging vector databases like ChromaDB, OpenLIT ensures your applications are monitored seamlessly, providing critical insights to improve performance and reliability.

github

: 58

ArcticTraining

ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs). It offers modular trainer designs, simplified code structures, and integrated pipelines for creating and cleaning synthetic data, enabling users to enhance LLM capabilities like code generation and complex reasoning with greater efficiency and flexibility.

github

: 56

labelbox-python

Labelbox is a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services. Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.

github

: 135

SmolChat-Android

SmolChat-Android is a mobile application that enables users to interact with local small language models (SLMs) on-device. Users can add/remove SLMs, modify system prompts and inference parameters, create downstream tasks, and generate responses. The app uses llama.cpp for model execution, ObjectBox for database storage, and Markwon for markdown rendering. It provides a simple, extensible codebase for on-device machine learning projects.

github

: 233

For similar tasks

habitat-lab

github

: 2.2k

Awesome-LLM-Robotics

This repository contains a curated list of **papers using Large Language/Multi-Modal Models for Robotics/RL**. Template from awesome-Implicit-NeRF-Robotics Please feel free to send me pull requests or email to add papers! If you find this repository useful, please consider citing and STARing this list. Feel free to share this list with others! ## Overview * Surveys * Reasoning * Planning * Manipulation * Instructions and Navigation * Simulation Frameworks * Citation

github

: 3.5k

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 855

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.3k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 30.6k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675

habitat-lab

README:

Habitat-Lab

Table of contents

Citing Habitat

Installation

Testing

Debugging an environment issue

Documentation

Docker Setup

Questions?

Datasets

Baselines

ROS-X-Habitat

License

For Tasks:

For Jobs:

Alternative AI tools for habitat-lab

Similar Open Source Tools

habitat-lab

habitat-sim

labo

OpenDevin

aimo-progress-prize

RLAIF-V

InstructGraph

flashinfer

axoned

contracts

agentdojo

MInference

doku

ArcticTraining

labelbox-python

SmolChat-Android

For similar tasks

habitat-lab

Awesome-LLM-Robotics

For similar jobs

weave

LLMStack

VisionCraft

kaito

PyRIT

tabby

spear

Magick