Everything-LLMs-And-Robotics

The world's largest GitHub Repository for LLMs + Robotics

Everything-LLMs-And-Robotics is the world's largest GitHub repository focused on the intersection of Large Language Models (LLMs) and robotics. It collects educational resources, research papers, project demos, and Twitter threads covering reasoning, planning, manipulation, instructions and navigation, simulation frameworks, perception, and more, showcasing the latest advances in the field.

README:

Everything-LLMs-And-Robotics

The world's largest GitHub Repository for the intersection of LLMs (multimodal included!) + Robotics

Heavily Inspired by Awesome-LLM-Robotics

Logistics

If you want to make a change to this repository, click here.

Why I made this: Go here.

What Does This Repository Have?

LLMs Educational Resources

  • START HERE: "Transformers from Scratch", Brandon Rohrer, [Website]

  • Stanford Transformers Class: "CS25: Transformers United", Stanford, 2022, [Website]

  • Andrej Karpathy GPT Tutorial: "Let's build GPT: from scratch, in code, spelled out.", Andrej Karpathy, 2023 [YouTube Video] (a minimal attention sketch follows this list)
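All three resources above converge on the same core operation. As a companion, here is a minimal numpy sketch of single-head scaled dot-product self-attention; the random weight matrices are stand-ins for learned parameters, so this illustrates the mechanics only:

# Minimal single-head self-attention: the operation the resources above build up to.
# x has shape (seq_len, d_model); the W matrices are random stand-ins, not trained.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    q, k, v = x @ Wq, x @ Wk, x @ Wv         # project input to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])  # scaled dot-product similarity
    return softmax(scores) @ v               # attention-weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                               # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)                # -> (4, 8)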

Robotics Educational Resources

  • AI-Enabled Robotics Class: "CS199: Stanford Robotics Independent Study", Stanford, 2023, [Website]

LLMs + Robotics Educational Resources

  • Google's 2022 Research: "Google Research, 2022 & beyond: Robotics", Google, 2023, [Website]

  • Controlling Robots Via Large Language Models: "Controlling Robots Via Large Language Models", Sanjiban Choudhury, CS 4756/5756, Cornell, 2023 [Slides]

Reasoning

  • AutoTAMP: "AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers", arXiv, June 2023. [Paper]

  • LLM Designs Robots: "Can Large Language Models Design a Robot?", arXiv, Mar 2023. [Paper]

  • PaLM-E: "PaLM-E: An Embodied Multimodal Language Model", arXiv, Mar 2023. [Paper] [Website] [Demo]

  • RT-1: "RT-1: Robotics Transformer for Real-World Control at Scale", arXiv, Dec 2022. [Paper] [Code] [Website]

  • ProgPrompt: "Generating Situated Robot Task Plans using Large Language Models", arXiv, Sept 2022. [Paper] [Code Doesn't Really Exist here] [Website]

  • Code-As-Policies: "Code as Policies: Language Model Programs for Embodied Control", arXiv, Sept 2022. [Paper] [Code] [Website] (the pattern is sketched just after this list)

  • Say-Can: "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances", arXiv, Apr 2022. [Paper] [Code] [Website]

  • Socratic: "Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language", arXiv, Apr 2022. [Paper] [Code] [Website]

  • PIGLeT: "PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World", ACL, Jun 2021. [Paper] [Code] [Website]
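Of the entries above, Code-As-Policies is especially easy to convey in code: the LLM is prompted with a small robot API and asked to write a short program against it. The sketch below is a hypothetical illustration of that pattern only; detect, pick, and place are made-up stand-ins rather than the paper's actual API, and llm() fakes the model call with a canned answer:

# Hypothetical sketch of the code-as-policies pattern: prompt an LLM with a
# robot API, get back a short Python policy, and execute it. The API names
# below (detect/pick/place) are illustrative, not from the paper.

ROBOT_API_DOC = """
detect(name) -> (x, y)   # image-space position of a named object
pick(x, y)               # grasp at a position
place(x, y)              # release at a position
"""

def llm(prompt):
    # Stand-in for a real LLM call; returns a canned policy for this demo.
    return "x, y = detect('red block')\npick(x, y)\nplace(*detect('tray'))"

def run_policy(instruction, api):
    prompt = f"API:\n{ROBOT_API_DOC}\nWrite Python for: {instruction}"
    exec(llm(prompt), api)  # run the generated policy against the robot API

# Toy bindings so the sketch runs end to end without a robot.
run_policy("put the red block on the tray", {
    "detect": lambda name: (0.2, 0.5),
    "pick":   lambda x, y: print(f"pick at ({x}, {y})"),
    "place":  lambda x, y: print(f"place at ({x}, {y})"),
})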

Planning

  • LLM-GROP: "Task and Motion Planning with Large Language Models for Object Rearrangement", arXiv, Mar 2023 [Paper]

  • Bio Lab Task Planning: "LLMs can generate robotic scripts from goal-oriented instructions in biological laboratory automation", arXiv, April 2023 [Paper]

  • PromptCraft Robotics: "ChatGPT for Robotics: Design Principles and Model Abilities", Microsoft, 2023. [Paper] [Website] [Code]

  • CLARIFY: "Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting", arXiv, Mar 2023. [Paper] [Code] [Website]

  • LM-Nav: "Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action", arXiv, July 2022. [Paper] [Pytorch Code] [Website]

  • Inner Monologue: "Inner Monologue: Embodied Reasoning through Planning with Language Models", arXiv, July 2022. [Paper] [Website]

  • Housekeep: "Housekeep: Tidying Virtual Households using Commonsense Reasoning", arXiv, May 2022. [Paper] [Pytorch Code] [Website]

  • LID: "Pre-Trained Language Models for Interactive Decision-Making", arXiv, Feb 2022. [Paper] [Pytorch Code] [Website]

  • ZSP: "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents", ICML, Jan 2022. [Paper] [Pytorch Code] [Website] (sketched after this list)
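The ZSP recipe above is worth a sketch because it needs no fine-tuning: the LLM drafts free-form steps, and each step is snapped to the closest admissible robot action by embedding similarity. The toy below substitutes a bag-of-words embedding for the paper's learned sentence encoder, so only the overall shape of the method is faithful:

# Toy sketch of zero-shot planning a la ZSP: snap each free-form LLM step to
# the most similar admissible action. embed() is a bag-of-words stand-in for
# the learned sentence encoder the paper actually uses.
from collections import Counter
import math

ADMISSIBLE = ["walk to kitchen", "open fridge", "grab apple", "close fridge"]

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def snap_to_action(step):
    # Pick the admissible action closest to the LLM's free-form step.
    return max(ADMISSIBLE, key=lambda a: cosine(embed(step), embed(a)))

draft = ["go to the kitchen", "open the fridge", "pick up an apple"]  # LLM output
print([snap_to_action(s) for s in draft])
# -> ['walk to kitchen', 'open fridge', 'grab apple']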

Manipulation

  • MOO: "Open-World Object Manipulation using Pre-trained Vision-Language Models", arXiv, Mar 2023. [Paper] [Website]

  • TidyBot: "TidyBot: Personalized Robot Assistance with Large Language Models", arXiv, May 2023. [Paper] [Website]

  • DIAL: "Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models", arXiv, Nov 2022. [Paper] [Website]

  • CLIP-Fields: "CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory", arXiv, Oct 2022. [Paper] [PyTorch Code] [Website]

  • VIMA: "VIMA: General Robot Manipulation with Multimodal Prompts", arXiv, Oct 2022. [Paper] [Pytorch Code] [Website]

  • Perceiver-Actor: "A Multi-Task Transformer for Robotic Manipulation", CoRL, Sep 2022. [Paper] [Pytorch Code] [Website]

  • LaTTe: "LaTTe: Language Trajectory TransformEr", arXiv, Aug 2022. [Paper] [TensorFlow Code] [Website]

  • Robots Enact Malignant Stereotypes: "Robots Enact Malignant Stereotypes", FAccT, Jun 2022. [Paper] [Website] [Washington Post] [Wired] (code access on request)

  • ATLA: "Leveraging Language for Accelerated Learning of Tool Manipulation", CoRL, Jun 2022. [Paper]

  • ZeST: "Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?", L4DC, Apr 2022. [Paper]

  • LSE-NGU: "Semantic Exploration from Language Abstractions and Pretrained Representations", arXiv, Apr 2022. [Paper]

  • Embodied-CLIP: "Simple but Effective: CLIP Embeddings for Embodied AI", CVPR, Nov 2021. [Paper] [Pytorch Code]

  • CLIPort: "CLIPort: What and Where Pathways for Robotic Manipulation", CoRL, Sept 2021. [Paper] [Pytorch Code] [Website]

Instructions and Navigation

  • Text2Motion: "Text2Motion: From Natural Language Instructions to Feasible Plans", arXiv, Mar 2023 [Paper]

  • ChatGPT Robot Collaboration: "Improved Trust in Human-Robot Collaboration with ChatGPT", arXiv, April 2023. [Paper]

  • ADAPT: "ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts", CVPR, May 2022. [Paper]

  • Pre-Trained Vision Models for Control: "The Unsurprising Effectiveness of Pre-Trained Vision Models for Control", ICML, Mar 2022. [Paper] [Pytorch Code] [Website]

  • CoW: "CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration", arXiv, Mar 2022. [Paper] (the CLIP-scoring primitive is sketched after this list)

  • Recurrent VLN-BERT: "A Recurrent Vision-and-Language BERT for Navigation", CVPR, Jun 2021 [Paper] [Pytorch Code]

  • VLN-BERT: "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web", ECCV, Apr 2020 [Paper] [Pytorch Code]

  • Interactive Language: "Interactive Language: Talking to Robots in Real Time", arXiv, Oct 2022 [Paper] [Website]
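Several entries in this section (CoW in particular, along with the CLIP-based manipulation work earlier) rest on one primitive: scoring how well the current camera frame matches a language goal. Below is a minimal sketch of that primitive using the public Hugging Face CLIP checkpoint; it assumes the transformers and torch packages are installed, and a blank image stands in for a real camera frame:

# Score a camera frame against candidate language goals with CLIP (ViT-B/32).
# A blank image stands in for a real robot observation in this sketch.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def goal_scores(frame, goals):
    inputs = processor(text=goals, images=frame, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape (1, num_goals)
    return logits.softmax(dim=-1).squeeze(0).tolist()

frame = Image.new("RGB", (224, 224))  # stand-in for a camera frame
print(goal_scores(frame, ["a cat-shaped mug", "a red chair"]))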

Simulation Frameworks

  • MineDojo: "MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge", arXiv, Jun 2022. [Paper] [Code] [Website] [Open Database]

  • Habitat 2.0: "Habitat 2.0: Training Home Assistants to Rearrange their Habitat", NeurIPS, Dec 2021. [Paper] [Code] [Website]

  • BEHAVIOR: "BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments", CoRL, Nov 2021. [Paper] [Code] [Website]

  • iGibson 1.0: "iGibson 1.0: a Simulation Environment for Interactive Tasks in Large Realistic Scenes", IROS, Sep 2021. [Paper] [Code] [Website]

  • ALFRED: "ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks", CVPR, Jun 2020. [Paper] [Code] [Website]

  • BabyAI: "BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning", ICLR, May 2019. [Paper] [Code]

Perception

  • Matcha agent: "Chat with the Environment: Interactive Multimodal Perception Using Large Language Models", IROS 2023. [Paper] [Poster] [Code] [Video] [Website]

  • LGX: "Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Based Zero-Shot Object Navigation", arXiv, Mar 2023. [Paper]

  • Robots Acquire Skills With VLMs: "Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models", arXiv, Nov 2022. [Paper]

  • From Occlusion To Insight: "From Occlusion to Insight: Object Search in Semantic Shelves using Large Language Models", arXiv, Feb 2023. [Paper]

Project Demos

  • RobotGPT Pt. 2: "Twitter Video Of Voice-Input LLM-Powered Robot Arm", Orangewood Labs, 2023, [Video]

  • SPOT GPT: "Boston Dynamics Integration of ChatGPT into SPOT Robot", Boston Dynamics, 2023, [Video]

  • RobotGPT: "Orangewood Labs RoboGPT Demo", Orangewood Labs, 2023, [Video]

  • Mona: "Vitruvian Works Robot Demonstration", Vitruvian Works, 2023, [Video]

  • Ameca: "Ameca Expressions with GPT-3 / 4", Engineered Arts, 2023, [Video]

  • Sarcastic Robot: "Sarcastic Robot powered by GPT-4", Gabrael Levine (Hackathon Project), 2023, [Video]

  • DroneFormer: "DroneFormer: Controlling UAVs with natural language!", Brian Wu (Hackathon Project), Stanford University, 2023 [Video]

Thoughtful Twitter Threads

  • Bitter Lesson 2.0: @hausman_k, 2023 [Thread]

Citation

If you find this repository useful, please consider citing this list:

@misc{rintamaki2023everythingllmsandroboticsrepo,
    title = {Everything-LLMs-And-Robotics},
    author = {Jacob Rintamaki},
    howpublished = {GitHub repository},
    url = {https://github.com/jrin771/Everything-LLMs-And-Robotics},
    year = {2023},
}
