slime

slime is an LLM post-training framework for RL scaling.

Stars: 1564

slime is an LLM post-training framework for RL scaling that combines high-performance training with flexible data generation. It connects Megatron with SGLang for efficient training and supports custom data generation workflows through server-based engines. The framework consists of three modules: training, rollout, and a data buffer.

README:

slime is an LLM post-training framework for RL scaling, providing two core capabilities:

  1. High-Performance Training: Supports efficient training in various modes by connecting Megatron with SGLang;
  2. Flexible Data Generation: Enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.

Architecture Overview

[Figure: slime architecture diagram]

Module Descriptions:

  • training (Megatron): Runs the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after training.
  • rollout (SGLang + router): Generates new data (including rewards/verifier outputs) and stores it in the Data Buffer.
  • data buffer: A bridge module that manages prompt initialization, custom data, and rollout generation methods.
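The interaction between the three modules can be sketched as a toy loop. This is only an illustration of the data flow described above, not slime's actual code: the class names `DataBuffer`, `Rollout`, and `Trainer`, their methods, and the dummy reward value are all hypothetical stand-ins.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DataBuffer:
    """Hypothetical stand-in for the data buffer: holds prompts and generated samples."""
    prompts: list
    samples: deque = field(default_factory=deque)

    def put(self, sample):
        self.samples.append(sample)

    def drain(self):
        # Hand all stored samples to the trainer and empty the buffer.
        batch = list(self.samples)
        self.samples.clear()
        return batch


class Rollout:
    """Hypothetical stand-in for the rollout module (SGLang + router in slime)."""

    def __init__(self):
        self.weights_version = 0

    def generate(self, prompt):
        # A real engine would sample a response and score it; the reward here is a dummy.
        return {"prompt": prompt, "response": f"answer to {prompt}", "reward": 1.0}

    def load_weights(self, version):
        self.weights_version = version


class Trainer:
    """Hypothetical stand-in for the training module (Megatron in slime)."""

    def __init__(self):
        self.version = 0

    def train_step(self, batch):
        # A real trainer would run an optimizer step; here only a version counter is bumped.
        if batch:
            self.version += 1
        return self.version


def run(num_iterations, prompts):
    buffer, rollout, trainer = DataBuffer(prompts), Rollout(), Trainer()
    history = []
    for _ in range(num_iterations):
        for prompt in buffer.prompts:      # rollout: generate data, store it in the buffer
            buffer.put(rollout.generate(prompt))
        batch = buffer.drain()             # training: read data from the buffer
        version = trainer.train_step(batch)
        rollout.load_weights(version)      # synchronize parameters back to rollout
        history.append((len(batch), rollout.weights_version))
    return history
```

Calling `run(2, ["a", "b"])` returns `[(2, 1), (2, 2)]`: each iteration trains on two generated samples, and the rollout side picks up the new weights version afterwards.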

Quick Start

For a comprehensive walkthrough covering environment setup, data preparation, training startup, and key code analysis, please refer to the project's Quick Start guide.

Arguments Walk Through

Arguments in slime are divided into three categories:

  1. Megatron arguments: slime reads all arguments defined by the Megatron installation found on PYTHONPATH. You can configure Megatron by passing its arguments directly, for example --tensor-model-parallel-size 2.
  2. SGLang arguments: All arguments for the installed SGLang are supported. These arguments must be prefixed with --sglang-. For example, --mem-fraction-static should be passed as --sglang-mem-fraction-static.
  3. slime-specific arguments: Please refer to: slime/utils/arguments.py
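To make the prefix convention concrete, here is a minimal sketch of how a launcher could separate --sglang- flags from the rest of the command line. It is an assumption-laden illustration, not slime's actual parsing code: the function `split_sglang_args` is hypothetical, and the value 0.8 is just an example.

```python
SGLANG_PREFIX = "--sglang-"


def split_sglang_args(argv):
    """Separate `--sglang-*` flags (and their values) from all other arguments.

    Hypothetical helper for illustration; the prefix is stripped so the
    returned SGLang flags carry their native names.
    """
    sglang_args, other_args = [], []
    target = other_args
    for token in argv:
        if token.startswith("--"):
            if token.startswith(SGLANG_PREFIX):
                target = sglang_args
                token = "--" + token[len(SGLANG_PREFIX):]
            else:
                target = other_args
        # Values that follow a flag go to the same list as that flag.
        target.append(token)
    return sglang_args, other_args


sglang, other = split_sglang_args([
    "--tensor-model-parallel-size", "2",      # Megatron argument, passed as-is
    "--sglang-mem-fraction-static", "0.8",    # SGLang argument with the required prefix
])
```

Here `sglang` is `["--mem-fraction-static", "0.8"]` and `other` is `["--tensor-model-parallel-size", "2"]`, matching the rule that an SGLang flag keeps its native name once the prefix is removed.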

For complete usage instructions, please refer to the Usage Documentation.

Developer Guide

  • Contributions are welcome! If you have suggestions for new features, performance tuning, or feedback on user experience, feel free to submit an Issue or PR 😊

  • Use pre-commit to ensure code style consistency for your commits:

    apt install pre-commit -y
    pre-commit install
  • For debugging tips, please refer to the Debugging Guide

FAQ & Acknowledgements

  • For frequently asked questions, please see the Q&A
  • Special thanks to the following projects & communities: SGLang, Megatron‑LM, mbridge, OpenRLHF, veRL, Pai-Megatron-Patch and others.
