alibaba

ROLL

AI, Reinforcement Learning, LLM, Distributed Training, PPO, Deep Learning

3,046

+340

// summary

ROLL is an efficient, user-friendly reinforcement learning library for scaling Large Language Model (LLM) training across large GPU clusters. It uses a multi-role distributed architecture built on Ray to support complex tasks such as human preference alignment, multi-turn agentic interaction, and advanced reasoning. The framework integrates the high-performance backends Megatron-Core, vLLM, and SGLang to improve training throughput and resource utilization.

// use cases

Multi-task reinforcement learning with verifiable rewards (RLVR) for domains including mathematics, coding, and instruction following.

Agentic RL for multi-turn interactions, tool use, and complex environment navigation.

Large-scale model post-training pipelines including SFT, DPO, and distillation for both LLMs and VLMs.
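
To make the RLVR use case above concrete, here is a minimal sketch of a verifiable reward for math problems. It is not ROLL's API: the function names `extract_final_answer` and `math_reward`, and the convention that the model puts its final answer in `\boxed{...}`, are assumptions made for illustration only.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in a completion, or None.

    Hypothetical helper: assumes answers are wrapped in \\boxed{...} and
    contain no nested braces.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def math_reward(completion: str, reference: str) -> float:
    """Binary reward: 1.0 if the extracted answer equals the reference, else 0.0.

    Comparison is an exact string match after removing spaces; a real
    verifier would likely normalize or evaluate expressions symbolically.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0  # no parseable answer means no reward
    return 1.0 if answer.replace(" ", "") == reference.replace(" ", "") else 0.0
```

Because the reward is computed by a deterministic check rather than a learned reward model, the same pattern could plausibly extend to coding tasks (run unit tests) or instruction following (check format constraints), with one such checker per domain.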