alibaba

ROLL

AI, Reinforcement Learning, LLM, Distributed Training, PPO, Deep Learning
3,046
+340

// summary

ROLL is an efficient, user-friendly reinforcement learning library designed to scale Large Language Model training across large GPU clusters. It uses a multi-role distributed architecture based on Ray to support complex tasks such as human preference alignment, multi-turn agentic interaction, and advanced reasoning. The framework integrates high-performance backends such as Megatron-Core, vLLM, and SGLang to improve training throughput and resource utilization.

// use cases

01
Multi-task reinforcement learning with verifiable rewards (RLVR) for domains including mathematics, coding, and instruction following.
02
Agentic RL for multi-turn interactions, tool use, and complex environment navigation.
03
Large-scale model post-training pipelines including SFT, DPO, and distillation for both LLMs and VLMs.
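
// example

To make use case 01 concrete, here is a minimal sketch of what a "verifiable reward" for a mathematics task might look like. It is not ROLL's API: the function names `extract_final_answer` and `math_reward`, and the convention that the model puts its final answer in `\boxed{...}`, are assumptions chosen for illustration only.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in a completion, if any.

    Hypothetical helper: assumes answers are wrapped in \\boxed{} and
    contain no nested braces.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def math_reward(completion: str, reference: str) -> float:
    """Score a completion with a rule that can be checked automatically.

    Returns 1.0 when the extracted answer matches the reference string
    exactly, and 0.0 otherwise (including when no answer is found).
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0


if __name__ == "__main__":
    print(math_reward("The total is \\boxed{42}.", "42"))
    print(math_reward("I am not sure.", "42"))
```

The point of the design is that the reward comes from a deterministic check rather than a learned preference model, which is what lets the same training loop cover several domains by swapping the checker (for example, running unit tests for coding tasks).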