siiRL
Quickstart
Installation
Quickstart: GRPO training on GSM8K dataset
Programming guide
siiRL: The DistFlow Programming Guide
Data Preparation
Prepare Data for Post-Training
Implementing Reward Functions for Datasets
Configurations
Config Explanation
Example
DeepScaleR Example with PPO
MM-Eureka Example with GRPO
DeepScaleR Example with CPGD
Hardware Support
Ascend NPU
Index