siiRL documentation
Quickstart
Installation
Requirements
Method 1: Install from docker image
Method 2: Install from PIP
Method 3: Install from custom environment
Quickstart: GRPO training on GSM8K dataset
Introduction
Dataset Introduction
Step 1: Prepare the dataset
Step 2: Download a model for post-training
Step 3: Perform GRPO training with the instruct model
Programming guide
siiRL: The DistFlow Programming Guide
Motivation: Overcoming the Limits of Centralized Control
The DistFlow Architecture
Codebase Walkthrough: How DistFlow is Implemented
Key Takeaways
Data Preparation
Prepare Data for Post-Training
Implementing Reward Functions for Datasets
Configurations
Config Explanation
ppo_dag_trainer.yaml for RL FSDP Backend
workflow_grpo.yaml for GRPO
Example
DeepScaleR Example with PPO
MM-Eureka Example with GRPO
DeepScaleR Example with CPGD
Hardware Support
Ascend NPU