v0.7 release

Blog post: verl 0.7 release blog

Highlight

Model Engine

Integrate Megatron-Bridge and support LoRA/PEFT, see blog post: How We Build Trillion Parameter Reasoning RL with 10% GPUs
Support experimental fp8 training for the Megatron backend
Support new models for the Megatron backend: GPT-OSS, Qwen3-Next
Comprehensive support for the new model engine; the FSDP and Megatron engines are production ready:
- Dispatch tensordict with nested tensor instead of padded DataProto
- Add TrainingWorker that resembles Tinker-like API
- Add VLM support for model engine, SFT and RL trainer
- Add model engine based critic model
- Implement ActorRolloutRefWorker on top of TrainingWorker, supporting different backends in one worker
New VeOmni engine added, still in alpha status.
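
To illustrate the "nested tensor instead of padded" item above, here is a minimal, framework-free sketch of the two layouts. It is not verl's implementation: the helper names (pad_batch, to_jagged, from_jagged) are hypothetical, and plain Python lists stand in for tensors. A padded batch stores every sequence at the length of the longest one, while a jagged (nested) layout stores only real tokens plus an offsets array.

```python
from typing import List, Tuple


def pad_batch(seqs: List[List[int]], pad_id: int = 0) -> List[List[int]]:
    """Right-pad every sequence to the longest length in the batch."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def to_jagged(seqs: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Flatten sequences into one values list plus cumulative offsets."""
    values: List[int] = []
    offsets = [0]
    for s in seqs:
        values.extend(s)
        offsets.append(len(values))
    return values, offsets


def from_jagged(values: List[int], offsets: List[int]) -> List[List[int]]:
    """Recover the original sequences from the jagged layout."""
    return [values[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]


# Hypothetical token ids, chosen only to show the size difference.
seqs = [[1, 2, 3], [4], [5, 6, 7, 8, 9]]
padded = pad_batch(seqs)
values, offsets = to_jagged(seqs)

padded_slots = sum(len(row) for row in padded)  # slots stored, padding included
jagged_slots = len(values)                      # slots stored, real tokens only
```

With highly variable sequence lengths, the padded layout spends most of its memory and transfer volume on padding, which is the overhead a nested layout avoids.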

Rollout Engine

Remove SPMD rollout mode
Support blockwise fp8 rollout for vllm and sglang; support online quant for vllm with torchao
Experimental router replay support for vllm
Optimize multi-modal data fetching and preprocessing; support video input
Upgrade to vllm==0.12.0; sglang==0.5.6
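
As a rough illustration of what "blockwise" means in the fp8 rollout item above, the sketch below computes one scale per fixed-size block so that each block's largest magnitude maps onto the fp8 E4M3 maximum finite value (448). This is a sketch under stated assumptions, not the vllm or sglang kernel: the function names and the block size are hypothetical, and the actual rounding to 8 bits is omitted.

```python
from typing import List

FP8_E4M3_MAX = 448.0  # largest finite value representable in fp8 E4M3


def blockwise_scales(values: List[float], block_size: int) -> List[float]:
    """Return one scale per block: block amax divided by the fp8 maximum."""
    scales = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        # An all-zero block gets scale 1.0 to avoid dividing by zero later.
        scales.append(amax / FP8_E4M3_MAX if amax > 0 else 1.0)
    return scales


def scale_blocks(values: List[float], scales: List[float], block_size: int) -> List[float]:
    """Divide each value by its block's scale (the step before fp8 rounding)."""
    return [v / scales[i // block_size] for i, v in enumerate(values)]


# Hypothetical weights: the second block has a much larger range than the first.
weights = [1.0, -2.0, 0.5, 0.25, 896.0, 448.0, -224.0, 0.0]
scales = blockwise_scales(weights, block_size=4)
scaled = scale_blocks(weights, scales, block_size=4)
```

Because each block carries its own scale, a block of small values is not crushed by an outlier elsewhere in the tensor, which is the usual motivation for blockwise over per-tensor scaling.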

Reward

Support hybrid reward scenarios, including generative, discriminative, rule-based rewards, and their combinations.
Refactor reward models into server mode, supporting both colocated and standalone deployments.
Introduce new reward managers for more complex scenarios: limited mode for request rate control and remote mode for CPU-intensive tasks.
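
The idea behind request rate control for API-based rewards can be sketched with a simple concurrency cap. This is only an illustration using the standard library: ConcurrencyLimiter, fake_reward, and score_all are hypothetical names, the reward function is a stand-in for a remote call, and verl's reward managers may layer additional limits that are not shown here.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


class ConcurrencyLimiter:
    """Cap the number of reward requests that may be in flight at once."""

    def __init__(self, max_in_flight: int) -> None:
        self._sem = asyncio.Semaphore(max_in_flight)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, kept for inspection

    async def run(self, fn: Callable[[str], Awaitable[float]], arg: str) -> float:
        async with self._sem:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await fn(arg)
            finally:
                self.in_flight -= 1


async def fake_reward(response: str) -> float:
    """Stand-in for a remote reward API call (hypothetical scoring rule)."""
    await asyncio.sleep(0.01)
    return float(len(response))


async def score_all(responses: List[str], max_in_flight: int = 2) -> Tuple[List[float], int]:
    limiter = ConcurrencyLimiter(max_in_flight)
    scores = await asyncio.gather(*(limiter.run(fake_reward, r) for r in responses))
    return list(scores), limiter.peak


scores, peak = asyncio.run(score_all(["a", "bb", "ccc", "dddd"]))
```

All four responses are scored, but at most two requests are outstanding at any moment, so a rate-limited endpoint is not flooded by a full rollout batch.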

Algorithm

Add CISPO: Clipped IS-weight Policy Optimization
Add SAPO: Soft Adaptive Policy Optimization
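
For readers unfamiliar with CISPO, the sketch below contrasts it with PPO-style clipping at the level of a single token. It assumes the commonly published formulation, in which the importance-sampling weight is clipped and treated as a constant, so every token keeps a gradient scaled by the clipped weight; verl's implementation may differ in details. The function names and default clip bounds are hypothetical. Each function returns the coefficient that multiplies the gradient of the token's log-probability.

```python
import math


def cispo_token_coeff(logp_new: float, logp_old: float, advantage: float,
                      eps_low: float = 0.2, eps_high: float = 0.2) -> float:
    """Clipped IS weight (held constant) times advantage."""
    ratio = math.exp(logp_new - logp_old)
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return clipped * advantage


def ppo_clip_token_coeff(logp_new: float, logp_old: float, advantage: float,
                         eps: float = 0.2) -> float:
    """PPO-clip: the gradient vanishes once the clipped branch is the active minimum."""
    ratio = math.exp(logp_new - logp_old)
    unclipped = ratio * advantage
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps) * advantage
    return unclipped if unclipped <= clipped else 0.0


# A token whose probability doubled under the new policy, with positive advantage.
cispo_c = cispo_token_coeff(math.log(2.0), 0.0, advantage=1.0)
ppo_c = ppo_clip_token_coeff(math.log(2.0), 0.0, advantage=1.0)
```

In this example the PPO-clip coefficient is zero, so the token stops contributing to the update, while the CISPO coefficient stays at the clipped weight times the advantage.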

Recipe

[NEW] VLA: add experimental support for VLA models
[NEW] rhymerl: History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL
TransferQueue: support multiple data partitions and optimize tensor zero-copy serialization
One-step-off-policy/Fully async: optimize weight synchronization via the checkpoint engine, with bucket and pipeline support.
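
The "bucket" part of the weight synchronization item can be pictured as grouping parameters into size-bounded batches, so that each transfer moves one larger message rather than many small ones. The sketch below is a generic illustration, not the checkpoint engine's code: bucketize, the parameter names, and the byte sizes are all hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def bucketize(named_sizes: Iterable[Tuple[str, int]], bucket_bytes: int) -> Iterator[List[str]]:
    """Greedily group parameter names into buckets of at most bucket_bytes.

    A single parameter larger than bucket_bytes still gets a bucket of its own.
    """
    bucket: List[str] = []
    used = 0
    for name, size in named_sizes:
        if bucket and used + size > bucket_bytes:
            yield bucket
            bucket, used = [], 0
        bucket.append(name)
        used += size
    if bucket:
        yield bucket


# Hypothetical parameter sizes in bytes.
params = [("embed", 400), ("layer0", 300), ("layer1", 300), ("head", 500)]
buckets = list(bucketize(params, bucket_bytes=700))
```

Because buckets are yielded one at a time, a sender can start transferring the first bucket while later ones are still being assembled, which is where pipelining fits in.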

What's Changed

[data] fix: MultiturnSFTDataset handle messages with list args in tool call by @gongyisheng in https://github.com/volcengine/verl/pull/4125
[ci, doc] feat: Update Ascend Dockerfile and docker build workflow to 8.3.RC1 version by @FightingZhen in https://github.com/volcengine/verl/pull/4123
[data] fix: fix global_seqlen metric by @conver334 in https://github.com/volcengine/verl/pull/4129
[ci] fix: Optimize ascend docker build workflow and dockerfile to solve OOM problem by @FightingZhen in https://github.com/volcengine/verl/pull/4137
[ci] fix: fix error limiting MindSpeed cloning depth to one by @FightingZhen in https://github.com/volcengine/verl/pull/4140
[ci] feat: specify torch and torch_npu version into ascend dockerfile by @FightingZhen in https://github.com/volcengine/verl/pull/4141
[ci] fix: move torch and torch_npu install order in ascend dockerfile to ensure installed version correct by @FightingZhen in https://github.com/volcengine/verl/pull/4142
[ci] fix: Correct version relationship between torch and torchvision in ascend dockerfile by @FightingZhen in https://github.com/volcengine/verl/pull/4143
[doc] chore: Add one_step_off_policy support doc of Ascend NPU by @baymax591 in https://github.com/volcengine/verl/pull/4151
[rollout] fix: resource pool name in standalone mode by @PeterSH6 in https://github.com/volcengine/verl/pull/4149
[ci] feat: Update e2e_ascend CI image to 8.3.RC1 version, remove weekly validation workflow by @FightingZhen in https://github.com/volcengine/verl/pull/4146
[doc] chore: add pytorch conference materials by @hongpeng-guo in https://github.com/volcengine/verl/pull/4161
[rollout] fixup load_format=dummy update_weights not do process_weight… by @Annarine in https://github.com/volcengine/verl/pull/4130
[vllm] fix: Change parameter validation to align with vllm validation by @HelloWorldBeginner in https://github.com/volcengine/verl/pull/4153
[trainer] fix: reproducible problem when resume training by @wlhgtc in https://github.com/volcengine/verl/pull/4156
[recipe, tool] feat: support multi-turn and tool call for recipe/fully_async_policy by @sl-1314 in https://github.com/volcengine/verl/pull/4067
[cfg] fix: add rollout_correcton config field with omegaconf.open_dict by @tongyx361 in https://github.com/volcengine/verl/pull/4167
[doc] fix: Misc doc fixes by @kerrickstaley in https://github.com/volcengine/verl/pull/4171
[recipe] feat: add qwen3 8b grpo one_step_off_policy script on ASCEND NPU by @baymax591 in https://github.com/volcengine/verl/pull/4163
[BREAKING][rollout] feat: change rollout to server mode by default by @wuxibin89 in https://github.com/volcengine/verl/pull/4106
[algo] feat: Add RateLimitedRewardLoopManager with three-layer rate limiting for API-based rewards by @JoyboyBrian in https://github.com/volcengine/verl/pull/4107

New Contributors

@gongyisheng made their first contribution in https://github.com/volcengine/verl/pull/4125
@Annarine made their first contribution in https://github.com/volcengine/verl/pull/4130
@HelloWorldBeginner made their first contribution in https://github.com/volcengine/verl/pull/4153
@wlhgtc made their first contribution in https://github.com/volcengine/verl/pull/4156
@sl-1314 made their first contribution in https://github.com/volcengine/verl/pull/4067
@kerrickstaley made their first contribution in https://github.com/volcengine/verl/pull/4171
@JoyboyBrian made their first contribution in https://github.com/volcengine/verl/pull/4107
@shevateng0 made their first contribution in https://github.com/volcengine/verl/pull/4139
@ashvinnihalani made their first contribution in https://github.com/volcengine/verl/pull/4091
@johnjunjun7 made their first contribution in https://github.com/volcengine/verl/pull/3427
@zjchenn made their first contribution in https://github.com/volcengine/verl/pull/4184
@HzZHoO made their first contribution in https://github.com/volcengine/verl/pull/4183
@EricMarcus-ai made their first contribution in https://github.com/volcengine/verl/pull/4185
@Shiguang-Guo made their first contribution in https://github.com/volcengine/verl/pull/4187
@Agoniii made their first contribution in https://github.com/volcengine/verl/pull/3519
@jQizhang made their first contribution in https://github.com/volcengine/verl/pull/4222
@JobQiu made their first contribution in https://github.com/volcengine/verl/pull/4248
@momo609 made their first contribution in https://github.com/volcengine/verl/pull/4166
@Kite0011 made their first contribution in https://github.com/volcengine/verl/pull/4250
@LLLLxmmm made their first contribution in https://github.com/volcengine/verl/pull/4175
@jprellberg made their first contribution in https://github.com/volcengine/verl/pull/4196
@chengminhua made their first contribution in https://github.com/volcengine/verl/pull/4209
@Leem-Li made their first contribution in https://github.com/volcengine/verl/pull/4253
@nuerxiati made their first contribution in https://github.com/volcengine/verl/pull/4165
@litianjian made their first contribution in https://github.com/volcengine/verl/pull/4101
@appletea233 made their first contribution in https://github.com/volcengine/verl/pull/4410
@jsfanfanfan made their first contribution in https://github.com/volcengine/verl/pull/4408
@icerain-alt made their first contribution in https://github.com/volcengine/verl/pull/4406
@Lokiscripter made their first contribution in https://github.com/volcengine/verl/pull/4398

Full Changelog: https://github.com/volcengine/verl/compare/v0.6.1...v0.7.0

