Changelog

verl

verl: Volcano Engine Reinforcement Learning for LLMs

verl-project/verl · Python · Apache-2.0

v0.4.1 patch release: checkpoint fixes for MoE EP & LoRA, OpenAI/MCP tool calling schema, and SGLang memory optimizations

Key changes

checkpoint_config

checkpoint_contents

What's Changed

[feat] Wandb Timing: Add more detailed timing of gen_sequence and weights resharding by @ETOgaosion in https://github.com/volcengine/verl/pull/1834
[rollout] feat: follow OpenAI tool calling schema in chat scheduler by @wuxibin89 in https://github.com/volcengine/verl/pull/1831
[release] chore: bump version to v0.4 by @eric-haibin-lin in https://github.com/volcengine/verl/pull/1897
Dockerfile.rocm update tensordict==0.6.2 by @vickytsang in https://github.com/volcengine/verl/pull/1898
[feat] add validation shuffle by @mlpod in https://github.com/volcengine/verl/pull/1886
[feat][BREAKING] Megatron: Support learning rate scheduler by @ETOgaosion in https://github.com/volcengine/verl/pull/1701
fix errors in megatron_workers.py by @davidjsonn in https://github.com/volcengine/verl/pull/1906
[tests] chore: add PR title check by @eric-haibin-lin in https://github.com/volcengine/verl/pull/1901
fix qwen2vl grpo for vllm 0.9 and transformers 4.52 by @hiyouga in https://github.com/volcengine/verl/pull/1880
[rollout] fix: error in __collect_lora_params() in FSDPVLLMShardingManager by @rocke2020 in https://github.com/volcengine/verl/pull/1909
[recipe] feat: char count by @vermouth1992 in https://github.com/volcengine/verl/pull/1908
fix typos by @davidjsonn in https://github.com/volcengine/verl/pull/1912
[trainer] refactor: refactor reward manager, advantage estimator by @eric-haibin-lin in https://github.com/volcengine/verl/pull/1916
set CUDA and HIP VISIBLE DEVICES by @YangWang92 in https://github.com/volcengine/verl/pull/1914
[ppo] feat: add critic valuehead model support for multi-modal PPO by @Yangruipis in https://github.com/volcengine/verl/pull/1839
[bugfix] fix megatron model merger by @ShareLer in https://github.com/volcengine/verl/pull/1774
revert HIP_VISIBLE_DEVICES in worker.py by @YangWang92 in https://github.com/volcengine/verl/pull/1920
[worker] fix: do not break dynamic bsz in dp critic by @hiyouga in https://github.com/volcengine/verl/pull/1922
[sglang] feat: Efficient and model-agnostic multi-turn messages tokenization and masking by @jybsuper in https://github.com/volcengine/verl/pull/1668
[rollout] fix: fix async llm config passing by @eric-haibin-lin in https://github.com/volcengine/verl/pull/1933
[sglang] fix: Fix tool call parser not found error for SGLang==0.4.6.post5 by @jybsuper in https://github.com/volcengine/verl/pull/1852
fix sequence parallelism conflict in kimiVL by @ShareLer in https://github.com/volcengine/verl/pull/1899
[megatron] refactor: support MLATransformerConfig abstraction for DeepSeek V3 by @jinqinn in https://github.com/volcengine/verl/pull/1836
[rollout] feat: add async llm perf script by @wuxibin89 in https://github.com/volcengine/verl/pull/1930
[megatron] feat: qwen2.5vl by @ISEEKYAN in https://github.com/volcengine/verl/pull/1286
[ckpt] feat: model_merger.py support processing checkpoints with LoRA adapters by @thelongestusernameofall in https://github.com/volcengine/verl/pull/1821
[hardware] fix: fix issue when sp>1 on ASCEND NPU by @as12138 in https://github.com/volcengine/verl/pull/1942
[megatron] fix: rope_type typo in config_converter.py by @donpromax in https://github.com/volcengine/verl/pull/1944
[training_utils] Add qwen3 multi-turn sft support by @SwordFaith in https://github.com/volcengine/verl/pull/1889
[fsdp] fix: fsdp entropy metrics by @ETOgaosion in https://github.com/volcengine/verl/pull/1943
[FSDP] feat: Add FSDP forward prefetch and recompute chunking entropy by @CurryRice233 in https://github.com/volcengine/verl/pull/1927
[rollout] fix: set repetition_penalty=1.0 to AsyncLLM by @wuxibin89 in https://github.com/volcengine/verl/pull/1949
[fsdp] feat: Memory efficient cross entropy with a linear layer fused by @Jianbing-D in https://github.com/volcengine/verl/pull/462
[recipe] feat: qwen2.5vl 7b report and guide by @ISEEKYAN in https://github.com/volcengine/verl/pull/1969
[ckpt] refactor: enhance FSDP checkpoint manager flexibility by @0x404 in https://github.com/volcengine/verl/pull/1350
[env] fix: npu ray version to 2.46.0 for CI problem by @wyz649296016 in https://github.com/volcengine/verl/pull/1987
Fix TypeError by Removing Duplicate Arguments in run_deepseek671b_math_megatron.sh by @none0663 in https://github.com/volcengine/verl/pull/1996
[megatron] feat: Config NCCL Timeout for Megatron Backend Model Loading by @none0663 in https://github.com/volcengine/verl/pull/1983
[tests] chore: ppo workflow runs on volcengine machine learning platform by @htc070011 in https://github.com/volcengine/verl/pull/1979
[megatron] fix: multiple key error when trying to override megatron tr… by @donpromax in https://github.com/volcengine/verl/pull/1990
[megatron] feat: robust and efficient mcore converter with meta device init and numel check for dpsk by @Yangruipis in https://github.com/volcengine/verl/pull/1995
Stabilize loss calculations by clamping KL divergence values by @syo093c in https://github.com/volcengine/verl/pull/1779
[ckpt] fix: run converter_hf_to_mcore with --test will raise an AttributeError by @lxg2015 in https://github.com/volcengine/verl/pull/2010
[algo] fix: vf_loss factor by @tongyx361 in https://github.com/volcengine/verl/pull/2016
[data] fix: fix retool sft data source by @vermouth1992 in https://github.com/volcengine/verl/pull/2018
[fsdp] fix: position_ids in qwen-vl by @ShareLer in https://github.com/volcengine/verl/pull/1947
[hardware] refactor: refactor part of device management by @FightingZhen in https://github.com/volcengine/verl/pull/1974
[trainer] fix: fix sft max_position_embeddings by @vermouth1992 in https://github.com/volcengine/verl/pull/2019
[misc] fix: fix format by @vermouth1992 in https://github.com/volcengine/verl/pull/2023
[megatron] fix: dpskv3 convert src and dst mixed up bug by @Yangruipis in https://github.com/volcengine/verl/pull/2029
fix: TensorDict usage error by @zhihe-wang in https://github.com/volcengine/verl/pull/2046
[hardware] feat: support qwen2_5_vl on ASCEND NPU by @as12138 in https://github.com/volcengine/verl/pull/1924
[trainer] chore: Reducing the number of calls to the write by @RuixiangMa in https://github.com/volcengine/verl/pull/2043
[Bug] fix None check in DataProto print_size() by @GHGmc2 in https://github.com/volcengine/verl/pull/2067
[perf] feat: Add verl profiling support from Nvidia Nsight System by @davidmlw in https://github.com/volcengine/verl/pull/1820
[data] fix: multimodal overlong prompt length filtering by @dirtyDan0 in https://github.com/volcengine/verl/pull/2063
[sglang] fix: AsyncSglangServer use async wake_up/sleep by @feifeibear in https://github.com/volcengine/verl/pull/2062
[training_utils] feat: Add project and experiment name to tensorboard log path by @Geaming2002 in https://github.com/volcengine/verl/pull/2080
[trainer] fix: Fix trainer config for val_only by @hscspring in https://github.com/volcengine/verl/pull/20842083
[megatron] fix: fix qwen2_vl on plain-text data and mix data of plain-text and image-text by @MaoChouHJM in https://github.com/volcengine/verl/pull/1999
[vllm] fix: mv disable_mm_preprocessor_cache to vllm engine_kwargs by @yyDing1 in https://github.com/volcengine/verl/pull/2068
[misc] feat: update instruction for running dapo on qwen2.5 7b math and add reference wandb by @vermouth1992 in https://github.com/volcengine/verl/pull/2094
[rollout] refactor: Add option for rollout_log_probs, and default as False by @GHGmc2 in https://github.com/volcengine/verl/pull/2072
[tool] feat: Add Search Tool implemented with MCP by @AlecHenx in https://github.com/volcengine/verl/pull/1948
[trainer] fix: make reward_extra_info optional in reward_result by @HollowMan6 in https://github.com/volcengine/verl/pull/2109
[algo] feat: integrate Clip-Cov and KL-Cov methods by @Raf-Chen in https://github.com/volcengine/verl/pull/1830
[rollout] fix: error in sglang async mode by @chenhaiq in https://github.com/volcengine/verl/pull/2098
[rollout] fix: fix rollout key not found by @ETOgaosion in https://github.com/volcengine/verl/pull/2116
[recipe] feat: Move entropy reward to the entropy recipe by @Raf-Chen in https://github.com/volcengine/verl/pull/2118
[cfg, perf] refactor: add omega_conf_to_dataclass API, rename WorkerProfiler to DistProfiler, add unit test based on ProfilerConfig by @eric-haibin-lin in https://github.com/volcengine/verl/pull/2117
[worker] feat: add support for dynamic batch size of multimodal data by @wang-zerui in https://github.com/volcengine/verl/pull/2049
[fsdp] refactor: set actor's strategy as default for critic and ref by @0x404 in https://github.com/volcengine/verl/pull/2130
[ray] feat: add a test to demonstrate how to perform p2p communication inside wor… by @vermouth1992 in https://github.com/volcengine/verl/pull/2131
[sglang] feat: Support async multi-turn rollout with simulation feedback in sglang by @kinza99 in https://github.com/volcengine/verl/pull/1630
[tool] feat: Add memory limit configuration for sandbox fusion by @plutoZZZZ in https://github.com/volcengine/verl/pull/2105
[sglang] feat: add multimodal input to multiturn async rollout by @nanjiangwill in https://github.com/volcengine/verl/pull/2014
[fsdp] feat: support fsdp2 save hugging face model by @0x404 in https://github.com/volcengine/verl/pull/2138
[rollout] fix: vllm_rollout_spmd.py when return_raw_chat=True by @zyfzjsc988 in https://github.com/volcengine/verl/pull/2156
[rollout] feat: Support Multi-stage Awake for SGLang by @hebiao064 in https://github.com/volcengine/verl/pull/1911
[worker] feat: allow dist shared file-system initialization by @Cccei000 in https://github.com/volcengine/verl/pull/2154
[model] feat: Add MiniCPM-o 2.6 support by @RanchiZhao in https://github.com/volcengine/verl/pull/1833
[model] fix: Revert "[model] feat: Add MiniCPM-o 2.6 support" by @hiyouga in https://github.com/volcengine/verl/pull/2176
[misc] fix: fix timer importance error in split_placement by @FightingZhen in https://github.com/volcengine/verl/pull/2169
[megatron,vllm] fix: megatron vllm async rollout server by @Yangruipis in https://github.com/volcengine/verl/pull/2122
[model] feat: Add MiniCPM-o 2.6 support by @hiyouga in https://github.com/volcengine/verl/pull/2178
[megatron] feat: Support of dist checkpoint by @ETOgaosion in https://github.com/volcengine/verl/pull/2125
[data] fix: fix the type of parquet_files in SFTDataset by @xuuHuang in https://github.com/volcengine/verl/pull/2203
[trainer] fix: add missing qwen2_moe flops counter by @ETOgaosion in https://github.com/volcengine/verl/pull/2190
[trainer] fix: Add __init__.py to verl.trainer.config by @ultmaster in https://github.com/volcengine/verl/pull/2214
[model] fix: make vlm patch forward compatible by @hiyouga in https://github.com/volcengine/verl/pull/2215
[recipe] fix: parameter order in RayPRIMETrainer super().__init__() call by @xxnpark in https://github.com/volcengine/verl/pull/2172
[misc] feat: support ValidationGenerationsLogger in vemlp_wandb by @chenhaiq in https://github.com/volcengine/verl/pull/2191
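
Several entries above mention the OpenAI tool calling schema (for example pull/1831). For readers unfamiliar with that format, here is a minimal sketch of what a tool definition and a tool call look like in the public OpenAI function-calling layout. It is not taken from verl: `SEARCH_TOOL`, `parse_tool_call`, and `missing_required` are hypothetical names used only for illustration, and the way verl's chat scheduler consumes these structures may differ.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling layout:
# a "function" entry whose parameters are described with JSON Schema.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Look up documents matching a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}


def parse_tool_call(tool_call: dict) -> tuple[str, dict]:
    """Return (tool name, decoded arguments) from an OpenAI-style tool call."""
    if tool_call.get("type") != "function":
        raise ValueError("only function tool calls are handled in this sketch")
    function = tool_call["function"]
    # In this schema the arguments arrive as a JSON-encoded string, not a dict.
    return function["name"], json.loads(function["arguments"])


def missing_required(tool: dict, arguments: dict) -> list[str]:
    """List required parameter names that are absent from the arguments."""
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


# Example tool call as it would appear in an assistant message.
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "search", "arguments": '{"query": "verl v0.4.1"}'},
}
name, args = parse_tool_call(example_call)
```

A caller would typically check `missing_required(SEARCH_TOOL, args)` before dispatching the call, so that malformed arguments from the model are rejected early rather than inside the tool itself.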