v0.18.4 Patch Release

What's Changed

Update version by @sfc-gh-truwase in https://github.com/deepspeedai/DeepSpeed/pull/7719
Disable deterministic option in compile tests by @tohtana in https://github.com/deepspeedai/DeepSpeed/pull/7720
Fix SuperOffloadOptimizer_Stage3 crash due to missing param_names parameter by @ImaGoodFella in https://github.com/deepspeedai/DeepSpeed/pull/7715
[AMD][ROCm] Improve support of AMD by @k-artem in https://github.com/deepspeedai/DeepSpeed/pull/7448
fix typo by @stas00 in https://github.com/deepspeedai/DeepSpeed/pull/7722
Skip none in backward hook by @tohtana in https://github.com/deepspeedai/DeepSpeed/pull/7725
[Engine] Only scale gradients if scale_wrt_gas is True by @kashif in https://github.com/deepspeedai/DeepSpeed/pull/7724
Fix testcases that depends on triton by @k-artem in https://github.com/deepspeedai/DeepSpeed/pull/7731
Fix rare hang in DeepSpeed Async I/O wait by releasing the Python GIL by @xylian86 in https://github.com/deepspeedai/DeepSpeed/pull/7727
Fix #7733: Replace torch.sqrt with math.sqrt in scale_lr for sqrt method by @Rakshit-gen in https://github.com/deepspeedai/DeepSpeed/pull/7735
replace moe checkpoint dp_world_size with seq_dp_world_size by @wukong1992 in https://github.com/deepspeedai/DeepSpeed/pull/7732
[BUG] Fix UlyssesSPAttentionHF.register_with_transformers() crash with PEFT models by @Rakshit-gen in https://github.com/deepspeedai/DeepSpeed/pull/7737
Add core api update blog by @tohtana in https://github.com/deepspeedai/DeepSpeed/pull/7738
Fix Nebula checkpoint engine commit() API mismatch by @Rakshit-gen in https://github.com/deepspeedai/DeepSpeed/pull/7740
Fix DecoupledCheckpointEngine deadlock and improve reliability by @Rakshit-gen in https://github.com/deepspeedai/DeepSpeed/pull/7742
Fix OnebitLamb NaN propagation with empty parameters by @Rakshit-gen in https://github.com/deepspeedai/DeepSpeed/pull/7736
fix: remove premature MPI environment variable check in OpenMPIRunner by @leejianwoo-collab in https://github.com/deepspeedai/DeepSpeed/pull/7751
Enable python 3.11 and 3.12 tests by @loadams in https://github.com/deepspeedai/DeepSpeed/pull/7007
Add CI workflow to run tests on AWS by @tohtana in https://github.com/deepspeedai/DeepSpeed/pull/7753
Add fallback to BF16 support check by @tohtana in https://github.com/deepspeedai/DeepSpeed/pull/7754
Fix DeepCompile for PyTorch 2.8/2.9 compatibility by @tohtana in https://github.com/deepspeedai/DeepSpeed/pull/7755
Removed amp testcases by @k-artem in https://github.com/deepspeedai/DeepSpeed/pull/7745
fix: avoid IndexError in BF16_Optimizer.destroy() when using DummyOptim by @leejianwoo-collab in https://github.com/deepspeedai/DeepSpeed/pull/7763

New Contributors

@ImaGoodFella made their first contribution in https://github.com/deepspeedai/DeepSpeed/pull/7715
@k-artem made their first contribution in https://github.com/deepspeedai/DeepSpeed/pull/7448
@kashif made their first contribution in https://github.com/deepspeedai/DeepSpeed/pull/7724
@Rakshit-gen made their first contribution in https://github.com/deepspeedai/DeepSpeed/pull/7735
@leejianwoo-collab made their first contribution in https://github.com/deepspeedai/DeepSpeed/pull/7751

Full Changelog: https://github.com/deepspeedai/DeepSpeed/compare/v0.18.3...v0.18.4
