v0.18.1 Patch Release

What's Changed

Add ZenFlow code for Stage 3 by @JoshWoo2003 in https://github.com/deepspeedai/DeepSpeed/pull/7516
[XPU][CI] recover xpu-max1100 workflow by @Liangliang-Ma in https://github.com/deepspeedai/DeepSpeed/pull/7630
Take **kwargs in init of DeepSpeedZeroOptimizer subclasses by @eternalNight in https://github.com/deepspeedai/DeepSpeed/pull/7634
add support for tensor learning rate (vs scalar) by @NirSonnenschein in https://github.com/deepspeedai/DeepSpeed/pull/7633
Fix illegal memory access with multi_tensor_apply size above INT_MAX by @wangyan-mms in https://github.com/deepspeedai/DeepSpeed/pull/7639
No Muon optimizer for embedding and lm_head layer by @delock in https://github.com/deepspeedai/DeepSpeed/pull/7641
z2: report param name and not zero id in assert by @stas00 in https://github.com/deepspeedai/DeepSpeed/pull/7637
z2: don't pass dtype to report_ipg_memory_usage by @stas00 in https://github.com/deepspeedai/DeepSpeed/pull/7636
Ulysses HF Accelerate integration by @stas00 in https://github.com/deepspeedai/DeepSpeed/pull/7638
Add DataStates-LLM: Asynchronous Checkpointing Engine Support by @mauryaavinash95 in https://github.com/deepspeedai/DeepSpeed/pull/7166

New Contributors

@JoshWoo2003 made their first contribution in https://github.com/deepspeedai/DeepSpeed/pull/7516
@wangyan-mms made their first contribution in https://github.com/deepspeedai/DeepSpeed/pull/7639

Full Changelog: https://github.com/deepspeedai/DeepSpeed/compare/v0.18.0...v0.18.1
