Release candidate 5.0.0rc1
What's Changed
This release candidate focuses mostly on quantization support with the new dynamic weight loader, plus a few notable 🚨 breaking changes 🚨:
- The default dtype for any model loaded with `from_pretrained` is now `auto`!
- Default auto 🚨 🚨 by @ArthurZucker in https://github.com/huggingface/transformers/pull/42805
- Default shard size when saving a model is now 50GB:
- 🚨🚨 [saving] Default to 50GB shards, and remove non-safe serialization by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42734. Saving is now as fast as before thanks to xet, and larger shards are simply more convenient on the Hub.
- Kwargs: they are fundamental to enabling integration with vLLM and other tools:
- Every model forward() should have **kwargs by @Rocketknight1 in https://github.com/huggingface/transformers/pull/42603
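The `**kwargs` requirement above can be sketched with a minimal, framework-free class (an illustration of the calling pattern, not the actual transformers implementation; `cache_position` and `logits_to_keep` below are just example extra arguments an integration might pass):

```python
class TinyModel:
    """Toy stand-in for a model whose forward() accepts **kwargs, so
    callers (e.g. vLLM-style integrations) can pass extra arguments
    without breaking the call."""

    def forward(self, input_ids, attention_mask=None, **kwargs):
        # Integration-specific kwargs are tolerated; here we just record them.
        unused = sorted(kwargs)
        return {"logits": [float(t) for t in input_ids], "ignored_kwargs": unused}


model = TinyModel()
out = model.forward([1, 2, 3], cache_position=None, logits_to_keep=1)
```

Because the signature ends in `**kwargs`, adding new integration arguments never raises a `TypeError` on older models.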
Dynamic weight loader updates:
Mostly quality-of-life improvements and fixes, plus restored support for CPU offloading.
- mark params as _is_hf_initialized with DS Zero3 from weight conversion by @winglian in https://github.com/huggingface/transformers/pull/42626
- [loading] Allow loading to happen without threading by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42619
- [loading] Correctly load params during offloading & careful memory considerations by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42632
- allow registration of custom checkpoint conversion mappings by @winglian in https://github.com/huggingface/transformers/pull/42634
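To illustrate what a checkpoint conversion mapping does (a hypothetical sketch of the idea, not the registration API added in the PR above): keys in a source checkpoint are renamed to the target naming scheme before loading.

```python
import re

# Hypothetical mapping: regex pattern -> replacement, applied to each checkpoint key.
CONVERSION_MAPPING = {
    r"^transformer\.": "model.",
    r"\.attn\.": ".self_attn.",
}

def convert_state_dict_keys(state_dict, mapping=CONVERSION_MAPPING):
    """Rename checkpoint keys according to the mapping before loading them."""
    converted = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, replacement in mapping.items():
            new_key = re.sub(pattern, replacement, new_key)
        converted[new_key] = value
    return converted

ckpt = {"transformer.h.0.attn.q_proj.weight": "W"}
converted = convert_state_dict_keys(ckpt)
```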
New models:
- Add FastVLM by @camilla-deckard in https://github.com/huggingface/transformers/pull/41112
- Lasr model by @eustlb in https://github.com/huggingface/transformers/pull/42648
- [Model] Add PaddleOCR-VL Model Support by @zhang-prog in https://github.com/huggingface/transformers/pull/42178
Some notable quantization fixes:
Mostly fixes for fbgemm, quanto, eetq, and FP8:
- Fix fp8 + some enhancement by @SunMarc in https://github.com/huggingface/transformers/pull/42455
- Fix eetq quanto quant methods by @SunMarc in https://github.com/huggingface/transformers/pull/42557
- [Quantization] per tensor quantization kernel by @MekkCyber in https://github.com/huggingface/transformers/pull/42560
- [Quantization] fix fbgemm by @MekkCyber in https://github.com/huggingface/transformers/pull/42561
- [Quantization] Fix FP8 experts replacing by @MekkCyber in https://github.com/huggingface/transformers/pull/42654
- [Quantization] Fix Static FP8 Quantization by @MekkCyber in https://github.com/huggingface/transformers/pull/42775
- [core] fix fp-quant by @MekkCyber in https://github.com/huggingface/transformers/pull/42613
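As background on the per-tensor kernel work above: per-tensor quantization uses a single scale for the whole tensor. A minimal pure-Python sketch of the idea (not the fused kernel from the PR):

```python
def quantize_per_tensor(values, qmax=127):
    """Symmetric per-tensor quantization: one scale shared by every element."""
    scale = max(abs(v) for v in values) / qmax
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the shared scale."""
    return [q * scale for q in quantized]

q, s = quantize_per_tensor([0.5, -1.0, 0.25])
restored = dequantize(q, s)
```

Per-channel or per-group schemes differ only in computing one scale per slice instead of one for the whole tensor.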
Peft:
The dynamic weight loader broke a few small things; these PRs add the glue for all models except MoEs.
- FIX Error when trying to load non-LoRA PEFT by @BenjaminBossan in https://github.com/huggingface/transformers/pull/42663
- Fix PEFT integration with new weight loader by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42701
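For context on the PEFT fixes above: a LoRA adapter adds a low-rank update to a frozen weight matrix, W' = W + alpha * (B @ A). A pure-Python illustration of the math (independent of the actual PEFT API; the matrices below are made-up toy values):

```python
def matmul(A, B):
    """Plain nested-list matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply_lora(W, A, B, alpha=1.0):
    """Merge a LoRA update into a frozen weight: W' = W + alpha * (B @ A)."""
    delta = matmul(B, A)
    return [[w + alpha * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0, 2.0]]               # rank-1 down-projection (1x2)
B = [[0.5], [0.0]]             # rank-1 up-projection (2x1)
merged = apply_lora(W, A, B)
```

Only A and B are trained, which is why the weight loader needs special glue to place them next to the frozen base weights.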
Misc
Tokenization needed more refactoring; this time it's a lot cleaner!
- Refactor-tokenization-more by @ArthurZucker in https://github.com/huggingface/transformers/pull/42563
- Only default `rope_parameters` to an empty `dict` if there is something to put in it by @hmellor in https://github.com/huggingface/transformers/pull/42651
Many other commits were omitted for clarity; thanks to everyone, including the new contributors!
New Contributors
- @camilla-deckard made their first contribution in https://github.com/huggingface/transformers/pull/41112
- @Aaraviitkgp made their first contribution in https://github.com/huggingface/transformers/pull/42466
- @ngazagna-qc made their first contribution in https://github.com/huggingface/transformers/pull/40691
- @arrdel made their first contribution in https://github.com/huggingface/transformers/pull/42577
- @marconaguib made their first contribution in https://github.com/huggingface/transformers/pull/42587
- @Xiao-Chenguang made their first contribution in https://github.com/huggingface/transformers/pull/42436
- @Furkan-rgb made their first contribution in https://github.com/huggingface/transformers/pull/42465
- @mertunsall made their first contribution in https://github.com/huggingface/transformers/pull/42615
- @anranlee99 made their first contribution in https://github.com/huggingface/transformers/pull/42438
- @UserChen666 made their first contribution in https://github.com/huggingface/transformers/pull/42335
- @efazal made their first contribution in https://github.com/huggingface/transformers/pull/41723
- @Harrisonyong made their first contribution in https://github.com/huggingface/transformers/pull/36416
- @hawon223 made their first contribution in https://github.com/huggingface/transformers/pull/42384
- @Bissmella made their first contribution in https://github.com/huggingface/transformers/pull/42647
- @AgainstEntropy made their first contribution in https://github.com/huggingface/transformers/pull/42689
- @dongluw made their first contribution in https://github.com/huggingface/transformers/pull/42642
- @hqkqn32 made their first contribution in https://github.com/huggingface/transformers/pull/42620
- @zhang-prog made their first contribution in https://github.com/huggingface/transformers/pull/42178
Full Changelog: https://github.com/huggingface/transformers/compare/v5.0.0rc0...v5.0.0rc1