Release candidate 5.0.0rc1
What's Changed
This release candidate focuses mostly on quantization support with the new dynamic weight loader, plus a few notable 🚨 breaking changes 🚨:
- The default dtype for any model loaded with `from_pretrained` is now `auto`!
- Default auto 🚨 🚨 by @ArthurZucker in https://github.com/huggingface/transformers/pull/42805
- Default shard size when saving a model is now 50GB:
- 🚨🚨 [saving] Default to 50GB shards, and remove non-safe serialization by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42734. Saving is now as fast as before thanks to xet, and larger shards are simply more convenient on the Hub.
- Kwargs: they are fundamental to enabling integration with vLLM and other tools:
- Every model forward() should have **kwargs by @Rocketknight1 in https://github.com/huggingface/transformers/pull/42603
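The `**kwargs` requirement above can be sketched with a minimal, framework-free class (an illustration of the calling pattern, not the actual transformers implementation; `cache_position` and `logits_to_keep` below are just example extra arguments an integration might pass):

```python
class TinyModel:
    """Toy stand-in for a model whose forward() accepts **kwargs, so
    callers (e.g. vLLM-style integrations) can pass extra arguments
    without breaking the call."""

    def forward(self, input_ids, attention_mask=None, **kwargs):
        # Integration-specific kwargs are tolerated; here we just record them.
        unused = sorted(kwargs)
        return {"logits": [float(t) for t in input_ids], "ignored_kwargs": unused}


model = TinyModel()
out = model.forward([1, 2, 3], cache_position=None, logits_to_keep=1)
```

Because the signature ends in `**kwargs`, adding new integration arguments never raises a `TypeError` on older models.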
Dynamic weight loader updates:
Mostly quality-of-life improvements and fixes, plus restored support for CPU offloading.
- mark params as _is_hf_initialized with DS Zero3 from weight conversion by @winglian in https://github.com/huggingface/transformers/pull/42626
- [loading] Allow loading to happen without threading by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42619
- [loading] Correctly load params during offloading & careful memory considerations by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42632
- allow registration of custom checkpoint conversion mappings by @winglian in https://github.com/huggingface/transformers/pull/42634
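To illustrate what a checkpoint conversion mapping does (a hypothetical sketch of the idea, not the registration API added in the PR above): keys in a source checkpoint are renamed to the target naming scheme before loading.

```python
import re

# Hypothetical mapping: regex pattern -> replacement, applied to each checkpoint key.
CONVERSION_MAPPING = {
    r"^transformer\.": "model.",
    r"\.attn\.": ".self_attn.",
}

def convert_state_dict_keys(state_dict, mapping=CONVERSION_MAPPING):
    """Rename checkpoint keys according to the mapping before loading them."""
    converted = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, replacement in mapping.items():
            new_key = re.sub(pattern, replacement, new_key)
        converted[new_key] = value
    return converted

ckpt = {"transformer.h.0.attn.q_proj.weight": "W"}
converted = convert_state_dict_keys(ckpt)
```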
New models:
- Add FastVLM by @camilla-deckard in https://github.com/huggingface/transformers/pull/41112
- Lasr model by @eustlb in https://github.com/huggingface/transformers/pull/42648
- [Model] Add PaddleOCR-VL Model Support by @zhang-prog in https://github.com/huggingface/transformers/pull/42178
Some notable quantization fixes:
Mostly fixes for fbgemm, quanto, eetq, and FP8:
- Fix fp8 + some enhancement by @SunMarc in https://github.com/huggingface/transformers/pull/42455
- Fix eetq quanto quant methods by @SunMarc in https://github.com/huggingface/transformers/pull/42557
- [Quantization] per tensor quantization kernel by @MekkCyber in https://github.com/huggingface/transformers/pull/42560
- [Quantization] fix fbgemm by @MekkCyber in https://github.com/huggingface/transformers/pull/42561
- [Quantization] Fix FP8 experts replacing by @MekkCyber in https://github.com/huggingface/transformers/pull/42654
- [Quantization] Fix Static FP8 Quantization by @MekkCyber in https://github.com/huggingface/transformers/pull/42775
- [core] fix fp-quant by @MekkCyber in https://github.com/huggingface/transformers/pull/42613
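As background on the per-tensor kernel work above: per-tensor quantization uses a single scale for the whole tensor. A minimal pure-Python sketch of the idea (not the fused kernel from the PR):

```python
def quantize_per_tensor(values, qmax=127):
    """Symmetric per-tensor quantization: one scale shared by every element."""
    scale = max(abs(v) for v in values) / qmax
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the shared scale."""
    return [q * scale for q in quantized]

q, s = quantize_per_tensor([0.5, -1.0, 0.25])
restored = dequantize(q, s)
```

Per-channel or per-group schemes differ only in computing one scale per slice instead of one for the whole tensor.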
Peft:
The dynamic weight loader broke a few small things; these PRs add the glue for all models except MoEs.
- FIX Error when trying to load non-LoRA PEFT by @BenjaminBossan in https://github.com/huggingface/transformers/pull/42663
- Fix PEFT integration with new weight loader by @Cyrilvallez in https://github.com/huggingface/transformers/pull/42701
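For context on the PEFT fixes above: a LoRA adapter adds a low-rank update to a frozen weight matrix, W' = W + alpha * (B @ A). A pure-Python illustration of the math (independent of the actual PEFT API; the matrices below are made-up toy values):

```python
def matmul(A, B):
    """Plain nested-list matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply_lora(W, A, B, alpha=1.0):
    """Merge a LoRA update into a frozen weight: W' = W + alpha * (B @ A)."""
    delta = matmul(B, A)
    return [[w + alpha * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0, 2.0]]               # rank-1 down-projection (1x2)
B = [[0.5], [0.0]]             # rank-1 up-projection (2x1)
merged = apply_lora(W, A, B)
```

Only A and B are trained, which is why the weight loader needs special glue to place them next to the frozen base weights.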
Misc
Tokenization needed more refactoring; this time it's a lot cleaner!
- Refactor-tokenization-more by @ArthurZucker in https://github.com/huggingface/transformers/pull/42563
- Only default `rope_parameters` to an empty `dict` if there is something to put in it by @hmellor in https://github.com/huggingface/transformers/pull/42651
Many other commits were omitted for clarity; thanks to everyone, including the new contributors!
New Contributors
- @camilla-deckard made their first contribution in https://github.com/huggingface/transformers/pull/41112
- @Aaraviitkgp made their first contribution in https://github.com/huggingface/transformers/pull/42466
- @ngazagna-qc made their first contribution in https://github.com/huggingface/transformers/pull/40691
- @arrdel made their first contribution in https://github.com/huggingface/transformers/pull/42577
- @marconaguib made their first contribution in https://github.com/huggingface/transformers/pull/42587
- @Xiao-Chenguang made their first contribution in https://github.com/huggingface/transformers/pull/42436
- @Furkan-rgb made their first contribution in https://github.com/huggingface/transformers/pull/42465
- @mertunsall made their first contribution in https://github.com/huggingface/transformers/pull/42615
- @anranlee99 made their first contribution in https://github.com/huggingface/transformers/pull/42438
- @UserChen666 made their first contribution in https://github.com/huggingface/transformers/pull/42335
- @efazal made their first contribution in https://github.com/huggingface/transformers/pull/41723
- @Harrisonyong made their first contribution in https://github.com/huggingface/transformers/pull/36416
- @hawon223 made their first contribution in https://github.com/huggingface/transformers/pull/42384
- @Bissmella made their first contribution in https://github.com/huggingface/transformers/pull/42647
- @AgainstEntropy made their first contribution in https://github.com/huggingface/transformers/pull/42689
- @dongluw made their first contribution in https://github.com/huggingface/transformers/pull/42642
- @hqkqn32 made their first contribution in https://github.com/huggingface/transformers/pull/42620
- @zhang-prog made their first contribution in https://github.com/huggingface/transformers/pull/42178
Full Changelog: https://github.com/huggingface/transformers/compare/v5.0.0rc0...v5.0.0rc1