November Release + FP8 Training!
We’re getting close to our final release of 2025! Thanks so much for sticking with us this year. We’ve got lots of new features so please update Unsloth & our Docker to use the latest updates! 🦥
- Introducing FP8 Reinforcement Learning in Unsloth! Train on any FP8-supported GPU and get 1.4x faster training with 60% less VRAM: Read our Blog/Guide • Notebooks: Qwen3-8B FP8 GRPO and Llama-3.2-1B FP8 GRPO
- You may notice Unsloth now uses much less VRAM than before, enabling even longer context. Faster training is coming soon as well, and we'll share all the details in an upcoming blog.
- DeepSeek-OCR fine-tuning is here! We fine-tuned DeepSeek-OCR, improving its language understanding by 89%. Read our Blog • Free notebook
- Qwen3-VL models are now supported, including GGUFs you can run locally: Blogpost + fixes • GGUFs
- We analyzed RL training-inference mismatch for FP16 vs. BF16 and concluded that Unsloth does not have this issue: Analysis and Results
- We've partnered with Docker to let you run LLMs locally with zero setup. Docker GGUFs are now powered by Unsloth Dynamic. Example:

  ```
  docker model run hf.co/unsloth/gpt-oss-20b-GGUF:F16
  ```

  Read guide
- Baidu ERNIE models are now supported. Notebooks coming soon.
- Unsloth now supports SGLang. Read our guide
- We wrote guides for LoRA Hot Swapping and vLLM Engine Arguments
- Run Kimi-K2-Thinking, the most powerful open model, locally: Kimi-K2 Guide
- Lots of bug fixes! See further below.
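As a rough intuition for where the FP8 VRAM headline comes from, here is a back-of-envelope sketch (our own arithmetic, not Unsloth internals): storing weights at 1 byte per parameter (FP8) instead of 2 bytes (BF16) halves weight memory, before accounting for activations, optimizer state, or the KV cache.

```python
# Back-of-envelope weight-memory comparison for FP8 vs. BF16 storage.
# Illustrative only -- real VRAM usage also includes activations,
# gradients, optimizer state, and the KV cache.
def weight_gib(n_params: int, bytes_per_param: int) -> float:
    """Memory in GiB to store n_params weights at the given precision."""
    return n_params * bytes_per_param / 1024**3

n = 8_000_000_000  # an 8B-parameter model such as Qwen3-8B
bf16 = weight_gib(n, 2)  # BF16: 2 bytes/param
fp8 = weight_gib(n, 1)   # FP8:  1 byte/param
print(f"BF16: {bf16:.1f} GiB, FP8: {fp8:.1f} GiB, saved: {1 - fp8 / bf16:.0%}")
# prints "BF16: 14.9 GiB, FP8: 7.5 GiB, saved: 50%"
```

The quoted 60% figure is larger than this 50% weight-only saving because Unsloth's changes also reduce other parts of the memory footprint.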
[!TIP] Update Unsloth via

```
pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo
```

If you want PyTorch 2.9:

```
pip install --upgrade unsloth unsloth_zoo
```
Bug Fixes and Enhancements
- Supports `trl>=0.25.0`, `vllm>=0.11.2` and `transformers>=4.57.1`
- Fixed gpt-oss GRPO, RL excessive re-compilations on `torch>=2.9.0`
- Fixes Sleep mode and reduces memory usage by a further 5 to 15% for RL, GRPO
- Fix propagation of `trust_remote_code = True`
- Fix Unsloth offloaded gradient checkpointing not offloading on 1st step - reduces VRAM by >20%
- Add `logits.detach()` to GRPO to solve double backwards on some pathways
- Add `int64` kernels & fixed RoPE embeddings to allow super ultra long context training
- Fixed 📓 OpenEnv gpt-oss RL notebook
- DGX Spark docker image fixed
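To illustrate why int64 kernels matter for very long context (a standalone sketch with made-up shapes, not Unsloth's actual kernel code): a flat offset into a row-major `(seq_len, hidden_dim)` activation tensor grows as `row * hidden_dim + col`, and at long-context training sizes this product exceeds what a signed 32-bit index can hold.

```python
# Why 32-bit indexing breaks at long context: the flat element offset
# into a large 2-D tensor can exceed int32's maximum value.
INT32_MAX = 2**31 - 1  # 2_147_483_647

def flat_offset(row: int, col: int, hidden_dim: int) -> int:
    """Flat 1-D offset of element (row, col) in a row-major 2-D tensor."""
    return row * hidden_dim + col

seq_len, hidden_dim = 300_000, 8192  # hypothetical long-context shapes
last = flat_offset(seq_len - 1, hidden_dim - 1, hidden_dim)
print(last > INT32_MAX)  # prints "True": the offset overflows int32
```

With int32 arithmetic this offset would wrap around to a negative value, which is why the kernels switch to int64 indexing for these shapes.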
What's Changed
- Grpo gradient accumulation edits by @pluesclues in https://github.com/unslothai/unsloth/pull/3390
- Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3532
- Handle TRL version compatibility in rl_replacements.py by @pluesclues in https://github.com/unslothai/unsloth/pull/3540
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3546
- Sleep trl patch by @Datta0 in https://github.com/unslothai/unsloth/pull/3517
- Detach logits before returning from function by @pluesclues in https://github.com/unslothai/unsloth/pull/3554
- Fix typos in comment by @mk0walsk in https://github.com/unslothai/unsloth/pull/3557
- Formatting & bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3563
- DeepseekOCR: add trust_remote_code kwarg by @mmathew23 in https://github.com/unslothai/unsloth/pull/3564
- pre-commit CI config by @djsaunde in https://github.com/unslothai/unsloth/pull/3565
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3576
- Resize rope embeddings for long sequence training by @mmathew23 in https://github.com/unslothai/unsloth/pull/3586
- Patch in tiled mlp by @mmathew23 in https://github.com/unslothai/unsloth/pull/3584
- Support for out-of-source quantizers by @Giuseppe5 in https://github.com/unslothai/unsloth/pull/3534
- Fix: prevent rope_embedding AssertionError by checking kv_seq_len before reuse by @jarrycyx in https://github.com/unslothai/unsloth/pull/3578
- Extend TorchAOConfig to support mobile usecases by @metascroy in https://github.com/unslothai/unsloth/pull/3587
- fix qwen3 vl gradient accumulation by @mmathew23 in https://github.com/unslothai/unsloth/pull/3598
- Do not force set beta to 0 for DAPO by @Datta0 in https://github.com/unslothai/unsloth/pull/3604
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3606
- Fix broken links and typo in README by @mk0walsk in https://github.com/unslothai/unsloth/pull/3611
- remove pre-commit workflow (covered by pre-commit app) by @djsaunde in https://github.com/unslothai/unsloth/pull/3618
- Add an int64 path for mlp kernels by @mmathew23 in https://github.com/unslothai/unsloth/pull/3614
- Remove grpo requirement bs=num_generations by @mmathew23 in https://github.com/unslothai/unsloth/pull/3609
- Enable FP8 + RL training for bf16 models by @andrewor14 in https://github.com/unslothai/unsloth/pull/3440
- Fix/save torchao model loading logic by @rolandtannous in https://github.com/unslothai/unsloth/pull/3621
- Fix LlamaModel_fast_forward signature to match HF Transformers (Support inputs_embeds) by @MercuryYen in https://github.com/unslothai/unsloth/pull/3623
Unsloth Zoo Changes
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/327
- Fix GRPO by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/328
- fix gpt oss memory calculation for intel device by @leizhenyuan in https://github.com/unslothai/unsloth-zoo/pull/330
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/331
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/332
- fixed unbound local error tokenizer-model from cache by @rolandtannous in https://github.com/unslothai/unsloth-zoo/pull/333
- Now it works on a uv venv by @kittawere in https://github.com/unslothai/unsloth-zoo/pull/336
- Gemma3n fix by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/338
- [Intel] remove triton windows for intel by @leizhenyuan in https://github.com/unslothai/unsloth-zoo/pull/243
- FP8 training enhancements by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/337
- GRPO gradient accumulation steps update and DAPO support by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/308
- Fix/video collate by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/342
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/344
- FP8, Standby and vLLM updates by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/340
- Put importance sampling into no grad by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/343
- Detach hidden states to avoid gradient carry by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/345
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/347
- MoE: Cast routing_weights dtype correctly by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/349
- return local model in determine_base_model_source with any quantization by @noah1510 in https://github.com/unslothai/unsloth-zoo/pull/334
- Enable FP8 + RL training by @andrewor14 in https://github.com/unslothai/unsloth-zoo/pull/351
- Tiled MLP Implementation by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/350
- Fix gradient checkpointing layer caller kwargs by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/353
- vLLM weight scale FP8 and standby override by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/354
- Fix docstring removing regex to support empty parentheses by @noisycat3 in https://github.com/unslothai/unsloth-zoo/pull/360
Unsloth Notebooks Changes
- Feat/qwen3 vl by @Erland366 in https://github.com/unslothai/notebooks/pull/119
- Feat/double footer fix by @Erland366 in https://github.com/unslothai/notebooks/pull/121
- Add GGUF section for Qwen3-VL by @Etherll in https://github.com/unslothai/notebooks/pull/123
- Fix TypeError in unsloth_push_to_hub_gguf() when pushing GGUF model to Hugging Face by @samanta-sc in https://github.com/unslothai/notebooks/pull/125
- fix TorchAOConfig' object has no attribute 'base_config' error by @rolandtannous in https://github.com/unslothai/notebooks/pull/129
- Updated Dockerfile for DGX Spark by @sameersegal in https://github.com/unslothai/notebooks/pull/133
- gemma3-270m: reduce batch size for sample packing by @djsaunde in https://github.com/unslothai/notebooks/pull/135
- fix dataset formatting and mapping for Magistral reasoning by @rolandtannous in https://github.com/unslothai/notebooks/pull/136
- fix magistral inference by @rolandtannous in https://github.com/unslothai/notebooks/pull/138
Full Changelog: https://github.com/unslothai/unsloth/compare/October-2025...November-2025
New Contributors
- @mk0walsk made their first contribution in https://github.com/unslothai/unsloth/pull/3557
- @pre-commit-ci[bot] made their first contribution in https://github.com/unslothai/unsloth/pull/3576
- @Giuseppe5 made their first contribution in https://github.com/unslothai/unsloth/pull/3534
- @jarrycyx made their first contribution in https://github.com/unslothai/unsloth/pull/3578
- @MercuryYen made their first contribution in https://github.com/unslothai/unsloth/pull/3623