November Release + FP8 Training!
We’re getting close to our final release of 2025! Thanks so much for sticking with us this year. We’ve got lots of new features so please update Unsloth & our Docker to use the latest updates! 🦥
- Introducing FP8 Reinforcement Learning in Unsloth! Train on any FP8-supported GPU and get 1.4x faster training with 60% less VRAM: Read our Blog/Guide • Notebooks: Qwen3-8B FP8 GRPO and Llama-3.2-1B FP8 GRPO
- You may notice Unsloth now uses much less VRAM than before, enabling even longer context. Faster training is coming soon as well, and we'll share all the details in an upcoming blog.
- DeepSeek-OCR fine-tuning is here! We fine-tuned DeepSeek-OCR, improving its language understanding by 89%. Read our Blog • Free notebook
- Qwen3-VL models are now supported, including GGUFs you can run locally: Blogpost + fixes • GGUFs
- We analyzed RL training-inference mismatch for FP16 vs. BF16 and concluded that Unsloth does not have this issue: Analysis and Results
- We've partnered with Docker to let you run LLMs locally with zero setup. Docker GGUFs are now powered by Unsloth Dynamic. Example:

  ```
  docker model run hf.co/unsloth/gpt-oss-20b-GGUF:F16
  ```

  Read guide
- Baidu ERNIE models are now supported. Notebooks coming soon.
- Unsloth now supports SGLang. Read our guide
- We wrote guides for LoRA Hot Swapping and vLLM Engine Arguments
- Run Kimi-K2-Thinking, the most powerful open model, locally: Kimi-K2 Guide
- Lots of bug fixes! See further below.
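As a rough intuition for where the FP8 VRAM headline comes from, here is a back-of-envelope sketch (our own arithmetic, not Unsloth internals): storing weights at 1 byte per parameter (FP8) instead of 2 bytes (BF16) halves weight memory, before accounting for activations, optimizer state, or the KV cache.

```python
# Back-of-envelope weight-memory comparison for FP8 vs. BF16 storage.
# Illustrative only -- real VRAM usage also includes activations,
# gradients, optimizer state, and the KV cache.
def weight_gib(n_params: int, bytes_per_param: int) -> float:
    """Memory in GiB to store n_params weights at the given precision."""
    return n_params * bytes_per_param / 1024**3

n = 8_000_000_000  # an 8B-parameter model such as Qwen3-8B
bf16 = weight_gib(n, 2)  # BF16: 2 bytes/param
fp8 = weight_gib(n, 1)   # FP8:  1 byte/param
print(f"BF16: {bf16:.1f} GiB, FP8: {fp8:.1f} GiB, saved: {1 - fp8 / bf16:.0%}")
# prints "BF16: 14.9 GiB, FP8: 7.5 GiB, saved: 50%"
```

The quoted 60% figure is larger than this 50% weight-only saving because Unsloth's changes also reduce other parts of the memory footprint.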
[!TIP] Update Unsloth via

```
pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo
```

If you want PyTorch 2.9:

```
pip install --upgrade unsloth unsloth_zoo
```
Bug Fixes and Enhancements
- Supports `trl>=0.25.0`, `vllm>=0.11.2` and `transformers>=4.57.1`
- Fixed gpt-oss GRPO, RL excessive re-compilations on `torch>=2.9.0`
- Fixes Sleep mode and reduces memory usage by a further 5 to 15% for RL, GRPO
- Fix propagation of `trust_remote_code = True`
- Fix Unsloth offloaded gradient checkpointing not offloading on 1st step - reduces VRAM by >20%
- Add `logits.detach()` to GRPO to solve double backwards on some pathways
- Add `int64` kernels & fixed RoPE embeddings to allow super ultra long context training
- Fixed 📓 OpenEnv gpt-oss RL notebook
- DGX Spark docker image fixed
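To illustrate why int64 kernels matter for very long context (a standalone sketch with made-up shapes, not Unsloth's actual kernel code): a flat offset into a row-major `(seq_len, hidden_dim)` activation tensor grows as `row * hidden_dim + col`, and at long-context training sizes this product exceeds what a signed 32-bit index can hold.

```python
# Why 32-bit indexing breaks at long context: the flat element offset
# into a large 2-D tensor can exceed int32's maximum value.
INT32_MAX = 2**31 - 1  # 2_147_483_647

def flat_offset(row: int, col: int, hidden_dim: int) -> int:
    """Flat 1-D offset of element (row, col) in a row-major 2-D tensor."""
    return row * hidden_dim + col

seq_len, hidden_dim = 300_000, 8192  # hypothetical long-context shapes
last = flat_offset(seq_len - 1, hidden_dim - 1, hidden_dim)
print(last > INT32_MAX)  # prints "True": the offset overflows int32
```

With int32 arithmetic this offset would wrap around to a negative value, which is why the kernels switch to int64 indexing for these shapes.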
What's Changed
- Grpo gradient accumulation edits by @pluesclues in https://github.com/unslothai/unsloth/pull/3390
- Nightly by @danielhanchen in https://github.com/unslothai/unsloth/pull/3532
- Handle TRL version compatibility in rl_replacements.py by @pluesclues in https://github.com/unslothai/unsloth/pull/3540
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3546
- Sleep trl patch by @Datta0 in https://github.com/unslothai/unsloth/pull/3517
- Detach logits before returning from function by @pluesclues in https://github.com/unslothai/unsloth/pull/3554
- Fix typos in comment by @mk0walsk in https://github.com/unslothai/unsloth/pull/3557
- Formatting & bug fixes by @danielhanchen in https://github.com/unslothai/unsloth/pull/3563
- DeepseekOCR: add trust_remote_code kwarg by @mmathew23 in https://github.com/unslothai/unsloth/pull/3564
- pre-commit CI config by @djsaunde in https://github.com/unslothai/unsloth/pull/3565
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3576
- Resize rope embeddings for long sequence training by @mmathew23 in https://github.com/unslothai/unsloth/pull/3586
- Patch in tiled mlp by @mmathew23 in https://github.com/unslothai/unsloth/pull/3584
- Support for out-of-source quantizers by @Giuseppe5 in https://github.com/unslothai/unsloth/pull/3534
- Fix: prevent rope_embedding AssertionError by checking kv_seq_len before reuse by @jarrycyx in https://github.com/unslothai/unsloth/pull/3578
- Extend TorchAOConfig to support mobile usecases by @metascroy in https://github.com/unslothai/unsloth/pull/3587
- fix qwen3 vl gradient accumulation by @mmathew23 in https://github.com/unslothai/unsloth/pull/3598
- Do not force set beta to 0 for DAPO by @Datta0 in https://github.com/unslothai/unsloth/pull/3604
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/unslothai/unsloth/pull/3606
- Fix broken links and typo in README by @mk0walsk in https://github.com/unslothai/unsloth/pull/3611
- remove pre-commit workflow (covered by pre-commit app) by @djsaunde in https://github.com/unslothai/unsloth/pull/3618
- Add an int64 path for mlp kernels by @mmathew23 in https://github.com/unslothai/unsloth/pull/3614
- Remove grpo requirement bs=num_generations by @mmathew23 in https://github.com/unslothai/unsloth/pull/3609
- Enable FP8 + RL training for bf16 models by @andrewor14 in https://github.com/unslothai/unsloth/pull/3440
- Fix/save torchao model loading logic by @rolandtannous in https://github.com/unslothai/unsloth/pull/3621
- Fix LlamaModel_fast_forward signature to match HF Transformers (Support inputs_embeds) by @MercuryYen in https://github.com/unslothai/unsloth/pull/3623
Unsloth Zoo Changes
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/327
- Fix GRPO by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/328
- fix gpt oss memory calculation for intel device by @leizhenyuan in https://github.com/unslothai/unsloth-zoo/pull/330
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/331
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/332
- fixed unbound local error tokenizer-model from cache by @rolandtannous in https://github.com/unslothai/unsloth-zoo/pull/333
- Now it works on a uv venv by @kittawere in https://github.com/unslothai/unsloth-zoo/pull/336
- Gemma3n fix by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/338
- [Intel] remove triton windows for intel by @leizhenyuan in https://github.com/unslothai/unsloth-zoo/pull/243
- FP8 training enhancements by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/337
- GRPO gradient accumulation steps update and DAPO support by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/308
- Fix/video collate by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/342
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/344
- FP8, Standby and vLLM updates by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/340
- Put importance sampling into no grad by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/343
- Detach hidden states to avoid gradient carry by @pluesclues in https://github.com/unslothai/unsloth-zoo/pull/345
- Bug fixes by @danielhanchen in https://github.com/unslothai/unsloth-zoo/pull/347
- MoE: Cast routing_weights dtype correctly by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/349
- return local model in determine_base_model_source with any quantization by @noah1510 in https://github.com/unslothai/unsloth-zoo/pull/334
- Enable FP8 + RL training by @andrewor14 in https://github.com/unslothai/unsloth-zoo/pull/351
- Tiled MLP Implementation by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/350
- Fix gradient checkpointing layer caller kwargs by @mmathew23 in https://github.com/unslothai/unsloth-zoo/pull/353
- vLLM weight scale FP8 and standby override by @Datta0 in https://github.com/unslothai/unsloth-zoo/pull/354
- Fix docstring removing regex to support empty parentheses by @noisycat3 in https://github.com/unslothai/unsloth-zoo/pull/360
Unsloth Notebooks Changes
- Feat/qwen3 vl by @Erland366 in https://github.com/unslothai/notebooks/pull/119
- Feat/double footer fix by @Erland366 in https://github.com/unslothai/notebooks/pull/121
- Add GGUF section for Qwen3-VL by @Etherll in https://github.com/unslothai/notebooks/pull/123
- Fix TypeError in unsloth_push_to_hub_gguf() when pushing GGUF model to Hugging Face by @samanta-sc in https://github.com/unslothai/notebooks/pull/125
- fix TorchAOConfig' object has no attribute 'base_config' error by @rolandtannous in https://github.com/unslothai/notebooks/pull/129
- Updated Dockerfile for DGX Spark by @sameersegal in https://github.com/unslothai/notebooks/pull/133
- gemma3-270m: reduce batch size for sample packing by @djsaunde in https://github.com/unslothai/notebooks/pull/135
- fix dataset formatting and mapping for Magistral reasoning by @rolandtannous in https://github.com/unslothai/notebooks/pull/136
- fix magistral inference by @rolandtannous in https://github.com/unslothai/notebooks/pull/138
Full Changelog: https://github.com/unslothai/unsloth/compare/October-2025...November-2025
New Contributors
- @mk0walsk made their first contribution in https://github.com/unslothai/unsloth/pull/3557
- @pre-commit-ci[bot] made their first contribution in https://github.com/unslothai/unsloth/pull/3576
- @Giuseppe5 made their first contribution in https://github.com/unslothai/unsloth/pull/3534
- @jarrycyx made their first contribution in https://github.com/unslothai/unsloth/pull/3578
- @MercuryYen made their first contribution in https://github.com/unslothai/unsloth/pull/3623