v0.15.5
## New models
- GLM-OCR: a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
- Qwen3-Coder-Next: a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
## What's Changed
- Sub-agent support for `ollama launch` for planning, deep research, and similar tasks
- `ollama signin` will now open a browser window to make signing in easier
- Ollama will now default to the following context lengths based on VRAM:
  - < 24 GiB VRAM: 4,096-token context
  - 24-48 GiB VRAM: 32,768-token context
  - >= 48 GiB VRAM: 262,144-token context
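The tiering above can be sketched as a simple threshold function. This is an illustrative sketch of the selection rule only, not Ollama's actual VRAM-detection code; the function name and the treatment of the 48 GiB boundary (assigned to the top tier, per the `>=`) are assumptions:

```python
def default_context_length(vram_gib: float) -> int:
    """Return the default context length for a given amount of VRAM.

    Thresholds are taken from the release notes; the dispatch logic
    here is a hypothetical sketch, not Ollama's implementation.
    """
    if vram_gib < 24:
        return 4_096
    if vram_gib < 48:
        return 32_768
    return 262_144  # >= 48 GiB
```

The default can still be overridden per server with the `OLLAMA_CONTEXT_LENGTH` environment variable, or per model via the `num_ctx` parameter.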
- GLM-4.7-Flash support on Ollama's experimental MLX engine
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.4...v0.15.5-rc0