This release brings first-class extended thinking across providers, full Gemini 3 Pro/Flash thinking-signature support (chat + tools), a Rails upgrade path to persist it, and a tighter streaming pipeline. Plus official Ruby 4.0 support, safer model registry refreshes, a Vertex AI global endpoint fix, and a docs refresh.
🧠 Extended Thinking Everywhere
Tune reasoning depth and budget across providers with `with_thinking`, and get thinking output back when available:
```ruby
chat = RubyLLM.chat(model: "claude-opus-4.5")
              .with_thinking(effort: :high, budget: 8000)

response = chat.ask("Prove it with numbers.")
response.thinking&.text       # reasoning text, when the provider returns it
response.thinking&.signature  # opaque signature for multi-turn consistency
response.thinking_tokens      # thinking token usage
```
- `response.thinking` and `chunk.thinking` expose thinking content during normal and streaming requests.
- `response.thinking_tokens` and `response.tokens.thinking` track thinking token usage when providers report it.
Gemini 3 Pro/Flash fully support thought signatures across chat and tool calls, so multi-step sessions stay consistent.
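To see why signatures matter, here is a rough plain-Ruby sketch of round-tripping a thought signature through a tool-call turn. The hash shapes and the `build_followup_messages` helper are hypothetical, purely for illustration, not RubyLLM's internals:

```ruby
# Illustrative sketch only: a client carries the provider's opaque thought
# signature back into the next request so a multi-step tool session stays
# valid. The message hashes here are assumptions, not RubyLLM's format.
def build_followup_messages(history, tool_result)
  history.map do |msg|
    next msg unless msg[:thinking]

    # Re-send the signature exactly as the provider returned it.
    msg.merge(thinking: msg[:thinking].slice(:text, :signature))
  end + [{ role: "tool", content: tool_result }]
end

history = [
  { role: "user", content: "What's 127 * 43?" },
  { role: "assistant",
    thinking: { text: "Break into 127*40 + 127*3", signature: "sig-abc" },
    tool_calls: [{ name: "calculator", args: { expr: "127*43" } }] }
]

messages = build_followup_messages(history, "5461")
```

The key point is that the signature is opaque: the client never inspects or rewrites it, only echoes it back verbatim.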
Extended thinking quirks are now normalized across providers so you can tune one API and get predictable output.
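As a sketch of what that normalization can look like, here is a hypothetical mapper from one effort/budget setting to per-provider request parameters. The parameter names echo the providers' public APIs, but the `thinking_params` helper and the default budgets are assumptions, not RubyLLM's code:

```ruby
# Illustrative sketch of normalizing one thinking API across providers.
# Default budgets per effort level are assumptions for illustration.
EFFORT_BUDGETS = { low: 1024, medium: 4096, high: 16_384 }.freeze

def thinking_params(provider, effort:, budget: nil)
  tokens = budget || EFFORT_BUDGETS.fetch(effort)
  case provider
  when :anthropic
    { thinking: { type: "enabled", budget_tokens: tokens } }
  when :gemini
    { thinkingConfig: { thinkingBudget: tokens } }
  else
    { reasoning_effort: effort.to_s }
  end
end

thinking_params(:anthropic, effort: :high, budget: 8000)
# => { thinking: { type: "enabled", budget_tokens: 8000 } }
```

The caller tunes one `effort:`/`budget:` pair; the provider-specific shape is an implementation detail.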
Stream thinking and answer content side-by-side:

```ruby
chat = RubyLLM.chat(model: "claude-opus-4.5")
              .with_thinking(effort: :medium)

chat.ask("Solve this step by step: What is 127 * 43?") do |chunk|
  print chunk.thinking&.text
  print chunk.content
end
```
Streaming stays backward-compatible: existing apps can keep printing chunk.content, while richer UIs can also render chunk.thinking.
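A minimal sketch of that nil-safe pattern, using plain Structs to stand in for streaming chunks (the `Chunk` and `Thinking` structs here are illustrative assumptions, not the real chunk classes):

```ruby
# Simulated stream: some chunks carry thinking, others carry the answer.
Thinking = Struct.new(:text, keyword_init: true)
Chunk    = Struct.new(:thinking, :content, keyword_init: true)

chunks = [
  Chunk.new(thinking: Thinking.new(text: "First, 127 * 40 = 5080. "), content: nil),
  Chunk.new(thinking: Thinking.new(text: "Then 127 * 3 = 381."), content: nil),
  Chunk.new(thinking: nil, content: "127 * 43 = 5461")
]

reasoning = +""
answer    = +""
chunks.each do |chunk|
  reasoning << chunk.thinking&.text.to_s  # "" when a chunk has no thinking
  answer << chunk.content.to_s            # "" when a chunk is thinking-only
end
```

Existing code that only reads `chunk.content` sees empty strings for thinking-only chunks, which is why it keeps working unchanged.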
🧰 Rails + ActiveRecord Persistence
Thinking output can now be stored alongside messages (text, signature, and token usage), with an upgrade generator for existing apps: