# RubyLLM 1.9.0: Tool Schemas, Prompt Caching & Transcriptions ✨🎙️
Major release that makes tool definitions feel like Ruby, lets you lean on Anthropic prompt caching everywhere, and turns audio transcription into a one-liner—plus better Gemini structured output and Nano Banana image responses.
## 🧰 JSON Schema Tooling That Feels Native
The new `RubyLLM::Schema`-powered `params` DSL supports full JSON Schema for tool parameter definitions, including nested objects, arrays, enums, and nullable fields:
```ruby
class Scheduler < RubyLLM::Tool
  description "Books a meeting"

  params do
    object :window, description: "Time window to reserve" do
      string :start, description: "ISO8601 start"
      string :finish, description: "ISO8601 finish"
    end
    array :participants, of: :string, description: "Email invitees"
    any_of :format, description: "Optional meeting format" do
      string enum: %w[virtual in_person]
      null
    end
  end

  def execute(window:, participants:, format: nil)
    Booking.reserve(window:, participants:, format:)
  end
end
```
- Powered by `RubyLLM::Schema`, the same Ruby DSL we recommend for Structured Output's `chat.with_schema`.
- Already handles Anthropic/Gemini quirks like nullable unions and enums, so there's no more need for ad-hoc translation layers.
- Prefer raw hashes? Pass `params schema: { ... }` to keep your existing JSON Schema verbatim.
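To illustrate the raw-hash escape hatch, here is a plain JSON Schema hash roughly equivalent to the `Scheduler` DSL example above. Treat this as a sketch: the exact shape `RubyLLM::Schema` generates may differ, so compare against its output before relying on it.

```ruby
# A raw JSON Schema hash approximating the DSL example above.
# Inside a tool class you would pass it with `params schema: SCHEDULER_SCHEMA`.
SCHEDULER_SCHEMA = {
  type: "object",
  properties: {
    window: {
      type: "object",
      description: "Time window to reserve",
      properties: {
        start:  { type: "string", description: "ISO8601 start" },
        finish: { type: "string", description: "ISO8601 finish" }
      },
      required: %w[start finish]
    },
    participants: {
      type: "array",
      description: "Email invitees",
      items: { type: "string" }
    },
    format: {
      description: "Optional meeting format",
      anyOf: [
        { type: "string", enum: %w[virtual in_person] },
        { type: "null" }
      ]
    }
  },
  required: %w[window participants]
}.freeze
```

The `anyOf` with a `"null"` branch is the standard JSON Schema spelling of a nullable field, which is exactly what the `any_of`/`null` DSL pair expresses.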
## 🧱 Raw Content Blocks & Anthropic Prompt Caching Everywhere
When you need to handcraft message envelopes:
```ruby
chat = RubyLLM.chat(model: "claude-sonnet-4-5")
raw_request = RubyLLM::Content::Raw.new([
  { type: "text", text: File.read("prompt.md"), cache_control: { type: "ephemeral" } },
  { type: "text", text: "Summarize today's work." }
])
chat.ask(raw_request)
```
We also provide a helper specifically for Anthropic prompt caching:
```ruby
system_block = RubyLLM::Providers::Anthropic::Content.new(
  "You are our release-notes assistant.",
  cache: true
)
chat.add_message(role: :system, content: system_block)
```
- `RubyLLM::Content::Raw` lets you ship provider-native payloads for content blocks.
- Anthropic helpers keep `cache_control` hints readable while still producing the right JSON structure.
- Every `RubyLLM::Message` now exposes `cached_tokens` and `cache_creation_tokens`, so you can see exactly what the provider pulled from cache versus what it had to recreate.
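To make the token accounting concrete, here is a small hypothetical helper that computes a cache hit rate from those two fields. The `Message` struct below is a stand-in for `RubyLLM::Message`, used only so the sketch is self-contained.

```ruby
# Stand-in for RubyLLM::Message, exposing the new token fields by name.
Message = Struct.new(:input_tokens, :cached_tokens, :cache_creation_tokens,
                     keyword_init: true)

# Fraction of cache-eligible tokens the provider actually served from cache.
def cache_hit_rate(message)
  cached  = message.cached_tokens.to_i
  created = message.cache_creation_tokens.to_i
  total   = cached + created
  return 0.0 if total.zero?

  (cached.to_f / total).round(2)
end

reply = Message.new(input_tokens: 5_000, cached_tokens: 4_000,
                    cache_creation_tokens: 1_000)
cache_hit_rate(reply) # => 0.8
```

A rate near 1.0 on repeated requests is a good sign your `cache_control` placement is working; a rate stuck at 0.0 usually means the cached prefix changes between calls.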
If you're coming from 1.8.x, please run `rails generate ruby_llm:upgrade_to_v1_9` in your Rails app.
## ⚙️ `Tool.with_params` Plays Nice with Anthropic Caching
Similar to raw content blocks, `.with_params` lets you set arbitrary parameters on tool definitions. Perfect for Anthropic's `cache_control` hints.
```ruby
class ChangelogTool < RubyLLM::Tool
  description "Formats commits into release notes"

  params do
    array :commits, of: :string
  end

  with_params cache_control: { type: "ephemeral" }

  def execute(commits:)
    ReleaseNotes.format(commits)
  end
end
```
## 🎙️ `RubyLLM.transcribe` Turns Audio into Text (With Diarization)
One method call gives you transcripts, diarized segments, and consistent token tallies across providers.
```ruby
transcription = RubyLLM.transcribe(
  "all-hands.m4a",
  model: "gpt-4o-transcribe-diarize",
  language: "en",
  prompt: "Focus on action items."
)

transcription.segments.each do |segment|
  puts "#{segment['speaker']}: #{segment['text']} (#{segment['start']}s – #{segment['end']}s)"
end
```
- Supports OpenAI (`whisper-1`, `gpt-4o-transcribe`, diarization variants), Gemini 2.5 Flash, and Vertex AI with the same API.
- Optional speaker references map diarized voices to real names.
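As a sketch of what you can do with the segment hashes, here is one way to collapse diarized segments into per-speaker text. The hash keys follow the loop example above; verify them against your provider's actual output, and note the sample segment data below is invented for illustration.

```ruby
# Invented sample segments, shaped like the diarized output shown above.
segments = [
  { "speaker" => "A", "text" => "Ship 1.9 on Friday.",    "start" => 0.0, "end" => 2.4 },
  { "speaker" => "B", "text" => "I'll update the docs.",  "start" => 2.4, "end" => 4.1 },
  { "speaker" => "A", "text" => "Tag the release after.", "start" => 4.1, "end" => 6.0 }
]

# Group utterances by speaker and join each speaker's lines into one string.
by_speaker = segments
             .group_by { |s| s["speaker"] }
             .transform_values { |ss| ss.map { |s| s["text"] }.join(" ") }
# => {"A"=>"Ship 1.9 on Friday. Tag the release after.", "B"=>"I'll update the docs."}
```

This kind of post-processing is handy for meeting summaries, where you want one block of text per participant rather than an interleaved timeline.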
## 🛠️ Gemini Structured Output Fixes & Nano Banana Inline Images
We went deep on Gemini’s edges so you don’t have to.
- Nullables and `anyOf` now translate cleanly, and Gemini 2.5 finally respects `responseJsonSchema`, so complex structured output works out of the box.
- Parallel tool calls now return a single message with the right role, which should improve accuracy when using and responding to tool calls.
- Gemini 2.5 Flash Image ("Nano Banana") surfaces inline images as actual attachments, ready to pair with your UI immediately.
```ruby
chat = RubyLLM.chat(model: "gemini-2.5-flash-image")
reply = chat.ask("Sketch a Nano Banana wearing aviators.")
image = reply.content.attachments.first
File.binwrite("nano-banana.png", image.read)
```
(If you missed the backstory, my blog post *Nano Banana with RubyLLM* has the full walkthrough.)
## 🗂️ Configurable Model Registry File Path
Deploying to read-only filesystems? Point RubyLLM at a writable JSON registry and keep refreshing models without hacks.
```ruby
RubyLLM.models.save_to_json("/var/app/models.json")

RubyLLM.configure do |config|
  config.model_registry_file = "/var/app/models.json"
end
```
Just remember that `RubyLLM.models.refresh!` only updates the in-memory registry. To persist changes to disk, call:
```ruby
RubyLLM.models.refresh!
RubyLLM.models.save_to_json
```
- Plays nicely with the ActiveRecord integration (which still stores models in the DB).
## Installation
```ruby
gem "ruby_llm", "1.9.0"
```
## Upgrading from 1.8.x
```shell
bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_9
```
## Merged PRs
- Feat: Support Gemini's Different API versions by @thefishua in https://github.com/crmne/ruby_llm/pull/444
## New Contributors
- @thefishua made their first contribution in https://github.com/crmne/ruby_llm/pull/444
**Full Changelog**: https://github.com/crmne/ruby_llm/compare/1.8.2...1.9.0