# Whisper Speech Recognition Models
## High-Performance Model (e.g., `ggml-large-v3-turbo.bin`)
- Models like `large-v3` or the optimized `large-v3-turbo` offer high accuracy for speech recognition.
- However, larger models require significant computational resources (CPU, RAM, VRAM).
- This resource demand can lead to higher latency (slower processing times).
- The `large-v3-turbo` variant is a distilled version of `large-v3`, designed to be faster with a minor trade-off in accuracy.
## Improving Latency with Alternative Models
- If lower latency (faster processing) is a priority, especially on less powerful hardware, consider using smaller or quantized Whisper models.
- These models trade some accuracy for reduced size and faster inference speed.
- Common sizes include `tiny`, `base`, `small`, and `medium`. Quantized versions (e.g., `q5_0`, `q8_0`) further reduce resource usage.
## Where to Find Models
You can download various pre-converted Whisper models in the ggml format from Hugging Face repositories:
- https://huggingface.co/sandrohanea/whisper.net/tree/main
- https://huggingface.co/ggerganov/whisper.cpp/tree/main
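As a sketch, a model from one of the repositories above can be fetched and saved directly under the filename the engine expects. The URL pattern follows the Hugging Face `resolve/main` convention used by the `ggerganov/whisper.cpp` repository; the `MODEL` choice here is just an example.

```shell
# Sketch: fetch a smaller ggml Whisper model and save it under the
# filename the engine expects. MODEL is an example; adjust as needed.
MODEL="base"   # e.g. tiny, base, small, medium, or a quantized variant
URL="https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-${MODEL}.bin"
TARGET="ggml-large-v3-turbo.bin"
echo "Download: $URL"
echo "Save as:  $TARGET"
# Uncomment to perform the actual download:
# curl -L -o "$TARGET" "$URL"
```

The `-L` flag matters if you do run the `curl` line, because Hugging Face serves model files via redirects.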
> [!WARNING]
> If you download an alternative model from the Hugging Face links provided (i.e., any model other than the default one provided below), such as `ggml-base.bin`, `ggml-small.bin`, or a quantized version:
>
> - You must rename the downloaded file exactly to `ggml-large-v3-turbo.bin`.
> - This renaming step is crucial because the engine is configured to load only a file with this exact name. Failing to rename the alternative model file will likely result in the application being unable to find and load it.
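The renaming step above can be illustrated like this; `ggml-base.bin` stands in for whatever alternative model was downloaded, and the `touch` line merely simulates that download for the sake of a runnable sketch.

```shell
touch ggml-base.bin                       # placeholder for a downloaded model
mv ggml-base.bin ggml-large-v3-turbo.bin  # the exact filename the engine loads
ls ggml-large-v3-turbo.bin
```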