v3.0.0
Changelog - Version 3.0.0
This release focuses primarily on code refactoring, which makes the codebase more streamlined and readable. It also improves the prompts and fixes several stability issues.
🚀 New Features
- Significantly improved transcription quality: ASR now runs on the original audio, and the Demucs-denoised audio is used for forced alignment, greatly reducing missed sentences.
- Added support for the WhisperX 302 Cloud API (recommended for users without a local GPU or who prefer to avoid a complex installation) and preliminary support for the 11labs Scribe model (still in development; stability may be lower than WhisperX, so use with caution).
- Enhanced segmentation stability through longer chain-of-thought reasoning.
- Improved the translation prompt to avoid overly concise translations.
- Added a JSON format support button to the LLM settings in the sidebar.
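
The new transcription flow from the first item above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `transcribe_then_align`, `run_asr`, and `run_alignment` are hypothetical names, and the real ASR and alignment backends are passed in as plain callables so the ordering of the two steps stays visible.

```python
from typing import Callable, Dict, List

Segment = Dict[str, object]


def transcribe_then_align(
    original_path: str,
    denoised_path: str,
    run_asr: Callable[[str], List[Segment]],
    run_alignment: Callable[[List[Segment], str], List[Segment]],
) -> List[Segment]:
    """Recognize text on the original audio, then align it on the denoised audio.

    Both callables are placeholders (assumptions) for whatever ASR and
    forced-alignment backends are in use.
    """
    # Step 1: ASR sees the untouched recording, so no speech is lost
    # to denoising artifacts before recognition.
    segments = run_asr(original_path)
    # Step 2: word timings are computed against the denoised vocal track.
    return run_alignment(segments, denoised_path)
```

Keeping the two audio paths as separate arguments makes it hard to accidentally feed the denoised track to the recognizer.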
🐛 Bug Fixes
- Raised the word deletion threshold from 20 to 30, fixing cases where valid words were incorrectly deleted.
- Fixed errors when processing longer audio/text segments, resolving WhisperX cloud audio segmentation issues.
- Implemented stricter validation of LLM response formats, fixing translation line alignment errors.
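
As an illustration of what stricter response validation for line alignment might look like, here is a minimal sketch. It assumes the LLM is asked to return a JSON object keyed by 1-based line number (`"1"`, `"2"`, ...); that response shape and the name `validate_translation` are assumptions made for this example, not the project's actual format.

```python
import json


def validate_translation(response_text: str, source_lines: list[str]) -> list[str]:
    """Return translated lines in source order, or raise ValueError.

    Assumes (hypothetically) a JSON object mapping "1".."N" to translations.
    """
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by line number")

    expected_keys = [str(i) for i in range(1, len(source_lines) + 1)]
    if set(data) != set(expected_keys):
        # A missing or extra key means translations would shift against the source.
        raise ValueError(f"expected keys {expected_keys}, got {sorted(data)}")

    translated = []
    for key in expected_keys:
        value = data[key]
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"line {key} is empty or not a string")
        translated.append(value.strip())
    return translated
```

A caller would typically catch the `ValueError` and retry the request instead of accepting a response whose lines no longer match the source.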