0.3.0

What's new:

2023.3.17, funasr-0.3.0, modelscope-1.4.1

New Features:
- Added support for a GPU runtime solution, nv-triton, which makes it easy to export Paraformer models from ModelScope and deploy them as services. In benchmark tests on a single V100 GPU, it achieved an RTF of 0.0032, a speedup of roughly 300x.
- Added support for a CPU runtime quantization solution, which supports exporting quantized ONNX and Libtorch models from ModelScope. In benchmark tests on a CPU-8369B, quantization reduced RTF by about 50% (0.00438 -> 0.00226) and roughly doubled the speedup (228 -> 442).
- Added support for a C++ version of the gRPC service deployment solution. Using the C++ version of ONNXRuntime together with the quantization solution, it is about twice as efficient as the Python runtime (demo).
- Added a streaming inference pipeline for the 16k and 8k VAD models, with support for audio input streams (>= 10ms) (demo).
- Improved the punctuation prediction model, raising its F-score from 55.6 to 56.5.
- Added a real-time subtitle example based on the gRPC service, using a 2-pass recognition model: the Paraformer streaming model outputs text in real time, while the Paraformer-large offline model corrects the recognition results (demo).
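
The RTF and speedup figures quoted above are consistent with the usual definitions: RTF (real-time factor) is processing time divided by audio duration, and speedup is its reciprocal. A minimal sketch, assuming those definitions; the helper names and the dictionary labels are illustrative and not part of FunASR:

```python
def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: seconds of compute spent per second of audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup(rtf_value: float) -> float:
    """Speedup over real time, taken here as the reciprocal of RTF."""
    if rtf_value <= 0:
        raise ValueError("rtf_value must be positive")
    return 1.0 / rtf_value


# RTF values reported in these release notes (labels are informal).
reported = {
    "GPU nv-triton, V100": 0.0032,
    "CPU-8369B, before quantization": 0.00438,
    "CPU-8369B, after quantization": 0.00226,
}

for name, value in reported.items():
    print(f"{name}: RTF={value}, speedup ~{speedup(value):.0f}x")
```

Under this assumption the reciprocals come out near 312, 228, and 442, which lines up with the reported speedups of about 300, 228, and 442.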
New Models:
- Added a 16k Paraformer streaming model, which supports real-time speech recognition on streaming audio input (demo). It can be deployed with the gRPC service to provide real-time subtitles.
- Added a streaming punctuation model, which supports real-time punctuation in streaming speech recognition scenarios and is called in real time at VAD points. Combined with real-time ASR models, it produces readable real-time subtitles (demo).
- Added the TP-Aligner timestamp model, which takes audio and the corresponding text as input and outputs word-level timestamps. Its performance is comparable to that of the Kaldi FA model (60.3ms vs. 69.3ms). It can be freely combined with ASR models (demo).
- Added a financial domain model, fine-tuned on 1000 hours of data. Recognition accuracy on the financial domain test set increased by 5%, and the recall rate of domain keywords increased by 7%.
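
The 2-pass subtitle flow described in these notes (a streaming model for live text, an offline model that corrects each segment, punctuation applied at VAD points) can be sketched roughly as follows. Everything here is hypothetical: the callables stand in for the streaming Paraformer, Paraformer-large, and streaming punctuation models, and none of the names are FunASR APIs.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical types for illustration only; these are not FunASR APIs.
Chunk = Tuple[str, bool]  # (audio chunk placeholder, is_vad_endpoint)
Recognizer = Callable[[List[str]], str]


def two_pass_subtitles(
    chunks: Iterable[Chunk],
    streaming_asr: Recognizer,
    offline_asr: Recognizer,
    punctuate: Callable[[str], str],
) -> Iterator[Tuple[str, str]]:
    """Yield ("partial", text) for live output and ("final", text) per segment."""
    segment: List[str] = []
    for audio, is_endpoint in chunks:
        segment.append(audio)
        # First pass: low-latency text from the streaming model.
        yield "partial", streaming_asr(segment)
        if is_endpoint:
            # Second pass: re-decode the whole segment offline, then punctuate.
            yield "final", punctuate(offline_asr(segment))
            segment = []
    if segment:
        # Flush trailing audio that never reached a VAD endpoint.
        yield "final", punctuate(offline_asr(segment))


# Toy stand-ins so the sketch runs without any model.
def fake_streaming(seg: List[str]) -> str:
    return " ".join(seg).lower()


def fake_offline(seg: List[str]) -> str:
    return " ".join(seg).capitalize()


def fake_punct(text: str) -> str:
    return text + "."


events = list(
    two_pass_subtitles(
        [("hello", False), ("world", True), ("bye", False)],
        fake_streaming,
        fake_offline,
        fake_punct,
    )
)
```

The design point is that partial results are shown immediately and then replaced by the corrected, punctuated text once a segment ends, so latency and final accuracy are handled by different models.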

New Contributors

- @dingbig made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/147
- @yuekaizhang made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/161
- @zhuzizyf made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/180
- @znsoftm made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/185
- @songtaoshi made their first contribution in https://github.com/alibaba-damo-academy/FunASR/pull/227

Full Changelog: https://github.com/alibaba-damo-academy/FunASR/compare/v0.2.0...v0.3.0
