asr-wav2vec2-large-xlsr-cantonese
Audio transcription model for converting Cantonese audio to Chinese text
8f310a80544c48c2838d49b2815d8fcc
Notes
huggingface model id: scottykwok/wav2vec2-large-xlsr-cantonese
Wav2vec2-large-xlsr-cantonese
This model is based on wav2vec2-large-xlsr-53 and was fine-tuned on Common Voice zh-HK 6.1.0. The training code is similar to that of user ctl, except that the number of training epochs was doubled to 80 and fp16_backend was set to apex. The model was trained on a single RTX 3090 using the Docker image nvidia/cuda:11.1-cudnn8-devel.
The character error rate (CER) is 15.11% when evaluated against the Common Voice zh-HK test set.
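A minimal inference sketch follows, assuming the checkpoint named under Notes ships a processor configuration and that `torch`, `torchaudio`, and `transformers` are installed. The function name `transcribe`, the constant names, and the 16 kHz resampling step are illustrative choices and are not taken from the author's repository. Heavy dependencies are imported inside the function so the module can be imported without them.

```python
MODEL_ID = "scottykwok/wav2vec2-large-xlsr-cantonese"
TARGET_SAMPLE_RATE = 16_000  # wav2vec2 XLSR checkpoints are trained on 16 kHz audio


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with greedy CTC decoding (illustrative helper)."""
    # Lazy imports: these packages are only needed when the function is called.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Assumption: the hub repository provides both processor and model weights.
    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # torchaudio.load returns a (channels, samples) tensor and the sample rate.
    waveform, sample_rate = torchaudio.load(audio_path)
    waveform = waveform.mean(dim=0)  # downmix to mono
    if sample_rate != TARGET_SAMPLE_RATE:
        waveform = torchaudio.functional.resample(
            waveform, sample_rate, TARGET_SAMPLE_RATE
        )

    inputs = processor(
        waveform.numpy(),
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: pick the most likely token at each frame,
    # then let the processor collapse repeats and blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("sample.wav")` (a hypothetical file path) would download the checkpoint on first use and return the predicted Chinese text for that clip.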
Result (CER)
15.11%
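For readers unfamiliar with the metric, CER is commonly computed as the character-level edit distance between hypothesis and reference, divided by the number of reference characters. The sketch below shows a corpus-level variant in plain Python. The source does not state which text normalization or aggregation was used for the 15.11% figure, so the function names and the corpus-level aggregation here are assumptions for illustration only.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

With this definition, a three-character reference with one wrong character in the hypothesis gives a CER of one third. Because Chinese text is scored per character, no word segmentation is needed.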
Source Code
See the GitHub repo cantonese-selfish-project and the demo video.
- Model Type ID: Audio To Text
- Description: Audio transcription model for converting Cantonese audio to Chinese text
- Last Updated: Jun 28, 2022
- Privacy: PUBLIC