asr-wav2vec2-large-xlsr-53-swedish
Audio transcription model for converting Swedish audio to Swedish text
4a0b2f4918bb4be594c6f00746a6521a
Notes
Hugging Face model ID: KBLab/wav2vec2-large-xlsr-53-swedish
Wav2Vec2-Large-XLSR-53-Swedish
This model is facebook/wav2vec2-large-xlsr-53 fine-tuned on Swedish using the NST Swedish Dictation dataset. When using this model, make sure that your speech input is sampled at 16 kHz.
Evaluation
The model can be evaluated on the Swedish test data of Common Voice.
WER: 14.298610%
CER: 4.925294%
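For readers unfamiliar with these metrics, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. This is an illustration, not the evaluation script behind the figures above; the function names, the whitespace tokenization, and the choice to count spaces as characters in CER are assumptions.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, with the reference "jag heter anna" and the hypothesis "jag heter anne", one of three words is wrong, so `wer` returns 1/3, while `cer` returns 1/14 because only one of the fourteen reference characters differs. Scores reported over a test set are usually computed over the whole corpus after text normalization, so per-sentence values from this sketch are not directly comparable to the percentages above.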
Training
First, the XLSR model was further pre-trained for 50 epochs on a corpus of 1000 hours of spoken Swedish from various radio stations. Second, NST Swedish Dictation and Common Voice were used together for fine-tuning. Finally, only the Common Voice dataset was used for the last round of fine-tuning. The Fairseq scripts were used.
- Name: wav2vec2-large-xlsr-53-swedish
- Model Type ID: Audio To Text
- Description: Audio transcription model for converting Swedish audio to Swedish text
- Last Updated: Jun 28, 2022
- Privacy: PUBLIC