asr-wav2vec2-large-xlsr-53-swedish

Audio transcription model for converting Swedish audio to Swedish text

Notes

Hugging Face model ID: KBLab/wav2vec2-large-xlsr-53-swedish

Wav2Vec2-Large-XLSR-53-Swedish

This model is facebook/wav2vec2-large-xlsr-53 fine-tuned on Swedish using the NST Swedish Dictation dataset. When using this model, make sure that your speech input is sampled at 16 kHz.
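The 16 kHz requirement can be checked, and corrected where needed, before transcription. The following is a minimal sketch, not the authors' own script: it assumes the input is a PCM WAV file and that torch, torchaudio and transformers are installed. The helper names (wav_sample_rate, needs_resampling, transcribe) are hypothetical and chosen for illustration only.

```python
import wave

MODEL_ID = "KBLab/wav2vec2-large-xlsr-53-swedish"  # model ID from the notes above
TARGET_RATE = 16_000  # sampling rate the model expects, in Hz


def wav_sample_rate(path):
    """Read the sampling rate from a PCM WAV header (standard library only)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target_rate=TARGET_RATE):
    """Return True when the file is not already sampled at the target rate."""
    return wav_sample_rate(path) != target_rate


def transcribe(path):
    """Transcribe one audio file, resampling to 16 kHz when necessary.

    Assumption: torch, torchaudio and transformers are installed. They are
    imported here so the two helpers above work without them.
    """
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

    waveform, rate = torchaudio.load(path)  # shape: (channels, frames)
    if rate != TARGET_RATE:
        waveform = torchaudio.functional.resample(waveform, rate, TARGET_RATE)
    speech = waveform.mean(dim=0).numpy()  # downmix to a mono 1-D array

    inputs = processor(speech, sampling_rate=TARGET_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)  # greedy CTC decoding
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe("speech.wav") on a local recording (the file name is a placeholder) downloads the model on first use and returns the predicted Swedish text as a string.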

Evaluation

The model can be evaluated on the Swedish test data of Common Voice.

WER: 14.298610%

CER: 4.925294%
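For readers unfamiliar with the two metrics, word error rate (WER) and character error rate (CER) are both edit distances normalized by the reference length. The sketch below shows one common way to compute them in plain Python. It is an illustration under stated assumptions, not the evaluation script behind the figures above: it pools errors over the whole corpus, counts spaces as characters for CER, and applies no text normalization. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level word error rate in percent."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    total_words = sum(len(r.split()) for r in references)
    return 100.0 * errors / total_words


def cer(references, hypotheses):
    """Corpus-level character error rate in percent (spaces included)."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return 100.0 * errors / total_chars


if __name__ == "__main__":
    refs = ["jag heter anna"]  # made-up reference transcript
    hyps = ["jag heter anne"]  # made-up model output
    print(f"WER: {wer(refs, hyps):.2f}%  CER: {cer(refs, hyps):.2f}%")
```

Pooling errors across all utterances, rather than averaging per-utterance rates, keeps short utterances from dominating the score.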

Training

First, the XLSR model was further pre-trained for 50 epochs on a corpus of 1000 hours of spoken Swedish from various radio stations. Second, the model was fine-tuned on NST Swedish Dictation together with Common Voice. Finally, it was fine-tuned on the Common Voice dataset alone. The Fairseq scripts were used.

  • ID
  • Name
    wav2vec2-large-xlsr-53-swedish
  • Model Type ID
    Audio To Text
  • Description
    Audio transcription model for converting Swedish audio to Swedish text
  • Last Updated
    Jun 28, 2022
  • Privacy
    PUBLIC
  • Use Case
  • Toolkit
  • License