Wav2Vec2-Large-XLSR-53-Turkish

This model is facebook/wav2vec2-large-xlsr-53 fine-tuned on Turkish using the Common Voice dataset. When using this model, make sure that your speech input is sampled at 16 kHz.
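Because the model expects 16 kHz input, it can help to check a recording's sampling rate before transcription. The following is a minimal sketch using only Python's standard `wave` module; it assumes the audio is an uncompressed WAV file, and the helper names `wav_sample_rate` and `needs_resampling` are hypothetical, not part of the model's tooling.

```python
import wave

# Sampling rate the model expects, per the note above.
TARGET_RATE = 16_000


def wav_sample_rate(path):
    """Return the sampling rate (in Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target_rate=TARGET_RATE):
    """Return True if the file must be resampled before being given to the model."""
    return wav_sample_rate(path) != target_rate
```

If `needs_resampling` returns True, the audio should be resampled to 16 kHz with an audio library of your choice before it is passed to the model.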

Evaluation

The model can be evaluated on the Turkish test data of Common Voice.

Test Result (WER): 17.46%
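Assuming the reported test result is a word error rate (the usual metric for speech recognition; the text here does not name it), the sketch below shows how such a figure is computed for a single sentence pair: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical, and this is not the evaluation script used for the reported number.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur

    return prev[-1] / len(ref_words)
```

For example, `word_error_rate("bugün hava çok güzel", "bugün hava güzel")` is 0.25: one deleted word out of four reference words. A corpus-level score would sum the edit distances and reference lengths over all test sentences before dividing.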

Training

The unicode_tr package is used to convert sentences to lower case, since Python's built-in lower() does not handle the Turkish dotted and dotless I correctly. Because training data for Turkish is very limited, all available data is used with a K-Fold (k=5) training approach, and the best of the 5 resulting models is uploaded.

Training arguments:

--num_train_epochs="30" \
--per_device_train_batch_size="32" \
--evaluation_strategy="steps" \
--activation_dropout="0.055" \
--attention_dropout="0.094" \
--feat_proj_dropout="0.04" \
--hidden_dropout="0.047" \
--layerdrop="0.041" \
--learning_rate="2.34e-4" \
--mask_time_prob="0.082" \
--warmup_steps="250"

All trainings took ~20 hours on a GeForce RTX 3090 graphics card.
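The two ideas in this section, Turkish-aware lowercasing and a 5-fold split, can be sketched in plain Python. This is an illustration under stated assumptions, not the training code: `turkish_lower` is a hypothetical stand-in for what unicode_tr provides (it does not use that package's API), and `k_fold_splits` assumes contiguous, unshuffled folds, which the text does not specify.

```python
def turkish_lower(text):
    """Lowercase text using Turkish rules for the dotted and dotless I."""
    # Plain str.lower() maps "I" to "i" and "İ" to "i" plus a combining dot;
    # Turkish needs "I" -> "ı" and "İ" -> "i", so map those two first.
    return text.replace("I", "ı").replace("İ", "i").lower()


def k_fold_splits(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size
```

With this helper, `"ISPARTA".lower()` gives "isparta" while `turkish_lower("ISPARTA")` gives "ısparta". Each sample appears in exactly one validation fold, so training once per fold uses all of the data for both training and validation across the five runs.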

Name: wav2vec2-large-xlsr-53-turkish
Model Type ID: Audio To Text
Description: Audio transcription model for converting Turkish audio to Turkish text
Last Updated: Jun 28, 2022
Privacy: PUBLIC