Fine-Tuning ECAPA-TDNN For Turkish Speaker Verification
Date
2024
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Institute of Electrical and Electronics Engineers Inc.
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Compared to Turkish speech databases, English speech databases are significantly larger and feature many more speakers. For Turkish automatic speaker verification (ASV) systems, this creates a trade-off between data adequacy and language match. This paper explores the trade-off by comparing three approaches built on the state-of-the-art ECAPA-TDNN model: using the pre-trained English ECAPA-TDNN model as is, training the ECAPA-TDNN model from scratch on the Turkish Common Voice dataset, and fine-tuning the pre-trained English ECAPA-TDNN model with Turkish data. Experimental results show that the pre-trained English ECAPA-TDNN model outperforms both the model trained from scratch on Turkish data and the fine-tuned model in terms of equal error rate (EER). However, the fine-tuning approach achieves the best performance according to the minimum detection cost function (min-DCF) metric when security is prioritized over user convenience. © 2024 IEEE.
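The two metrics named in the abstract can be illustrated with a minimal sketch. It is not the paper's evaluation code: the function names, the toy scores, and the cost parameters (`p_target`, `c_miss`, `c_fa`) are assumptions chosen for illustration, and the operating point used in the paper is not stated in this record. A trial is accepted when its score is at or above the threshold. EER is the error rate at the threshold where false rejections and false acceptances balance, while min-DCF is the lowest weighted cost over all thresholds, normalized by the cost of a trivial accept-all or reject-all system.

```python
import numpy as np

def error_rates(scores, labels):
    """Sweep thresholds; return false-rejection and false-acceptance rates.

    labels: 1 = target trial (same speaker), 0 = non-target trial.
    A trial is accepted when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    thresholds = np.unique(scores)
    # Extra threshold above the maximum score covers the "reject all" case.
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)
    target = scores[labels == 1]
    nontarget = scores[labels == 0]
    frr = np.array([(target < t).mean() for t in thresholds])
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    return frr, far

def equal_error_rate(scores, labels):
    """Error rate at the threshold where FRR and FAR are closest."""
    frr, far = error_rates(scores, labels)
    i = np.argmin(np.abs(frr - far))
    return (frr[i] + far[i]) / 2

def min_dcf(scores, labels, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Normalized minimum detection cost (parameter values are assumptions)."""
    frr, far = error_rates(scores, labels)
    dcf = c_miss * p_target * frr + c_fa * (1 - p_target) * far
    # Cost of the better trivial system: reject all or accept all.
    trivial = min(c_miss * p_target, c_fa * (1 - p_target))
    return dcf.min() / trivial

# Toy similarity scores, invented for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
toy_dcf = min_dcf(toy_scores, toy_labels, p_target=0.01)
```

A small `p_target` (or a large `c_fa`) makes false acceptances expensive relative to false rejections, which is one way to express a setting where security is prioritized over user convenience. Because the two metrics weight the error types differently, a system can rank best on one and not on the other.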
Description
8th International Artificial Intelligence and Data Processing Symposium, IDAP 2024 -- 2024-09-21 through 2024-09-22 -- Malatya -- 203423
Keywords
automatic speaker verification, fine-tuning
Source
WoS Q Value
Scopus Q Value
N/A