Fine-Tuning ECAPA-TDNN For Turkish Speaker Verification

Date

2024

Publisher

Institute of Electrical and Electronics Engineers Inc.

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Compared to Turkish speech databases, English speech databases are significantly larger, featuring many more speakers. This creates a trade-off between data adequacy and language match for Turkish automatic speaker verification (ASV) systems. This paper explores this trade-off by comparing three approaches using the state-of-the-art ECAPA-TDNN model: using the pre-trained English ECAPA-TDNN model, training the ECAPA-TDNN model from scratch on the Turkish Common Voice dataset, and fine-tuning the pre-trained English ECAPA-TDNN model with Turkish data. Experimental results reveal that the pre-trained English ECAPA-TDNN model outperforms both the model trained from scratch on Turkish data and the fine-tuned model in terms of equal error rate (EER). However, the fine-tuning approach achieves the best performance according to the minimum detection cost function (min-DCF) metric when security is prioritized over user convenience. © 2024 IEEE.
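
The two metrics named in the abstract can rank systems differently, so a small illustration may help. The following is a minimal sketch, not the authors' evaluation code: it computes EER and a normalized min-DCF from lists of target and non-target trial scores using NumPy. The function names and the cost parameters (p_target, c_miss, c_fa) are assumptions chosen for illustration; the record does not state which values the paper used. A low p_target, as assumed here, weights false acceptances heavily, which corresponds to prioritizing security over user convenience.

```python
import numpy as np


def error_rates(target_scores, nontarget_scores):
    """Return false-rejection and false-acceptance rates over all candidate thresholds.

    A trial is accepted when its score is >= the threshold.
    """
    tar = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([tar, non]))
    # Add one threshold above every score so that "reject everything" is covered.
    thresholds = np.concatenate([thresholds, [thresholds[-1] + 1.0]])
    frr = np.array([(tar < t).mean() for t in thresholds])   # miss rate
    far = np.array([(non >= t).mean() for t in thresholds])  # false-alarm rate
    return frr, far


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER as the point where FRR and FAR are closest."""
    frr, far = error_rates(target_scores, nontarget_scores)
    i = np.argmin(np.abs(frr - far))
    return (frr[i] + far[i]) / 2.0


def min_dcf(target_scores, nontarget_scores, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Normalized minimum detection cost.

    p_target, c_miss and c_fa are illustrative assumptions, not values from the paper.
    """
    frr, far = error_rates(target_scores, nontarget_scores)
    dcf = c_miss * p_target * frr + c_fa * (1.0 - p_target) * far
    # Normalize by the cost of the best trivial system (accept all or reject all).
    norm = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return dcf.min() / norm


if __name__ == "__main__":
    # Hypothetical similarity scores, for demonstration only.
    tar = [0.9, 0.4]
    non = [0.1, 0.6]
    print("EER:", equal_error_rate(tar, non))
    print("min-DCF:", min_dcf(tar, non))
```

Because min-DCF is evaluated at a single cost-weighted operating point while EER is evaluated where the two error rates meet, a system can have a higher EER and still achieve a lower min-DCF, which is consistent with the pattern the abstract reports.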

Description

8th International Artificial Intelligence and Data Processing Symposium, IDAP 2024 -- 2024-09-21 through 2024-09-22 -- Malatya -- 203423

Keywords

automatic speaker verification, fine-tuning

Scopus Q Value

N/A