AN EXPERIMENTAL STUDY ON AUDIO REPLAY ATTACK DETECTION USING DEEP NEURAL NETWORKS
dc.contributor.author | Bakar, Bekir | |
dc.contributor.author | Hanilçi, Cemal | |
dc.date.accessioned | 2021-03-20T20:13:25Z | |
dc.date.available | 2021-03-20T20:13:25Z | |
dc.date.issued | 2018 | |
dc.department | BTÜ, Faculty of Engineering and Natural Sciences, Department of Electrical and Electronics Engineering | en_US |
dc.description | IEEE Workshop on Spoken Language Technology (SLT) -- DEC 18-21, 2018 -- Athens, GREECE | en_US |
dc.description.abstract | Automatic speaker verification (ASV) systems can be easily spoofed by previously recorded speech, synthesized speech, and speech signals artificially generated by voice conversion techniques. To increase the reliability of ASV systems, detecting spoofing attacks, that is, deciding whether a given speech signal is genuine or spoofed, plays an important role. In this paper, we consider the detection of replay attacks, which are the most accessible attack type against ASV systems. To this end, we utilize a deep neural network (DNN) based classifier using features extracted from the long-term average spectrum (LTAS). The experiments are conducted on the latest edition of the Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) database. The results are compared with the ASVspoof 2017 baseline system, which consists of a Gaussian mixture model (GMM) classifier with a constant-Q transform cepstral coefficients (CQCC) front-end, as well as with a GMM using standard mel-frequency cepstral coefficients (MFCC) features. Experimental results reveal that the DNN considerably outperforms the well-known and successful GMM classifier. LTAS-based features are found to be superior to CQCC and MFCC in terms of equal error rate (EER). Finally, we find that high-frequency components convey much more discriminative information for replay attack detection, independent of features and classifiers. | en_US |
dc.description.sponsorship | Inst Elect & Elect Engineers, IEEE Signal Proc Soc | en_US |
dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TUBITAK) [115E916] | en_US |
dc.description.sponsorship | This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) (project no. 115E916). | en_US |
dc.identifier.endpage | 138 | en_US |
dc.identifier.isbn | 978-1-5386-4334-1 | |
dc.identifier.issn | 2639-5479 | |
dc.identifier.scopusquality | N/A | en_US |
dc.identifier.startpage | 132 | en_US |
dc.identifier.uri | https://hdl.handle.net/20.500.12885/865 | |
dc.identifier.wos | WOS:000463141800020 | en_US |
dc.identifier.wosquality | N/A | en_US |
dc.indekslendigikaynak | Web of Science | en_US |
dc.indekslendigikaynak | Scopus | en_US |
dc.institutionauthor | Bakar, Bekir | |
dc.language.iso | en | en_US |
dc.publisher | IEEE | en_US |
dc.relation.ispartof | 2018 IEEE Workshop on Spoken Language Technology (SLT 2018) | en_US |
dc.relation.ispartofseries | IEEE Workshop on Spoken Language Technology | |
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | speaker verification | en_US |
dc.subject | replay attack detection | en_US |
dc.subject | deep neural networks | en_US |
dc.subject | countermeasures | en_US |
dc.title | AN EXPERIMENTAL STUDY ON AUDIO REPLAY ATTACK DETECTION USING DEEP NEURAL NETWORKS | en_US |
dc.type | Conference Object | en_US |