Toward robust replay attack detection in Automatic Speaker Verification: A study of spectrum estimation and channel magnitude response modeling

Bekiryazici, Sule; Hanilci, Cemal; Ozcan, Neyir

dc.contributor.author	Bekiryazici, Sule
dc.contributor.author	Hanilci, Cemal
dc.contributor.author	Ozcan, Neyir
dc.date.accessioned	2026-02-08T15:15:11Z
dc.date.available	2026-02-08T15:15:11Z
dc.date.issued	2026
dc.department	Bursa Teknik Üniversitesi
dc.description.abstract	Automatic Speaker Verification (ASV) systems are increasingly adopted for biometric authentication but remain highly vulnerable to spoofing, particularly replay attacks. Existing countermeasures (CMs) for replay attack detection rely predominantly on discrete Fourier transform (DFT)-based spectral features, which are sensitive to the noise and channel distortions common in physical access (PA) scenarios. This work presents the first comprehensive study of Channel Magnitude Response (CMR) representations for replay detection, explicitly analyzing the impact of spectrum estimation and feature design. The contributions of this work are fourfold: (i) CMR estimation is generalized beyond MFCCs to LFCC and CQCC features, with LFCC-based CMRs offering superior discrimination; (ii) alternative spectrum estimators, namely linear prediction (LP) and multitaper (MT), are integrated into the CMR pipeline, yielding substantial gains over the conventional DFT; (iii) robustness is investigated under silence-free (voiced-only) conditions, mitigating known biases in ASVspoof datasets; and (iv) a systematic evaluation of CMR is provided on the recently released ReplayDF corpus, a challenging benchmark combining replay and synthetic speech variability. Experiments on ASVspoof 2017, 2019, 2021, and ReplayDF using both baseline classifiers (ResNet18 and LCNN) and stronger models (Res2Net50 and SE-Res2Net50) show that the proposed approach consistently outperforms conventional features. In particular, LFCC-CMR features with LP spectra achieve an Equal Error Rate (EER) as low as 1.34% on ASVspoof 2019 (PA), representing considerable relative improvements over traditional methods. Moreover, CMR-based systems retain high performance even when silent segments are removed, unlike conventional approaches. These results establish CMR with principled spectral modeling as a robust and generalizable framework for replay attack detection, opening new directions for resilient spoofing countermeasures.
dc.description.sponsorship	Scientific and Technological Research Council of Turkiye (TUBITAK) [123E384]
dc.description.sponsorship	This study was supported by Scientific and Technological Research Council of Turkiye (TUBITAK) under Grant Number 123E384. The authors thank TUBITAK for their support.
dc.identifier.doi	10.1016/j.csl.2025.101906
dc.identifier.issn	0885-2308
dc.identifier.issn	1095-8363
dc.identifier.scopus	2-s2.0-105023580455
dc.identifier.scopusquality	Q1
dc.identifier.uri	https://doi.org/10.1016/j.csl.2025.101906
dc.identifier.uri	https://hdl.handle.net/20.500.12885/5652
dc.identifier.volume	98
dc.identifier.wos	WOS:001632372400001
dc.identifier.wosquality	Q2
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	Academic Press Ltd- Elsevier Science Ltd
dc.relation.ispartof	Computer Speech and Language
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/closedAccess
dc.snmz	WOS_KA_20260207
dc.subject	Spoofing countermeasures
dc.subject	Replay attack detection
dc.subject	Blind channel magnitude response
dc.subject	Spectrum estimation
dc.title	Toward robust replay attack detection in Automatic Speaker Verification: A study of spectrum estimation and channel magnitude response modeling
dc.type	Article

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
