Source cell-phone recognition from recorded speech using non-speech segments

dc.authorid0000-0002-9174-0367en_US
dc.contributor.authorHanilçi, Cemal
dc.contributor.authorKinnunen, Tomi
dc.date.accessioned2021-03-20T20:15:24Z
dc.date.available2021-03-20T20:15:24Z
dc.date.issued2014
dc.departmentBTÜ, Faculty of Engineering and Natural Sciences, Department of Electrical and Electronics Engineeringen_US
dc.description.abstractIn a recent study, we introduced the problem of identifying cell-phones from recorded speech and showed that speech signals convey information about the source device, making it possible to identify the source with some accuracy. In this paper, we consider recognizing source cell-phone microphones using the non-speech segments of recorded speech. Taking an information-theoretic approach, we use Gaussian mixture models (GMMs) trained with the maximum mutual information (MMI) criterion to represent device-specific features. Experimental results using Mel-frequency and linear frequency cepstral coefficients (MFCC and LFCC) show that features extracted from the non-speech segments of speech contain higher mutual information and yield higher recognition rates than those from the speech portions or the whole utterance. The identification rate improves from 96.42% to 98.39% and the equal error rate (EER) drops from 1.20% to 0.47% when non-speech parts are used to extract features. Recognition results are reported for classical GMMs trained with both the maximum likelihood (ML) and MMI criteria, as well as for support vector machines (SVMs). Identification under additive noise is also considered, and identification rates are shown to drop dramatically in that case. © 2014 Elsevier Inc. All rights reserved.en_US
dc.description.sponsorshipAcademy of Finland; European Commission [253120, 283256]en_US
dc.description.sponsorshipThe work was partially funded by Academy of Finland (projects 253120 and 283256). The authors would like to thank the anonymous reviewers for their detailed feedback.en_US
dc.identifier.doi10.1016/j.dsp.2014.08.008en_US
dc.identifier.endpage85en_US
dc.identifier.issn1051-2004
dc.identifier.issn1095-4333
dc.identifier.scopusqualityQ2en_US
dc.identifier.startpage75en_US
dc.identifier.urihttp://doi.org/10.1016/j.dsp.2014.08.008
dc.identifier.urihttps://hdl.handle.net/20.500.12885/1189
dc.identifier.volume35en_US
dc.identifier.wosWOS:000344827700008en_US
dc.identifier.wosqualityQ2en_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.institutionauthorHanilçi, Cemal
dc.language.isoenen_US
dc.publisherAcademic Press Inc Elsevier Scienceen_US
dc.relation.ispartofDigital Signal Processingen_US
dc.relation.publicationcategoryArticle - International Peer-Reviewed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/openAccessen_US
dc.subjectSource cell-phone recognitionen_US
dc.subjectMel-frequency cepstrum coefficientsen_US
dc.subjectMutual informationen_US
dc.subjectSource microphone identificationen_US
dc.subjectGaussian mixture modelen_US
dc.titleSource cell-phone recognition from recorded speech using non-speech segmentsen_US
dc.typeArticleen_US

Files

Original bundle
Name: Hanilci-2014-Source-cell-phone-recognition-from-.pdf
Size: 532.52 KB
Format: Adobe Portable Document Format
Description: Full Text