
Browsing by Author "Buker, Aykut"

Showing 1 - 3 of 3
  • Double Compressed Wideband AMR Speech Detection Using Deep Neural Networks
    (Springer Birkhauser, 2024) Buker, Aykut; Hanilci, Cemal
    Detecting double compressed (DC) speech signals is an important audio forensics task, since it bears directly on the integrity and authenticity of a recording. The adaptive multi-rate (AMR) speech codec is a popular compression technique optimized specifically for speech, and it is a standard audio recording format on the vast majority of smartphones. All previous studies on DC AMR detection report findings for speech compressed with the narrowband AMR codec (AMR-NB). The wideband AMR codec (AMR-WB) has meanwhile been adopted by several mobile phone manufacturers, yet DC AMR-WB detection performance remains unknown. To the best of our knowledge, this is the first study focusing on detecting DC signals compressed using the AMR-WB speech codec. To this end, we propose three deep neural network-based DC AMR-WB detection systems that take spectrogram representations of the speech signals as input features. Experiments conducted on the TIMIT database yield several important findings. First, DC AMR-WB detection is a more challenging task than DC AMR-NB detection: for example, the convolutional neural network (CNN)-based system yields detection rates of 74.83% and 99.93% on AMR-WB and AMR-NB coded signals, respectively. Second, capturing temporal information with a long short-term memory (LSTM) network gives a DC AMR-WB detection accuracy of 86.25%, outperforming the CNN system. Third, combining the deep feature representations learned by the CNN and LSTM networks further improves performance. Fourth, detection rates deteriorate when the signals are first encoded using different audio codecs prior to AMR-WB compression. Finally, applying score-level or decision-level fusion to the three proposed systems generally improves the detection rates.
  • Evaluating Parameter Sharing for Spoofing-Aware Speaker Verification: A Case Study on the ASVspoof 5 Dataset
    (Isca-Int Speech Communication Assoc, 2025) Buker, Aykut; Kurnaz, Oguzhan; Bekiryazici, Yule; Demirtac, Selim Can; Hanilci, Cemal
    Spoofing-aware speaker verification (SASV) is an important but challenging task and has been a primary focus of the recently organized ASVspoof 5 challenge. As SASV integrates automatic speaker verification (ASV) and countermeasure (CM) systems, its performance depends on the effectiveness of each system. This study systematically examines the impact of different parameter-sharing (PS) strategies, which facilitate joint optimization, on SASV performance using the ASVspoof 5 dataset. Experimental results indicate that PS enhances performance for specific attack types and codec conditions. For example, the baseline system achieves a min a-DCF of 0.329 on the A26 attack, which improves to 0.233 with PS. Similarly, for AMR-compressed signals, PS yields a 14.09% performance gain. These observations show that PS techniques are effective in mitigating certain spoofing attacks and improving robustness to degraded audio conditions in SASV systems.
  • Exploring the Effectiveness of the Phase Features on Double Compressed AMR Speech Detection
    (Mdpi, 2024) Buker, Aykut; Hanilci, Cemal
    Determining whether an audio signal is single compressed (SC) or double compressed (DC) is a crucial task in audio forensics, as it is closely linked to the integrity of the recording. In this paper, we propose phase spectrum-based features for detecting DC narrowband and wideband adaptive multi-rate (AMR-NB and AMR-WB) speech. To the best of our knowledge, phase spectrum features have not previously been explored for DC audio detection. We also propose a novel parallel LSTM system that simultaneously learns the most representative features from both the magnitude and phase spectra of the speech signal and integrates the two sets of information to further enhance performance. Analyses demonstrate significant differences between the phase spectra of SC and DC speech signals, suggesting their potential as representative features for DC AMR speech detection. The proposed phase spectrum features perform as well as magnitude spectrum features for the AMR-NB codec and outperform them in detecting AMR-WB speech. The proposed phase spectrum features yield an 8% improvement in true positive rate over the magnitude spectrogram features. The proposed parallel LSTM system further improves DC AMR-WB speech detection.
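
The magnitude and phase spectrum features mentioned in the last abstract can be illustrated with a short-time Fourier transform. The following is a minimal sketch, not the authors' implementation: the function names `stft_features` and `parallel_input`, the Hamming window, and the frame settings (400-sample frames, 160-sample hop, 512-point FFT, which correspond to 25 ms and 10 ms at an assumed 16 kHz sampling rate) are all assumptions chosen for illustration.

```python
import numpy as np


def stft_features(signal, frame_len=400, hop=160, n_fft=512):
    """Return (log_magnitude, phase), each of shape (n_frames, n_fft // 2 + 1).

    All default parameter values are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Pad very short inputs so that at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hamming(frame_len)
    # Build an index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * window
    # One-sided FFT of each windowed frame (zero-padded to n_fft points).
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    log_magnitude = np.log(np.abs(spectrum) + 1e-10)  # small offset avoids log(0)
    phase = np.angle(spectrum)  # radians, in [-pi, pi]
    return log_magnitude, phase


def parallel_input(signal):
    """Stack both views so a hypothetical two-branch model could consume them."""
    log_magnitude, phase = stft_features(signal)
    return np.stack([log_magnitude, phase], axis=0)
```

Under these assumptions, a two-branch network would read the first slice of the stacked array (log-magnitude) in one branch and the second slice (phase) in the other, then merge the learned representations before classification.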

| Bursa Teknik Üniversitesi | Library | Open Access Policy | Guide | OAI-PMH |

This site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


Mimar Sinan Mahallesi, Mimar Sinan Bulvarı, Eflak Caddesi, No: 177, 16310, Yıldırım, Bursa, Türkiye

DSpace 7.6.1, Powered by İdeal DSpace

DSpace software copyright © 2002-2026 LYRASIS
