A Near-Real Time Automatic Audio Classification: Special Case for Hacivat and Karagoz Shadow Play

Sevdi, Onur Eren; Topaloglu, Yakup; Kayaarma, Selma Yilmazyildiz

dc.contributor.author	Sevdi, Onur Eren
dc.contributor.author	Topaloglu, Yakup
dc.contributor.author	Kayaarma, Selma Yilmazyildiz
dc.date.accessioned	2026-02-08T15:11:11Z
dc.date.available	2026-02-08T15:11:11Z
dc.date.issued	2025
dc.department	Bursa Teknik Üniversitesi
dc.description	2025 Innovations in Intelligent Systems and Applications Conference, ASYU 2025 -- 2025-09-10 through 2025-09-12 -- Bursa -- 214381
dc.description.abstract	Speaker identification plays a key role in applications such as security, biometrics, and human-computer interaction. As a specific task within audio classification, speaker identification aims to recognize individuals based on their voice characteristics. This paper compares three widely adopted neural network architectures and evaluates their performance as classifiers for a real-time speaker identification system. A custom dataset was collected from publicly shared YouTube videos of a single speaker imitating multiple characters from the traditional Turkish shadow play Karagoz and Hacivat. Both MFCC and Log-Mel filterbank energy features were used to train the CRNN, 2D-CNN and Bi-LSTM architectures. Among these architectures, 2D-CNN achieved the highest accuracy, 94.4%, and was approximately 2.7 times faster than its closest follower, Bi-LSTM, during real-time testing on an RTX 4070 Super GPU. © 2025 IEEE.
dc.identifier.doi	10.1109/ASYU67174.2025.11208275
dc.identifier.isbn	9798331597276
dc.identifier.scopus	2-s2.0-105022444330
dc.identifier.scopusquality	N/A
dc.identifier.uri	https://doi.org/10.1109/ASYU67174.2025.11208275
dc.identifier.uri	https://hdl.handle.net/20.500.12885/5295
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.relation.publicationcategory	Conference Item - International - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/closedAccess
dc.snmz	Scopus_KA_20260207
dc.subject	Audio Classification
dc.subject	Bidirectional Long Short Term Memory
dc.subject	Convolutional Neural Networks
dc.subject	Convolutional Recurrent Neural Networks
dc.subject	Speaker Identification
dc.title	A Near-Real Time Automatic Audio Classification: Special Case for Hacivat and Karagoz Shadow Play
dc.type	Conference Object

Collection

Scopus Indexed Publications Collection
