A Near-Real Time Automatic Audio Classification: Special Case for Hacivat and Karagoz Shadow Play
| dc.contributor.author | Sevdi, Onur Eren | |
| dc.contributor.author | Topaloglu, Yakup | |
| dc.contributor.author | Kayaarma, Selma Yilmazyildiz | |
| dc.date.accessioned | 2026-02-08T15:11:11Z | |
| dc.date.available | 2026-02-08T15:11:11Z | |
| dc.date.issued | 2025 | |
| dc.department | Bursa Teknik Üniversitesi | |
| dc.description | 2025 Innovations in Intelligent Systems and Applications Conference, ASYU 2025 -- 2025-09-10 through 2025-09-12 -- Bursa -- 214381 | |
| dc.description.abstract | Speaker identification plays a key role in applications such as security, biometrics, and human-computer interaction. As a specific task within audio classification, speaker identification aims to recognize individuals based on their voice characteristics. This paper compares three widely adopted neural network architectures and evaluates their performance as classifiers for a real-time speaker identification system. A custom dataset was collected from publicly shared YouTube videos of a single speaker imitating multiple characters from the traditional Turkish shadow play Karagoz and Hacivat. Both MFCC and log-Mel filterbank energy features were used to train CRNN, 2D-CNN, and Bi-LSTM architectures. Among these, the 2D-CNN achieved the highest accuracy, 94.4%, and was approximately 2.7 times faster than its closest follower, the Bi-LSTM, during real-time testing on an RTX 4070 Super GPU. © 2025 IEEE. | |
| dc.identifier.doi | 10.1109/ASYU67174.2025.11208275 | |
| dc.identifier.isbn | 9798331597276 | |
| dc.identifier.scopus | 2-s2.0-105022444330 | |
| dc.identifier.scopusquality | N/A | |
| dc.identifier.uri | https://doi.org/10.1109/ASYU67174.2025.11208275 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12885/5295 | |
| dc.indekslendigikaynak | Scopus | |
| dc.language.iso | en | |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | |
| dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.snmz | Scopus_KA_20260207 | |
| dc.subject | Audio Classification | |
| dc.subject | Bidirectional Long Short Term Memory | |
| dc.subject | Convolutional Neural Networks | |
| dc.subject | Convolutional Recurrent Neural Networks | |
| dc.subject | Speaker Identification | |
| dc.title | A Near-Real Time Automatic Audio Classification: Special Case for Hacivat and Karagoz Shadow Play | |
| dc.type | Conference Object |












