Pafan Doungpaisan
King Mongkut’s University of Technology

From audio to image: gunshot classification using Mel spectrogram convolutional neural networks
Peerapol Khunarsa; Pafan Doungpaisan
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 15, No 3: June 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v15.i3.pp2166-2180

Abstract

Accurate identification of firearm types from acoustic signals is essential for modern public safety and forensic applications. Traditional gunshot analysis methods often rely on physical evidence or handcrafted audio features, which can be unreliable under noisy and reverberant conditions. This study presents a systematic investigation of gunshot sound classification using Mel spectrogram representations and convolutional neural networks (CNNs). Raw audio signals are transformed into Mel spectrogram images, enabling firearm classification to be formulated as an image recognition problem. Thirteen CNN architectures, ranging from lightweight to deep models, are evaluated under a unified experimental protocol to analyze both classification performance and computational efficiency. Experiments are conducted on a publicly available multi-firearm dataset recorded in semi-controlled real-world environments. The results demonstrate that Mel spectrogram–based CNN models achieve classification accuracy exceeding 94%, while moderate-complexity architectures provide a favorable balance between accuracy and efficiency. The findings highlight the importance of representation–architecture alignment and offer practical design guidelines for selecting deployable CNN models in real-time gunshot detection systems.
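
The audio-to-image step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the sampling rate, FFT size, hop length, number of Mel bands, the HTK-style Mel formula, and the helper names (`mel_filterbank`, `mel_spectrogram_db`, `to_image`) are all assumptions chosen for illustration, and a synthetic decaying noise burst stands in for a real gunshot recording.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style Mel scale (an assumption; other variants exist)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters with centres evenly spaced on the Mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mel_spectrogram_db(signal, sr=22050, n_fft=1024, hop=256, n_mels=64):
    """Return a log-power Mel spectrogram of shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T   # (n_frames, n_mels)
    mel_db = 10.0 * np.log10(np.maximum(mel, 1e-10))    # floor avoids log(0)
    return mel_db.T

def to_image(mel_db):
    """Min-max scale the dB values to an 8-bit grayscale image."""
    lo, hi = mel_db.min(), mel_db.max()
    scaled = (mel_db - lo) / (hi - lo + 1e-12) * 255.0
    return np.round(scaled).astype(np.uint8)

# Synthetic stand-in for a gunshot: one second of exponentially decaying noise.
sr = 22050
rng = np.random.default_rng(0)
t = np.arange(sr) / sr
signal = rng.standard_normal(sr) * np.exp(-20.0 * t)

img = to_image(mel_spectrogram_db(signal, sr=sr))
print(img.shape, img.dtype)
```

The resulting 2-D array can be saved or resized like any grayscale image and passed to a CNN, which is what allows firearm classification to be treated as an image recognition problem.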