Rapid detection of emergency vehicle sirens is critical for road safety and traffic management. This study proposes an automated classification system for ambulance sirens based on a Convolutional Neural Network (CNN). The method uses Mel-Frequency Cepstral Coefficients (MFCC) to transform audio signals into 2D feature maps, allowing the model to capture distinct spectral and temporal patterns. The dataset was partitioned with a stratified split to preserve class proportions across subsets and to guard against data leakage. Experimental results show that the CNN model achieves an accuracy of 0.95, significantly outperforming the baseline models, a Multi-Layer Perceptron (MLP) and XGBoost. Detailed evaluation with a confusion matrix yields precision, recall, and F1-score values of 0.95 each, indicating that the model is robust in distinguishing sirens from complex urban noise. The Adam optimizer and an early stopping mechanism supported stable convergence and helped limit overfitting. These findings suggest that the proposed CNN-MFCC framework provides a reliable solution for real-time emergency signal detection and contributes to intelligent transportation systems.
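To make the reported metrics concrete, the following minimal sketch shows how accuracy, precision, recall, and F1-score follow from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the study's confusion matrix.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 for the positive (siren) class.

    tp, fp, fn, tn are the cells of a 2x2 confusion matrix.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted sirens that really are sirens.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of real sirens that the model detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts (assumption, not data from the study).
metrics = binary_metrics(tp=95, fp=5, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, false positives equal false negatives, so precision and recall coincide and F1 takes the same value. This is one way all three metrics can agree, as in the figures reported above.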
Copyright © 2026