Journal of Information Technology and Computer Science
Vol. 11 No. 1: April 2026

Comparing Audio and Visual Transfer Learning for Environmental Sound Classification

Sugianta, I Kadek Arya (Unknown)



Article Info

Publish Date
25 May 2026

Abstract

Environmental Sound Classification (ESC) is hampered by data scarcity and by the high variability of unstructured acoustic signals. This study evaluates a visual transfer learning approach in which audio signals are converted into Mel-spectrogram representations and classified with computer vision architectures. A comparative study on the ESC-50 dataset benchmarks visual models (EfficientNet-B0, ResNet-50) against a specialized audio model, Pre-trained Audio Neural Networks (PANNs). EfficientNet-B0 trained with MixUp augmentation achieved the best performance, with 83.33% accuracy and an 83.50% F1-score, outperforming ResNet-50 (80.00%) and substantially surpassing the PANNs (Cnn14) model, which reached only 66.33%. The underperformance of PANNs indicates over-parameterization relative to the small dataset. Further validation with Gradient-weighted Class Activation Mapping (Grad-CAM) showed that EfficientNet-B0 learned semantically relevant features, distinguishing active sound patterns from silence and background noise. These findings confirm that, in data-constrained scenarios, lightweight visual architectures transfer and generalize better than large audio models.
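
The abstract names MixUp augmentation but gives no details of how it was applied. As a rough illustration only, the sketch below shows the general MixUp idea on two spectrogram-like arrays: a weight is drawn from a Beta(alpha, alpha) distribution, and the same weight blends both the inputs and their one-hot labels. The function name `mixup`, the default `alpha=0.4`, and the use of nested Python lists are assumptions made for this example, not settings reported by the study.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Blend two examples and their one-hot labels (illustrative sketch).

    x_a, x_b: 2-D lists of equal shape (e.g. Mel-spectrogram bins x frames).
    y_a, y_b: one-hot label lists of equal length.
    alpha:    Beta distribution parameter; 0.4 is an assumed value.
    rng:      any object exposing betavariate(), e.g. random.Random(seed).
    """
    # Mixing weight in [0, 1]; small alpha keeps most samples near 0 or 1.
    lam = rng.betavariate(alpha, alpha)

    # Convex combination of the two inputs, element by element.
    x_mix = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(x_a, x_b)
    ]

    # The labels are blended with the same weight, giving a soft target.
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam
```

In a training loop, a function like this would typically be applied to pairs of examples within each batch, with the soft label used as the target of a cross-entropy loss.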

Copyrights © 2026






Journal Info

Abbrev

jitecs

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Engineering

Description

The Journal of Information Technology and Computer Science (JITeCS) is a peer-reviewed open access journal published by the Faculty of Computer Science, Universitas Brawijaya (UB), Indonesia. The journal is an archival journal serving scientists and engineers involved in all aspects of information ...