Garuda - Garba Rujukan Digital

Journal of Information Technology and Computer Science

Vol. 11 No. 1: April 2026

Sugianta, I Kadek Arya (Unknown)

Publish Date
25 May 2026

Environmental Sound Classification (ESC) is challenged by data scarcity and by the high variability of unstructured acoustic signals. This study evaluates a visual transfer learning approach in which audio signals are converted into Mel-spectrogram representations and classified with computer vision architectures. A comparative study on the ESC-50 dataset benchmarks visual models (EfficientNet-B0, ResNet-50) against a specialized audio model (Pretrained Audio Neural Networks, PANNs). EfficientNet-B0, optimized with MixUp augmentation, achieved the highest performance with 83.33% accuracy and an 83.50% F1-score, outperforming ResNet-50 (80.00%) and substantially surpassing the PANNs (Cnn14) model, which reached only 66.33%. The underperformance of PANNs points to over-parameterization on small-scale datasets. Further validation with Gradient-weighted Class Activation Mapping (Grad-CAM) confirmed that EfficientNet-B0 learned semantically meaningful features, distinguishing active sound patterns from silence and background noise. These findings indicate that lightweight visual architectures transfer and generalize better than large audio models in data-constrained scenarios.
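The MixUp augmentation named in the abstract blends pairs of training examples and their labels with a weight drawn from a Beta distribution. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the function name `mixup_pair`, the value `alpha=0.2`, and the toy 2x3 "spectrograms" are assumptions made for illustration, since the abstract does not report the hyperparameters or the framework used.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=None):
    """Blend two 2-D inputs and their one-hot labels (hypothetical helper).

    alpha=0.2 is an assumed value; the abstract does not state it.
    """
    rng = rng or random.Random()
    # Mixing weight lam in [0, 1], sampled from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Element-wise convex combination of the two inputs.
    x_mix = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(x_a, x_b)
    ]
    # The same convex combination applied to the label vectors.
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy stand-ins for two Mel-spectrograms (rows = Mel bands, columns = frames).
spec_a = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
spec_b = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
# One-hot labels over three illustrative classes.
label_a = [1.0, 0.0, 0.0]
label_b = [0.0, 1.0, 0.0]

x_mix, y_mix, lam = mixup_pair(
    spec_a, label_a, spec_b, label_b, rng=random.Random(0)
)
print(lam, y_mix)
```

Because the mixed label is a soft target rather than a single class, a training loop using this sketch would need a loss that accepts probability vectors, such as cross-entropy against `y_mix`.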

Journal Info

Journal of Information Technology and Computer Science

Abbrev

jitecs

Publisher

Universitas Brawijaya

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Engineering

Description

The Journal of Information Technology and Computer Science (JITeCS) is a peer-reviewed, open-access journal published by the Faculty of Computer Science, Universitas Brawijaya (UB), Indonesia. The journal is an archival journal serving scientists and engineers involved in all aspects of information ...

Comparing Audio and Visual Transfer Learning for Environmental Sound Classification
