Garuda - Garba Rujukan Digital

JURNAL NASIONAL TEKNIK ELEKTRO

Vol 14, No 3: November 2025

Irmawan, Irmawan
Dwijayanti, Suci
Suprapto, Bhakti Yudho

Publish Date
12 Dec 2025

Spoken digit recognition (SDR) plays a critical role in biometric authentication and human–computer interaction, yet existing approaches often rely on small datasets, limited feature representations, or architectures prone to overfitting. To address these limitations, this study proposes a robust end-to-end pipeline that integrates Wavelet Time Scattering (WTS), Mel-Frequency Cepstral Coefficients (MFCC), and a 2D Deep Convolutional Neural Network (2D-CNN) to enhance the accuracy and generalization of SDR systems in realistic environments. The Free-Spoken Digit Dataset (FSDD), consisting of 3000 audio samples from speakers with diverse accents, was pre-processed using zero-padding normalization and transformed into high-resolution time–frequency spectrograms via WTS. The proposed CNN architecture, optimized through systematic experimentation on batch size and learning rate, demonstrated stable convergence and superior discriminative capability. Using a learning rate of 0.001 and a batch size of 50, the model achieved the highest performance with 99.2% accuracy, outperforming established methods including SVM, MFCC-LSTM, and Multiple RNN architectures. Comparative evaluations further revealed that the combined WTS–MFCC feature extraction significantly enhances spectral–temporal representation quality, contributing to improved classification precision across all digit classes. These findings demonstrate that the proposed WTS-MFCC-CNN framework not only advances SDR accuracy but also provides a scalable and computationally efficient approach suitable for real-world biometric, financial, and voice-controlled applications. The results highlight the potential of hybrid time–frequency representations integrated with deep architectures to set a new benchmark for robust spoken digit recognition.
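The zero-padding step and the reported batch size of 50 can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say what target length is used, so padding to the longest clip and truncating longer clips are assumptions, and the helper names `zero_pad`, `pad_batch`, and `iter_batches` are hypothetical. The WTS, MFCC, and CNN stages are deliberately left out, since the abstract gives no parameters for them.

```python
from typing import Iterator, List, Optional, Sequence


def zero_pad(signal: Sequence[float], target_len: int) -> List[float]:
    """Right-pad a clip with zeros (or truncate it) to exactly target_len samples.

    Truncation of longer clips is an assumption; the abstract only mentions padding.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    trimmed = list(signal[:target_len])
    return trimmed + [0.0] * (target_len - len(trimmed))


def pad_batch(signals: Sequence[Sequence[float]],
              target_len: Optional[int] = None) -> List[List[float]]:
    """Bring every clip to a common length.

    Assumption: when no length is given, the longest clip sets the target.
    """
    if target_len is None:
        target_len = max(len(s) for s in signals)
    return [zero_pad(s, target_len) for s in signals]


def iter_batches(items: Sequence, batch_size: int = 50) -> Iterator[Sequence]:
    """Yield consecutive mini-batches; 50 is the batch size reported in the abstract."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    clips = [[0.1, 0.2], [0.3, 0.4, 0.5, 0.6], [0.7]]
    padded = pad_batch(clips)
    print([len(c) for c in padded])  # every clip now has the same length
```

Equal-length inputs matter here because a 2D-CNN expects feature maps of a fixed shape. Purely as arithmetic, batching all 3000 FSDD clips at 50 per batch would give 60 batches per pass; the actual train/test split is not stated in the abstract.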

Journal Info

JURNAL NASIONAL TEKNIK ELEKTRO

Abbrev

JNTE

Publisher

Universitas Andalas

Subject

Electrical & Electronics Engineering

Description

Jurnal Nasional Teknik Elektro (JNTE) is a peer-reviewed scientific journal published by the Department of Electrical Engineering, Universitas Andalas, in a print version (p-ISSN: 2302-2949) and an electronic version (e-ISSN: 2407-7267). JNTE is published twice a year for manuscripts reporting research results or parts of research that ...

A Hybrid Wavelet Scattering and Mel Spectrogram Feature with Deep Convolution Neural Network for Robust Spoken Digit Recognition
