This author has published in the journal bit-Tech.
Enryco Hidayat
Universitas Pembangunan Nasional "Veteran" Jawa Timur

Published: 1 document

Articles
Indonesian Sign Language (SIBI) Recognition from Audio Mel-Spectrograms Using LSTM Architecture
Enryco Hidayat; Mohammad Idhom; Afina Lina Nurlaili
bit-Tech Vol. 8 No. 2 (2025): bit-Tech
Publisher : Komunitas Dosen Indonesia

DOI: 10.32877/bt.v8i2.3229

Abstract

Persistent communication barriers continue to challenge Deaf and Hard of Hearing (DHH) individuals in accessing spoken language, underscoring the need for effective and inclusive translation technologies. Existing audio-to-sign language systems typically employ multi-stage pipelines involving speech-to-text transcription, which may propagate recognition errors and fail to preserve acoustic nuances. Addressing these limitations, this study developed and evaluated a deep learning framework for translating spoken Indonesian audio directly into classifications of the Indonesian Sign Language System (SIBI), eliminating explicit text conversion. The dataset comprised 495 eight-second WAV recordings (22,050 Hz) representing five SIBI phrase classes, augmented through time stretching, pitch shifting, and noise addition to improve generalization. Mel-Spectrogram features were extracted and input to a stacked Long Short-Term Memory (LSTM) network implemented in TensorFlow/Keras, trained to learn temporal–spectral mappings between audio patterns and SIBI categories. Evaluation on a held-out test set demonstrated robust performance, achieving 98% accuracy with consistently high precision, recall, and F1-scores. The trained model was further integrated into a prototype web application built with Flask and React, confirming its feasibility for real-time assistive communication. While results highlight the viability of direct Mel-Spectrogram-to-LSTM translation for SIBI recognition, current findings are constrained by the limited dataset size and restricted speaker diversity. Future research should therefore expand the dataset to include more speakers, varied acoustic environments, and continuous-speech inputs to ensure broader applicability and real-world robustness.
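To make the described pipeline concrete, the sketch below works out the input shape such a Mel-Spectrogram-to-LSTM model would consume, using only the facts stated in the abstract (8-second clips, 22,050 Hz sampling rate, 5 SIBI classes). The spectrogram settings (`N_MELS`, `N_FFT`, `HOP_LENGTH`) are not given in the abstract and are assumed here at values typical for librosa-style feature extraction; the actual paper may use different parameters.

```python
# Known from the abstract: 8-second WAV clips sampled at 22,050 Hz, 5 SIBI classes.
SR = 22050
DURATION_S = 8
N_CLASSES = 5
N_SAMPLES = SR * DURATION_S  # 176,400 samples per clip

# Assumed (not stated in the abstract): common Mel-Spectrogram settings.
N_MELS = 128       # number of mel frequency bands
N_FFT = 2048       # FFT window size
HOP_LENGTH = 512   # stride between analysis frames

# With centered framing (librosa's default), the number of time frames is
# 1 + floor(n_samples / hop_length).
n_frames = 1 + N_SAMPLES // HOP_LENGTH

# A stacked LSTM treats the spectrogram as a sequence of feature vectors,
# one per time frame, so its input is shaped (time_steps, features) and its
# softmax output has one unit per SIBI phrase class.
input_shape = (n_frames, N_MELS)
print(input_shape, N_CLASSES)  # (345, 128) 5
```

In a TensorFlow/Keras implementation like the one the paper describes, this shape would be passed to the first `LSTM` layer (with `return_sequences=True` for all but the last stacked layer) and the final `Dense(5, activation="softmax")` layer would emit the class probabilities.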