Akbar, Gebran
Unknown Affiliation


Development of AI-Based Presentation Application using Deep Learning for Individuals With Disabilities
Hutagalung, Carli Apriansyah; Fitrianto, Adi; Akbar, Gebran
Building of Informatics, Technology and Science (BITS) Vol 6 No 3 (2024): December 2024
Publisher: Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i3.6162

Abstract

This study addresses the difficulty that individuals with disabilities face in controlling presentation devices, particularly in noisy environments, by developing an AI-based application built on a hybrid LSTM-GRU model. The primary objective is to improve voice recognition accuracy for common presentation commands, such as “next” and “back,” under varying noise conditions. The hybrid deep learning architecture combines Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) layers with an attention mechanism that focuses on the most relevant temporal features. The model was trained on the Speech Commands Dataset and then fine-tuned with noise-augmented data to simulate real-world environments. The LSTM-GRU model achieved high accuracy in clean environments and maintained reasonable performance in noisy conditions, outperforming traditional models such as the Hidden Markov Model (HMM) and Gaussian Mixture Model (GMM). At its optimal epoch, the fine-tuned model showed robust performance with balanced precision and recall, making it suitable for deployment in real-world scenarios. The study concludes that although deep learning models offer significant improvements, further refinement is needed to improve noise resilience in practical applications.
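
The noise-augmentation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the noise type or the signal-to-noise ratios used, so the white Gaussian noise, the 10 dB target, the synthetic test tone, and the function names `add_noise_at_snr` and `measured_snr_db` below are all assumptions made for illustration, not the authors' procedure.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db` dB.

    Hypothetical helper: the noise type and SNR are assumptions, not taken
    from the paper.
    """
    if not signal:
        raise ValueError("signal must not be empty")
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), so P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB of `noisy` relative to its `clean` source."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Synthetic one-second, 16 kHz tone standing in for a spoken command clip
# (an assumption; real training data would be recorded speech).
sample_rate = 16000
clean = [0.5 * math.sin(2 * math.pi * 440 * t / sample_rate)
         for t in range(sample_rate)]

# Fixed seed so the augmentation is reproducible.
noisy = add_noise_at_snr(clean, snr_db=10, rng=random.Random(0))
```

Applying the same function over a range of `snr_db` values would yield progressively harder copies of each clip, which is one common way to build a noise-augmented fine-tuning set.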