Indonesian Sign Language (BISINDO) serves as a primary communication medium for the deaf community; however, limited public understanding often creates barriers during daily interactions. This study aims to develop a real-time BISINDO word-level translation system using hand landmark extraction and temporal modeling with Long Short-Term Memory (LSTM). The system employs MediaPipe Hands to detect 21 hand landmarks per frame, which are then processed as sequential motion patterns to classify five BISINDO words: saya ("I"), terima kasih ("thank you"), maaf ("sorry"), nama ("name"), and kamu ("you"). A total of 250 gesture samples were recorded under controlled lighting conditions as the primary dataset. The processed sequences were used to train the LSTM model, which was subsequently integrated with an ESP32 microcontroller and a DFPlayer Mini module to produce direct audio output. Experimental results show that the model achieved an average accuracy of 86%, with precision and recall values ranging from 0.81 to 0.94. The confusion matrix analysis indicates that most gestures were correctly classified, although some errors occurred between gestures with similar initial motion trajectories. Integration testing demonstrated an average system latency of 3.8 seconds and an audio output success rate of 85%. These findings indicate that the proposed system is capable of translating BISINDO word-level gestures accurately, responsively, and consistently in real-time conditions. This study provides a strong foundation for the broader development of sign language translation systems, with potential enhancements in vocabulary expansion, multi-user datasets, and hardware optimization for deployment in real-world environments.
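The preprocessing step implied above can be sketched as follows. This is a minimal illustration, not the study's actual implementation: it assumes each frame yields 21 (x, y, z) landmarks from MediaPipe Hands, flattens them into a 63-dimensional feature vector, and pads or truncates each gesture to a fixed sequence length suitable for an LSTM input; the sequence length of 30 frames and the random landmark values are illustrative assumptions only.

```python
import numpy as np

NUM_LANDMARKS = 21   # MediaPipe Hands returns 21 landmarks per detected hand
COORDS = 3           # each landmark has x, y, z coordinates
SEQ_LEN = 30         # assumed fixed number of frames per gesture (illustrative)

def frame_to_features(landmarks):
    """Flatten one frame's 21 (x, y, z) landmarks into a 63-dim vector."""
    arr = np.asarray(landmarks, dtype=np.float32)
    assert arr.shape == (NUM_LANDMARKS, COORDS)
    return arr.reshape(-1)

def pad_sequence(frames, seq_len=SEQ_LEN):
    """Pad (with zeros) or truncate a variable-length gesture to seq_len frames."""
    feats = np.stack([frame_to_features(f) for f in frames])
    if len(feats) >= seq_len:
        return feats[:seq_len]
    pad = np.zeros((seq_len - len(feats), NUM_LANDMARKS * COORDS), dtype=np.float32)
    return np.vstack([feats, pad])

# Example: a 20-frame gesture padded to a (30, 63) array, the shape an
# LSTM layer would consume as (timesteps, features).
gesture = [np.random.rand(NUM_LANDMARKS, COORDS) for _ in range(20)]
x = pad_sequence(gesture)
print(x.shape)  # (30, 63)
```

The fixed-length (timesteps, features) array is the standard input shape for recurrent classifiers; the trained model's class probabilities would then drive the ESP32/DFPlayer Mini audio playback described in the abstract.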
Copyright © 2025