The number of elderly people in Indonesia continues to rise every year, posing serious challenges for their well-being and safety, particularly in nursing homes, where caregiver shortages and limited monitoring capabilities remain critical issues. Existing elderly monitoring systems are typically limited to a single modality, such as vision-only fall detection or single-sensor environmental monitoring, and lack real-time multimodal integration. To address these gaps, the “SIKELAS” (Smart Safety System for the Elderly) system was developed as an integrated solution that combines AI technology, anomaly detection, voice recognition, and five environmental sensors in a single real-time monitoring platform. The novelty of SIKELAS lies in its simultaneous integration of MediaPipe-based fall and gesture detection, Speech-to-Text voice recognition, and multi-sensor environmental monitoring, coordinated through an ESP32 microcontroller and a Flask back-end that delivers automated Telegram alerts with an average response time under 1 second. This research employs an experimental quantitative approach with controlled laboratory testing across five datasets. In these tests, the system achieved a gesture detection accuracy of 98.38% and a fall detection accuracy of 89.16%; gas was detected at sensor readings above a threshold of 2500, flame was detected at readings below a threshold of 500, and the ultrasonic measurement error remained below 3%. SIKELAS outperforms previous single-modality systems and offers a comprehensive, measurable solution that reduces caregiver workload through automated multimodal monitoring and real-time emergency notifications.
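The threshold rules reported above can be illustrated with a minimal Python sketch. It is not the SIKELAS implementation: only the two threshold values (2500 for gas, 500 for flame) and the 3% error bound come from the abstract. The names `Reading`, `check_alerts`, and `percent_error` are hypothetical, the readings are assumed to be unitless raw sensor values, and the percent-error formula is the standard relative-error definition, assumed here rather than taken from the source.

```python
from dataclasses import dataclass

# Threshold values taken from the abstract; their unit is not stated there,
# so raw (unitless) sensor readings are assumed.
GAS_ALERT_ABOVE = 2500
FLAME_ALERT_BELOW = 500


@dataclass
class Reading:
    """Hypothetical container for one pair of environmental sensor samples."""
    gas: int
    flame: int


def check_alerts(reading: Reading) -> list[str]:
    """Return the names of the sensors whose readings cross their threshold."""
    alerts = []
    if reading.gas > GAS_ALERT_ABOVE:      # gas is flagged ABOVE its threshold
        alerts.append("gas")
    if reading.flame < FLAME_ALERT_BELOW:  # flame is flagged BELOW its threshold
        alerts.append("flame")
    return alerts


def percent_error(measured: float, actual: float) -> float:
    """Relative error in percent (assumed definition of 'ultrasonic error')."""
    return abs(measured - actual) / actual * 100.0
```

Under these assumptions, `check_alerts(Reading(gas=3000, flame=400))` flags both sensors, while a reading exactly at either threshold flags nothing because the comparisons are strict. Whether the real system uses strict or inclusive comparisons is not stated in the source.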