Interaction between humans, computers, and electronic equipment can now be made more natural and intuitive. In several previous studies, this interaction was implemented through sensors or finger-gesture detection using computer vision based on MediaPipe. In this research, we designed and built a system that controls the fan rotation speed level in real time using voice commands in three languages, namely Indonesian, English, and Javanese, through an audio classification process with YAMNet. Training for 15 epochs achieved 100% accuracy with a loss of 0.46, and the ROC curve values were 100% for class 0 (fan off), 100% for class 1 (low rotation), 99% for class 2 (medium rotation), and 100% for class 3 (high rotation). Meanwhile, testing the model on the test dataset subset, also with 15 epochs, produced a value of 97.5% across all commands.
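The abstract does not specify how the YAMNet model is attached to the command classifier, so the following is only a minimal sketch of one common transfer-learning setup: the pretrained YAMNet model from TensorFlow Hub produces 1024-dimensional embeddings, and a small dense head (an assumed architecture, not taken from the paper) maps them to the four hypothetical fan-state classes.

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# Load the pretrained YAMNet model from TensorFlow Hub.
yamnet = hub.load('https://tfhub.dev/google/yamnet/1')

# Hypothetical labels for the four fan states described in the abstract.
CLASSES = ['fan_off', 'fan_low', 'fan_medium', 'fan_high']

def extract_embedding(waveform: np.ndarray) -> tf.Tensor:
    """Return the mean 1024-dim YAMNet embedding for a 16 kHz mono waveform."""
    _scores, embeddings, _spectrogram = yamnet(waveform)
    return tf.reduce_mean(embeddings, axis=0)

# Small classifier head on top of the embeddings (assumed, not from the paper).
classifier = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1024,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(len(CLASSES), activation='softmax'),
])
classifier.compile(optimizer='adam',
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])

# Example inference on a one-second placeholder clip of silence.
waveform = np.zeros(16000, dtype=np.float32)
probs = classifier(extract_embedding(waveform)[tf.newaxis, :])
print(CLASSES[int(tf.argmax(probs, axis=-1))])
```

In such a setup, the predicted class index would be forwarded to the fan controller to set the corresponding speed level; the actual control interface used in the study is not described in the abstract.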