Hand gesture recognition (HGR) enables natural, contactless interaction between humans and intelligent systems. This paper proposes a real-time gesture recognition framework built on a hybrid architecture that combines classical computer vision techniques with deep learning. The system integrates fast hand localization through MediaPipe-based region-of-interest extraction, Histogram of Oriented Gradients (HOG) feature encoding, and a lightweight convolutional neural network (CNN) for gesture classification, followed by temporal stabilization to improve prediction consistency across video frames. A dataset of 900 gesture images covering three classes (open, fist, and peace) was collected automatically with a webcam-based acquisition module and divided into training and validation subsets using an 85/15 split, with data augmentation applied. The experimental evaluation comprises quantitative performance analysis, ablation studies, and real-time testing. The proposed framework achieves 96.8% accuracy, 96.5% precision, 96.2% recall, and a 96.3% F1-score while maintaining a real-time processing speed of approximately 28 FPS.
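The temporal stabilization step is not specified in detail here. One common way to realize it is a sliding-window majority vote over the per-frame class labels, and the following is a minimal sketch under that assumption. The class name `MajorityVoteSmoother`, the `window_size` parameter, and the tie-breaking rule are hypothetical choices made for illustration and are not taken from the proposed framework.

```python
from collections import Counter, deque


class MajorityVoteSmoother:
    """Illustrative sliding-window majority vote over per-frame labels.

    Assumption: the paper's temporal stabilization may work differently;
    this only demonstrates the general idea.
    """

    def __init__(self, window_size=7):
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        # Keeps only the most recent `window_size` predictions.
        self.history = deque(maxlen=window_size)

    def update(self, label):
        """Add the newest per-frame label and return the smoothed label."""
        self.history.append(label)
        counts = Counter(self.history)
        best = max(counts.values())
        # Break ties in favour of the most recently seen label.
        for recent in reversed(self.history):
            if counts[recent] == best:
                return recent


# Usage: a single spurious "fist" frame is suppressed by the vote.
smoother = MajorityVoteSmoother(window_size=3)
frames = ["open", "open", "fist", "open", "open"]
smoothed = [smoother.update(label) for label in frames]
print(smoothed)  # ['open', 'open', 'open', 'open', 'open']
```

A larger window suppresses more flicker but delays the response to a genuine gesture change, so at roughly 28 FPS a window of a few frames adds only a fraction of a second of latency.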
Copyright © 2026