Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
Vol 12, No 2: June 2024

Bengali Word Detection from Lip Movements Using Mask RCNN and Generalized Linear Model

Bhuiyan, Abul Bashar (BRAC University, Dhaka, Bangladesh)
Uddin, Jia (Woosong University)



Article Info

Publish Date
30 Jun 2024

Abstract

Speech processing with the help of lip detection and lip reading is an advancing field. It requires reliable algorithms and techniques for accurately detecting the lips and their movements. Lip detection and lip configuration are the most important parts of speech recognition. In this paper, we focus on detecting the lip segment accurately. Mask R-CNN (Mask Region-based Convolutional Neural Network) performs object detection and instance segmentation on each video frame to detect the lip segment. Mask R-CNN adds only a small overhead to Faster R-CNN and is simple to train, running at 5 frames per second. Mask R-CNN also supports keypoint detection, which helps extract the pixel-level locations of the lip landmarks. Once the lip region is extracted and the landmarks are highlighted, we observe how the lip landmarks change over time as the subject's lips move for each Bengali word. The keypoint changes observed during each millisecond are then used to train the GLM (Generalized Linear Model). In addition, we compare the performance of the GLM with Naive Bayes, Logistic Regression, and Decision Tree. The GLM achieved the highest accuracy, 91.8%, whereas Naive Bayes, Logistic Regression, and Decision Tree achieved accuracies of 87.1%, 38.3%, and 82.2%, respectively.
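
The step of turning lip landmark movement into classifier inputs could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the feature representation, so the frame-to-frame displacement encoding, the function names `landmark_deltas` and `flatten_features`, and the toy coordinates are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # one (x, y) pair per lip landmark in a single video frame


def landmark_deltas(frames: List[Frame]) -> List[List[float]]:
    """Return the displacement of every landmark between consecutive frames.

    Each output row is [dx0, dy0, dx1, dy1, ...] for one frame transition.
    (Assumed encoding; the paper's actual features are not specified here.)
    """
    if len(frames) < 2:
        return []
    n_points = len(frames[0])
    rows = []
    for prev, curr in zip(frames, frames[1:]):
        if len(prev) != n_points or len(curr) != n_points:
            raise ValueError("every frame must have the same number of landmarks")
        row = []
        for (x0, y0), (x1, y1) in zip(prev, curr):
            row.extend([x1 - x0, y1 - y0])
        rows.append(row)
    return rows


def flatten_features(frames: List[Frame]) -> List[float]:
    """Concatenate all displacement rows into a single feature vector."""
    return [value for row in landmark_deltas(frames) for value in row]


# Hypothetical toy clip: 3 frames, 2 landmarks (e.g. left and right lip corners).
clip = [
    [(10.0, 20.0), (30.0, 20.0)],
    [(9.0, 21.0), (31.0, 21.0)],
    [(8.0, 23.0), (32.0, 23.0)],
]
features = flatten_features(clip)
print(features)
```

A vector like `features`, computed per spoken word, is the kind of input that a GLM or the baseline classifiers named in the abstract could be trained on, provided every clip is resampled to the same number of frames so that the vectors have equal length.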

Copyrights © 2024






Journal Info

Abbrev

IJEEI

Subject

Computer Science & IT Electrical & Electronics Engineering

Description

Indonesian Journal of Electrical Engineering and Informatics (IJEEI) is a peer-reviewed international journal in English, published in four issues per year (March, June, September, and December). The aim of Indonesian Journal of Electrical Engineering and Informatics (IJEEI) is to publish high-quality ...