Mohd Fadzil Hassan
Universiti Teknologi PETRONAS

Articles

Vietnamese character recognition based on CNN model with reduced character classes
Thi Ha Phan; Duc Chung Tran; Mohd Fadzil Hassan
Bulletin of Electrical Engineering and Informatics Vol 10, No 2: April 2021
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v10i2.2810

Abstract

This article details the steps to build and train a convolutional neural network (CNN) model for Vietnamese character recognition in educational books. Based on this model, a mobile application for extracting text content from images of Vietnamese textbooks was built using OpenCV and the Canny edge detection algorithm. Vietnamese has 178 character classes when accents are included. However, within the scope of character recognition in textbooks, some classes differ only in size, such as “c” and “C” or “o” and “O”. Therefore, the authors built the classification model for 138 Vietnamese character classes, filtering out similar character classes to increase the model's effectiveness.
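
The class-reduction idea in the abstract can be illustrated with a minimal sketch. It assumes the merging is a simple label remapping applied before training. Only the pairs “c”/“C” and “o”/“O” come from the abstract; the authors' full list of merged classes is not given here, and the names `MERGED_PAIRS`, `merge_label`, and `build_class_index` are hypothetical.

```python
# Hypothetical mapping of classes that differ only in size.
# Only these two pairs are named in the abstract; the full list is not given.
MERGED_PAIRS = {"C": "c", "O": "o"}


def merge_label(ch: str) -> str:
    """Return the representative class for a character label."""
    return MERGED_PAIRS.get(ch, ch)


def build_class_index(labels):
    """Map each remaining class to an integer id after merging."""
    merged = sorted({merge_label(ch) for ch in labels})
    return {ch: i for i, ch in enumerate(merged)}


# Toy label set: six raw classes collapse to four after merging.
raw_labels = ["c", "C", "o", "O", "a", "A"]
class_index = build_class_index(raw_labels)
print(len(raw_labels), "->", len(class_index))
```

Under this assumption, the classifier's output layer would have one unit per merged class, and distinguishing upper from lower case for merged pairs would have to be handled outside the classifier.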