Bulletin of Electrical Engineering and Informatics
Vol 10, No 2: April 2021

Vietnamese character recognition based on CNN model with reduced character classes

Thi Ha Phan (Posts and Telecommunications Institute of Technology)
Duc Chung Tran (FPT University)
Mohd Fadzil Hassan (Universiti Teknologi PETRONAS)



Article Info

Publish Date
01 Apr 2021

Abstract

This article details the steps to build and train a convolutional neural network (CNN) model for Vietnamese character recognition in educational books. Based on this model, a mobile application for extracting text content from images of Vietnamese textbooks was built using OpenCV and the Canny edge detection algorithm. Vietnamese has 178 character classes when accented characters are counted. However, within the scope of character recognition in textbooks, some character classes differ only in actual size, such as “c” and “C” or “o” and “O”. Therefore, the authors built the classification model for 138 Vietnamese character classes after filtering out these similar classes to increase the model's effectiveness.
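
The class reduction described in the abstract can be illustrated with a minimal sketch. It assumes that a case pair is collapsed into its lowercase label whenever the base letter is one of the two examples named above ("c" and "o"); the authors' full list of merged classes is not given here, so the set `SIZE_ONLY_BASES` and the function names are hypothetical.

```python
import unicodedata

# Assumption: only the base letters named in the abstract are merged.
# The authors' actual list of merged classes is not stated in this text.
SIZE_ONLY_BASES = {"c", "o"}


def base_letter(ch: str) -> str:
    """Return the lowercase base letter of ch, ignoring diacritics."""
    # NFD puts the base letter first, followed by combining marks.
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed[0].lower()


def merged_label(ch: str) -> str:
    """Map a character to its class label after merging size-only case pairs."""
    if base_letter(ch) in SIZE_ONLY_BASES:
        return ch.lower()
    return ch


def reduce_classes(chars):
    """Return the sorted set of class labels that remain after merging."""
    return sorted({merged_label(c) for c in chars})
```

Under these assumptions, `reduce_classes(["c", "C", "o", "O", "a", "A"])` keeps separate labels for "a" and "A" but a single label each for "c" and "o". A recognizer trained on the reduced label set would need a separate step (for example, relative glyph height on the text line) to recover the case of merged characters; that step is outside this sketch.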

Copyright © 2021

Journal Info

Abbrev

EEI

Subject

Electrical & Electronics Engineering

Description

Bulletin of Electrical Engineering and Informatics (Buletin Teknik Elektro dan Informatika) ISSN: 2089-3191, e-ISSN: 2302-9285 is open to submission from scholars and experts in the wide areas of electrical, electronics, instrumentation, control, telecommunication and computer engineering from the ...