Journal of Advances in Information Systems and Technology
Vol 2 No 2 (2020): October

The Effect of Rescaling on the Performance of Recognition with Arabic Characters Using Tesseract OCR Based on Long Short Term Memory




Article Info

Publish Date
30 Oct 2020

Abstract

Handwritten character recognition is a field that combines pattern recognition and image processing through Optical Character Recognition (OCR) technology. The performance achieved for Arabic characters is not optimal because of their cursive nature and relatively high difficulty. The Tesseract OCR Engine is a popular open-source OCR framework that is accurate in character recognition. It works well with images at 300 dpi (dots per inch). This study analyzes the effect of rescaling on the recognition of Arabic handwritten characters using the Tesseract OCR Engine based on Long Short-Term Memory (LSTM), with images scaled to 90%, 80%, 70%, and 60% of the source image size. The effect on recognition performance is measured using character accuracy as the success metric. The study used 70 images from the publicly available IFN/ENIT image samples.
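
The abstract does not define character accuracy or describe how the scaled image sizes were computed, so the following is only a minimal sketch under stated assumptions. It assumes character accuracy is (n - d) / n, where n is the length of the ground-truth text and d is the Levenshtein edit distance to the OCR output, which is a common definition in OCR evaluation but may differ from the one the authors used. The names edit_distance, character_accuracy, scaled_size, and SCALES are hypothetical and do not come from the paper or from Tesseract.

```python
from typing import Tuple

# Scaling factors named in the abstract, as percentages of the source size.
SCALES = (90, 80, 70, 60)


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete from ref
                            curr[j - 1] + 1,      # insert into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: (n - d) / n, clamped at 0 for very noisy output."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return max(0.0, (len(ref) - edit_distance(ref, hyp)) / len(ref))


def scaled_size(width: int, height: int, percent: int) -> Tuple[int, int]:
    """Pixel dimensions after scaling both sides to `percent` of the source."""
    return (max(1, round(width * percent / 100)),
            max(1, round(height * percent / 100)))


if __name__ == "__main__":
    # Hypothetical source image of 1000 x 500 pixels.
    for pct in SCALES:
        print(pct, scaled_size(1000, 500, pct))
    # One substituted character out of four.
    print(character_accuracy("abcd", "abxd"))
```

In an actual experiment, each source image would be resized to the dimensions returned by scaled_size, passed to the OCR engine, and the recognized text compared with the ground truth using character_accuracy. The image loading and OCR calls are omitted here because the abstract does not say which tooling was used.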

Copyrights © 2020






Journal Info

Abbrev

jaist

Subject

Computer Science & IT

Description

Journal of Advances in Information Systems and Technology (JAIST) seeks to promote high-quality research that is of interest to the international ...