Jurnal Infra
Vol 8, No 2 (2020)

Comparison of Character Recognition and Text Recognition Using Extended MNIST, the IAM Database, and Tesseract on Handwriting in Ijazah Documents

Made Yoga Mahardika (Informatics Study Program)
Kartika Gunadi (Informatics Study Program)
Alexander Setiawan (Informatics Study Program)



Article Info

Publish Date
03 Oct 2020

Abstract

The central problem in handwriting recognition is how a technique can recognize many kinds of writing in many forms. Unlike computer typefaces, which are consistent, each person's handwriting is unique in form and consistency. This problem arises in ijazah documents, where the data fields are filled in by hand. Data location segmentation uses the run length smoothing algorithm with dots as segmentation features. The handwritten text recognition (HTR) technique requires data segmented into words, while the handwritten character recognition (HCR) technique requires data segmented into characters. HCR uses the LeNet5 model with the EMNIST dataset. HTR uses the Tesseract tool and convolutional recurrent neural networks with the IAM database. In an experiment on 10 scanned image samples, segmentation obtained an average accuracy of 95.6%. The HCR technique failed in the letter segmentation process on cursive handwriting. The best technique was HTR with the Tesseract tool, which achieved word accuracy above 69% when tested on 5 scan samples (15 data fields).
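As an illustration of the run length smoothing step named in the abstract, the following is a minimal sketch of horizontal run length smoothing on a binary image. It is not the authors' implementation: the function names `rlsa_row` and `rlsa_horizontal`, the list-of-lists image representation (1 = ink, 0 = background), and the threshold values are all assumptions made for this example. The idea is that short background gaps between ink pixels are filled, so that a row of dots merges into one continuous run that can mark a data field.

```python
def rlsa_row(row, threshold):
    """Fill background gaps of length <= threshold that lie between two ink pixels.

    `row` is a list of 0/1 values (assumed encoding: 1 = ink, 0 = background).
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, px in enumerate(row):
        if px == 1:
            if last_ink is not None:
                gap = i - last_ink - 1
                if 0 < gap <= threshold:
                    # Short gap: smooth it over by marking it as ink.
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, threshold):
    """Apply run length smoothing independently to every row of a binary image."""
    return [rlsa_row(row, threshold) for row in image]


# A dotted line (alternating ink and background) becomes a solid run,
# while the wider gap in the second row is left untouched.
dotted = [
    [1, 0, 1, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
]
smoothed = rlsa_horizontal(dotted, threshold=1)
```

In practice the threshold would be tuned to the dot spacing of the scanned form, and a vertical pass is often combined with the horizontal one; both choices are outside what the abstract specifies.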

Copyright © 2020