Manual transcription of data from Indonesian identity cards (KTP) remains prevalent in public institutions, often resulting in inefficiencies and human errors that compromise data accuracy. Although Optical Character Recognition (OCR) technologies such as Tesseract have been widely adopted, their performance on KTP images remains inconsistent due to non-uniform layouts, low contrast, and background noise. This study proposes a dual-pipeline OCR framework designed to enhance the recognition accuracy of Indonesian KTPs under real-world conditions. The first pipeline performs static region segmentation based on predefined Regions of Interest (ROIs), while the second applies dynamic keyword heuristics to locate text adaptively across varying layouts. The outputs of both pipelines are merged through a voting and regex-based post-processing mechanism that includes character normalization and field validation against predefined dictionaries. Experiments were conducted on 78 annotated KTP samples of diverse resolution and image quality. Evaluation using Character Error Rate (CER), Word Error Rate (WER), and field-level accuracy yielded an average CER of 69.82%, a WER of 80.20%, and a character-level accuracy of 30.18%. While free-text fields such as address and occupation showed only moderate performance, structured fields achieved accuracy above 60%. The method runs efficiently in a CPU-only environment without requiring large annotated datasets, demonstrating its suitability for low-resource OCR deployment. Compared to conventional single-pipeline approaches, the proposed framework improves robustness across heterogeneous document layouts and illumination conditions. These findings highlight the potential of lightweight, rule-based OCR systems for practical e-KYC digitization and provide a foundation for integrating deep-learning-based layout detection in future research.
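For readers unfamiliar with the evaluation metrics, the following is a minimal illustrative sketch (not taken from the paper) of how CER and WER are conventionally computed from Levenshtein edit distance against a ground-truth transcription; the field values used in the example are hypothetical.

```python
# Illustrative sketch: standard Levenshtein-based CER/WER scoring of OCR output.

def levenshtein(ref, hyp):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance / reference length."""
    return levenshtein(list(reference), list(hypothesis)) / max(len(reference), 1)

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return levenshtein(ref_words, hyp_words) / max(len(ref_words), 1)

if __name__ == "__main__":
    gt = "NIK 3171234567890001"       # hypothetical ground-truth field
    ocr_out = "N1K 317I234567B90001"  # hypothetical OCR output
    print(f"CER = {cer(gt, ocr_out):.2%}, WER = {wer(gt, ocr_out):.2%}")
```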