Manual transcription of data from Indonesian identity cards (KTP) remains prevalent in public institutions, often resulting in inefficiencies and human errors that compromise data accuracy. Although Optical Character Recognition (OCR) technologies such as Tesseract have been widely adopted, their performance on KTP images remains inconsistent due to non-uniform layouts, low contrast, and background noise. This study proposes a dual-pipeline OCR framework designed to enhance the recognition accuracy of Indonesian KTPs under real-world conditions. The first pipeline performs static region segmentation based on predefined Regions of Interest (ROIs), while the second applies dynamic keyword heuristics to locate text adaptively across varying layouts. The outputs of both pipelines are merged through a voting and regex-based post-processing mechanism that includes character normalization and field validation against predefined dictionaries. Experiments were conducted on 78 annotated KTP samples of diverse resolution and image quality. Evaluation using Character Error Rate (CER), Word Error Rate (WER), and field-level accuracy yielded an average CER of 69.82%, a WER of 80.20%, and a character-level accuracy of 30.18%. While free-text fields such as address and occupation showed only moderate performance, structured fields achieved accuracy above 60%. The method runs efficiently in a CPU-only environment without requiring large annotated datasets, demonstrating its suitability for low-resource OCR deployment. Compared to conventional single-pipeline approaches, the proposed framework improves robustness across heterogeneous document layouts and illumination conditions. These findings highlight the potential of lightweight, rule-based OCR systems for practical e-KYC digitization and provide a foundation for integrating deep-learning-based layout detection in future research.
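For readers unfamiliar with the evaluation metrics, the following is a minimal illustrative sketch (not taken from the paper) of how CER and WER are conventionally computed from Levenshtein edit distance against a ground-truth transcription; the field values used in the example are hypothetical.

```python
# Illustrative sketch: standard Levenshtein-based CER/WER scoring of OCR output.

def levenshtein(ref, hyp):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance / reference length."""
    return levenshtein(list(reference), list(hypothesis)) / max(len(reference), 1)

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return levenshtein(ref_words, hyp_words) / max(len(ref_words), 1)

if __name__ == "__main__":
    gt = "NIK 3171234567890001"       # hypothetical ground-truth field
    ocr_out = "N1K 317I234567B90001"  # hypothetical OCR output
    print(f"CER = {cer(gt, ocr_out):.2%}, WER = {wer(gt, ocr_out):.2%}")
```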