Optical Character Recognition (OCR) technology plays an important role in automating information extraction from identity documents such as the Indonesian Electronic Identity Card (e-KTP). However, recognizing long text sequences and handling complex character variations remain significant challenges and can lead to high error rates. This study addresses these limitations by exploring a deep learning–based OCR model that integrates a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Connectionist Temporal Classification (CTC) in an end-to-end framework without explicit character segmentation. The CNN extracts visual features, the LSTM captures sequential dependencies, and CTC enables flexible alignment between input images and output text. The main contribution of this study lies in analysing the performance of the CNN-LSTM-CTC model in extracting e-KTP information across text categories of differing complexity, namely place and date of birth (TTL), name, and national identification number (NIK). Performance is evaluated using the Character Error Rate (CER). The results show that the model achieves the best performance on TTL with a CER of 0.84%, followed by NIK at 1.29% and name at 4.33%, indicating greater difficulty in recognizing more complex text patterns. These findings demonstrate that model performance is influenced by text characteristics, particularly variability and sequence length. Overall, the proposed approach is effective for end-to-end e-KTP information extraction and provides insights for developing more adaptive OCR models.
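The Character Error Rate used for evaluation can be understood as the Levenshtein edit distance between the predicted and reference strings, normalized by the reference length. The sketch below is a minimal Python illustration of this metric; the sample NIK strings are hypothetical and not drawn from the study's data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (ca != cb))) # substitution
        prev = curr
    return prev[-1]

def cer(prediction: str, reference: str) -> float:
    """Character Error Rate: edit distance normalized by reference length."""
    return levenshtein(prediction, reference) / len(reference)

# Hypothetical 16-digit NIK with one misrecognized digit:
print(round(cer("3201234567890124", "3201234567890123"), 4))  # 0.0625
```

A single substitution in a 16-character NIK yields a CER of 1/16 = 6.25%; the reported CERs of 0.84% to 4.33% thus correspond to, on average, well under one character error per field.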
Copyright © 2026