Kristanto, Samuel Miracle
Unknown Affiliation



Implementation of YOLOv12 and PaddleOCR for Indonesian Bank Statement Table Extraction
Kristanto, Samuel Miracle; Tanuwijaya, Evan
Sinkron : jurnal dan penelitian teknik informatika Vol. 10 No. 1 (2026): Article Research January 2026
Publisher : Politeknik Ganesha Medan

DOI: 10.33395/sinkron.v10i1.15383

Abstract

Growing reliance on digital financial documents has increased the need for automated methods that extract structured information from bank statements. Traditional optical character recognition (OCR) systems often fail to capture complex tabular structures, which leads to incomplete or error-prone transaction records. To address this, the study proposes a two-stage detection and recognition pipeline that combines YOLOv12 for detecting tables and their structural elements with PaddleOCR for text extraction, followed by automated conversion to Excel. The objective is to improve accuracy in localizing tables and detecting rows and columns, and to generate structured financial data that downstream applications can use directly. A YOLOv12-n model is trained in two stages: Stage 1 detects entire table regions, and Stage 2 identifies row and column structures within the detected tables. Training used the AdamW optimizer with conservative augmentation strategies to preserve the geometric integrity of document layouts. Stage 1 achieved a precision of 0.998, a recall of 1.0, and an mAP50-95 of 0.989, while Stage 2 achieved a precision of 0.992, a recall of 0.964, and an mAP50-95 of 0.899, demonstrating strong localization and structural recognition. These results indicate that the proposed two-stage pipeline is effective for financial document processing, with potential applications in digital banking, auditing, and automated record management. Future research may focus on expanding the dataset and addressing domain-specific variability.
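
The abstract does not say how detected rows, columns, and OCR text are merged before Excel conversion. The following is a minimal sketch of one plausible approach, not the authors' implementation: it assumes Stage 2 yields row and column boxes as (x1, y1, x2, y2) pixel tuples, that the OCR step yields (box, text) pairs, and that each cell holds a single line of text. The names `build_grid` and `center` and all sample values are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, origin at top-left.
Box = Tuple[float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Return the (x, y) center point of a box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def build_grid(rows: List[Box], cols: List[Box],
               words: List[Tuple[Box, str]]) -> List[List[str]]:
    """Place each OCR word in the cell whose row and column contain its center."""
    rows = sorted(rows, key=lambda b: b[1])  # top to bottom
    cols = sorted(cols, key=lambda b: b[0])  # left to right
    cells = [[[] for _ in cols] for _ in rows]
    for box, text in words:
        cx, cy = center(box)
        r = next((i for i, b in enumerate(rows) if b[1] <= cy <= b[3]), None)
        c = next((j for j, b in enumerate(cols) if b[0] <= cx <= b[2]), None)
        if r is not None and c is not None:  # words outside the table are dropped
            cells[r][c].append((cx, text))
    # Join words left to right; this assumes single-line cells.
    return [[" ".join(t for _, t in sorted(cell)) for cell in row] for row in cells]


# Hypothetical detections for a two-row, three-column statement fragment.
rows = [(0, 0, 300, 20), (0, 20, 300, 40)]
cols = [(0, 0, 100, 40), (100, 0, 200, 40), (200, 0, 300, 40)]
words = [
    ((5, 2, 60, 18), "01/01"),
    ((155, 2, 190, 18), "MASUK"),
    ((110, 2, 150, 18), "TRANSFER"),
    ((210, 22, 290, 38), "150.000"),
    ((5, 22, 60, 38), "02/01"),
]
grid = build_grid(rows, cols, words)
for line in grid:
    print(line)
```

The resulting list of lists could then be written to a spreadsheet with a library of choice. Matching on box centers rather than overlap area is a simplification that may misplace text when a detected box is loose or cells span several rows.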