ILKOM Jurnal Ilmiah
Vol 17, No 3 (2025)

Smart Verification of High School Student Reports Using Optical Character Recognition and BERT Models

Syahyadi, Asep Indra
Afif, Nur
Yusuf, Ahmad
Setiaji, Haris
Ridwang, Ridwang
Irfan, Mohammad



Article Info

Publish Date
09 Dec 2025

Abstract

This study proposes an intelligent framework for verifying high school report cards with diverse layouts by integrating Optical Character Recognition (OCR) with a fine-tuned BERT model. Whereas previous work primarily addresses documents with uniform structure, this research targets the heterogeneity of report cards, which differ across schools in subject arrangement, naming conventions, and grade presentation. The system was trained and evaluated on 1,000 Indonesian high school report card pages covering 20 subjects, both core (e.g., Mathematics, Indonesian History, Religious Education) and non-core (e.g., Arts and Culture, Physical Education). OCR extracts the textual content from scanned or image-based report cards, and BERT handles the contextual mapping between subjects and their corresponding grades. The dataset was split into 80% for training and 20% for validation, and the model was obtained by fine-tuning IndoBERT-base. In experiments, the proposed OCR–BERT pipeline achieved an average accuracy of 97.7%, with per-subject accuracies ranging from 96% to 99%. The model remained robust to inconsistent layouts and kept deviations between actual and detected grades small. Comparative analysis indicated that this hybrid approach outperforms traditional OCR-only and CNN-based methods, which are typically constrained by fixed-template assumptions and lack contextual understanding. The system is practically relevant for large-scale admission platforms such as SPAN-PTKIN, where manual verification of thousands of report cards is laborious and error-prone. By automating verification, the framework reduces human workload, improves accuracy, and supports fairer, data-driven admission decisions. Future research will explore multimodal integration of textual and visual features, expansion to broader datasets, and application to other academic documents such as transcripts and diplomas. Overall, this work contributes a scalable, accurate, and context-aware solution for educational data verification in heterogeneous document environments.
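
The evaluation protocol described in the abstract (an 80/20 split and per-subject accuracy) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the exact-match criterion for a "correct" grade, and the choice to average accuracy over subjects rather than over individual grades are all assumptions made for illustration, since the abstract does not specify them. The records in the usage lines are toy values, not data from the study.

```python
import random
from collections import defaultdict


def split_pages(pages, train_fraction=0.8, seed=0):
    """Shuffle pages and split them into training and validation subsets.

    Assumption: a simple random split; the abstract only states 80%/20%.
    """
    shuffled = list(pages)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def per_subject_accuracy(records):
    """Compute accuracy per subject.

    records: iterable of (subject, actual_grade, detected_grade) tuples.
    Assumption: a detection counts as correct only on an exact match.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, actual, detected in records:
        total[subject] += 1
        if actual == detected:
            correct[subject] += 1
    return {subject: correct[subject] / total[subject] for subject in total}


def average_accuracy(per_subject):
    """Unweighted mean of the per-subject accuracies (assumed definition)."""
    return sum(per_subject.values()) / len(per_subject)


# Usage with hypothetical toy records.
train, val = split_pages(range(1000))
toy_records = [
    ("Mathematics", 85, 85),
    ("Mathematics", 90, 80),       # detected grade differs from the actual one
    ("Arts and Culture", 88, 88),
]
scores = per_subject_accuracy(toy_records)
print(len(train), len(val))
print(scores, average_accuracy(scores))
```

Averaging over subjects weights each of the subjects equally regardless of how many pages list it. Averaging over all individual grades would be an equally plausible reading of "average accuracy" and could give a slightly different figure.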

Copyrights © 2025

Journal Info

Abbrev

ILKOM

Subject

Computer Science & IT

Description

ILKOM Jurnal Ilmiah is an Indonesian scientific journal published by the Department of Information Technology, Faculty of Computer Science, Universitas Muslim Indonesia. It covers all aspects of the latest outstanding research and developments in the field of computer science, ...