Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Vol 9 No 6 (2025): December 2025

Investigating Shallow Learning Methods for Optical Character Recognition of Indonesia’s Nusantara Scripts

Sulistiyo, Mahmud Dwi
Putrada, Aji Gautama
Ihsan, Aditya Firman
Yunanto, Prasti Eko
Richasdy, Donny
Sailellah, Hassan Rizky Putra
Sabrina Adinda Sari



Article Info

Publish Date
11 Jan 2026

Abstract

Indonesia has numerous regional scripts, the so-called Nusantara scripts, and recognizing them is important for preserving Indonesia's cultural heritage. Advances in AI and computer vision make it possible for a machine to read handwritten scripts through Optical Character Recognition (OCR). However, a comprehensive comparison of leading OCR methods on the Nusantara scripts is currently lacking. This study evaluates several shallow learning-based methods on our newly introduced datasets, which comprise more than 38,000 character images across 80 letter classes in total and cover three regional scripts: Javanese, Sundanese, and Balinese. The methods include Random Forest, SVM, Logistic Regression, and Gaussian Naïve Bayes, as well as the boosting techniques XGBoost, LightGBM, and CatBoost. Model performance was assessed with 5-fold cross-validation in terms of accuracy, precision, recall, and F1-score. In the experiments, the methods proved competitive, with the best model differing by script: XGBoost, LightGBM, and Random Forest-Gini performed best on the Javanese, Sundanese, and Balinese scripts, respectively. These findings demonstrate the effectiveness of ensemble learning methods for diverse handwritten scripts. A comparative analysis with prior deep learning studies is also discussed. In addition, this research contributes to preserving Indonesian traditional scripts and offers insights for future regional OCR work in other countries.
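The evaluation protocol named in the abstract (5-fold cross-validation scored by accuracy, precision, recall, and F1-score) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' pipeline: the stratified fold assignment, the macro averaging of per-class metrics, the toy feature vectors, and the `nearest_centroid` stand-in classifier are all assumptions, since the abstract does not specify them. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def stratified_folds(labels: Sequence[int], k: int = 5) -> List[List[int]]:
    """Split sample indices into k folds, round-robin within each class."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def macro_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    prec_sum = rec_sum = f1_sum = 0.0
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        prec_sum += prec
        rec_sum += rec
        f1_sum += 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    n = len(classes)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": prec_sum / n,
        "recall": rec_sum / n,
        "f1": f1_sum / n,
    }


def nearest_centroid(train_X: List[Vector], train_y: List[int],
                     test_X: List[Vector]) -> List[int]:
    """Stand-in classifier (not one of the study's methods): nearest class mean."""
    counts = Counter(train_y)
    sums: Dict[int, Vector] = {}
    for x, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for d, value in enumerate(x):
            acc[d] += value
    centroids = {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in test_X]


def cross_validate(X: List[Vector], y: List[int],
                   fit_predict: Callable[[List[Vector], List[int], List[Vector]], List[int]],
                   k: int = 5) -> Dict[str, float]:
    """Train on k-1 folds, test on the held-out fold, and average the metrics."""
    folds = stratified_folds(y, k)
    per_fold = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        preds = fit_predict([X[j] for j in train_idx],
                            [y[j] for j in train_idx],
                            [X[j] for j in test_idx])
        per_fold.append(macro_scores([y[j] for j in test_idx], preds))
    return {m: sum(s[m] for s in per_fold) / k for m in per_fold[0]}


# Toy data: two well-separated classes standing in for character features.
X = [[float(i), 0.0] for i in range(10)] + [[100.0 + i, 0.0] for i in range(10)]
y = [0] * 10 + [1] * 10
scores = cross_validate(X, y, nearest_centroid, k=5)
print(scores)
```

Any classifier that follows the same `fit_predict(train_X, train_y, test_X)` shape can be dropped in place of the stand-in. In practice, scikit-learn's `StratifiedKFold` and its metric functions would typically replace the hand-written helpers.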

Copyright © 2025






Journal Info

Abbrev

RESTI

Subject

Computer Science & IT Engineering

Description

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is intended as a medium for scholarly study of research results, ideas, and critical-analytical studies concerning research in Systems Engineering, Informatics Engineering/Information Technology, Informatics Management, and Information Systems. As part of the spirit ...