Sabrina Adinda Sari
Unknown Affiliation

Published: 1 document
Articles

Investigating Shallow Learning Methods for Optical Character Recognition of Indonesia’s Nusantara Scripts
Sulistiyo, Mahmud Dwi; Putrada, Aji Gautama; Ihsan, Aditya Firman; Yunanto, Prasti Eko; Richasdy, Donny; Sailellah, Hassan Rizky Putra; Sabrina Adinda Sari
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 9 No 6 (2025): December 2025
Publisher: Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v9i6.6648

Abstract

Indonesia has numerous regional scripts, the so-called Nusantara scripts, and recognizing them is important for preserving Indonesia's cultural heritage. Advances in AI and computer vision make it possible for a machine to read handwritten scripts optically through Optical Character Recognition (OCR). However, a comprehensive investigation that gathers leading OCR solutions and compares their performance on the Nusantara scripts is currently lacking. This study evaluates several shallow learning-based methods on our newly introduced datasets, which together contain more than 38,000 character images across 80 letter classes and cover three regional scripts: Javanese, Sundanese, and Balinese. The methods include Random Forest, SVM, Logistic Regression, and Gaussian Naïve Bayes, as well as the boosting techniques XGBoost, LightGBM, and CatBoost. Model performance was assessed with 5-fold cross-validation in terms of accuracy, precision, recall, and F1-score. In the experiments, the methods proved competitive, with a different method performing best on each script: XGBoost on Javanese, LightGBM on Sundanese, and Random Forest with the Gini criterion on Balinese. These findings demonstrate the effectiveness of ensemble learning methods for diverse handwritten scripts. The paper also discusses a comparative analysis with prior deep learning studies. In addition, this research contributes to preserving Indonesian traditional scripts and offers insights for future regional OCR work in other countries.
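
The evaluation protocol named in the abstract (5-fold cross-validation scored by accuracy, precision, recall, and F1-score) can be illustrated with a minimal, standard-library-only Python sketch. Everything in it is an assumption made for illustration and not the authors' code: the function names, the synthetic two-feature data, the macro averaging of per-class metrics (the abstract does not state the averaging mode), and the nearest-centroid classifier, which is only a stand-in and is not one of the methods evaluated in the study.

```python
import random
from collections import defaultdict


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, average the scores."""
    folds = k_fold_indices(len(X), k, seed)
    per_fold = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        y_pred = [predict(model, X[j]) for j in test_idx]
        per_fold.append(macro_scores([y[j] for j in test_idx], y_pred))
    return {m: sum(s[m] for s in per_fold) / k for m in per_fold[0]}


# Stand-in classifier (NOT one of the study's methods): nearest centroid.
def fit_centroids(X, y):
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        prev = sums.get(label, [0.0] * len(x))
        sums[label] = [s + v for s, v in zip(prev, x)]
        counts[label] += 1
    return {c: [s / counts[c] for s in vec] for c, vec in sums.items()}


def predict_centroid(centroids, x):
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(centroids[c], x))
    return min(centroids, key=sq_dist)


# Synthetic, well-separated data: 3 classes, 10 samples each (illustrative only).
X = [[c * 10 + (i % 5) * 0.1, c * 10 - (i % 5) * 0.1]
     for c in range(3) for i in range(10)]
y = [c for c in range(3) for _ in range(10)]

scores = cross_validate(X, y, fit_centroids, predict_centroid, k=5)
print(scores)
```

In practice, the `fit` and `predict` callables would wrap whichever classifier is being compared, and the features would come from the character images instead of synthetic points. A stratified split may be preferable when some letter classes have few samples.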