The digital preservation of historical writing systems like Aksara Sunda is critical for cultural heritage, yet automated recognition is hindered by high character similarity and handwriting variability. This study systematically compares two dominant deep learning paradigms, Convolutional Neural Networks (CNNs) and Vision Transformers, to evaluate the crucial trade-off between model accuracy and real-world robustness. Using a transfer learning approach, we trained five models (ResNet50, MobileNetV2, EfficientNetB0, ViT, and DeiT) on a balanced 30-class dataset of Sundanese script. Performance was assessed on a standard in-distribution test set and on a challenging, independently collected Out-of-Distribution (OOD) dataset designed to simulate varied real-world conditions. The results reveal a significant performance inversion. While EfficientNetB0 achieved the highest in-distribution accuracy of 96.9%, its performance plummeted on the OOD set. Conversely, ResNet50, despite lower in-distribution accuracy, proved to be the most robust model, achieving the highest OOD accuracy of 92.5%. This study concludes that for practical applications requiring reliable performance, the generalization capability demonstrated by ResNet50 is more valuable than the specialized accuracy of EfficientNetB0, a key insight for developing robust digital preservation tools for historical scripts.
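The sketch below illustrates the kind of transfer-learning setup the abstract describes: an ImageNet-pretrained backbone with its classification head replaced by a 30-class output layer. This is a minimal illustration assuming PyTorch/torchvision; the framework, hyperparameters, and variable names (e.g. `num_classes`, the learning rate) are not specified in the abstract and are assumptions here.

```python
# Hypothetical sketch of transfer learning for 30-class Aksara Sunda
# recognition; framework choice and hyperparameters are illustrative,
# not taken from the paper.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 30  # balanced 30-class Sundanese script dataset

# Load an ImageNet-pretrained ResNet50 and swap its final layer
# for a new head matching the number of script classes.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Fine-tune all layers with a small learning rate (one common choice
# for transfer learning); cross-entropy suits multi-class recognition.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
```

The same pattern applies to the other backbones compared in the study (e.g. replacing the classifier of MobileNetV2 or the head of a ViT), with evaluation run separately on the in-distribution test set and the OOD set.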