Batik is a cultural heritage of Indonesia that reflects local philosophies and identities through its diverse motifs. In the digital era, automatic classification of batik patterns plays a crucial role in cultural preservation, education, and commercialization. This study aims to develop a batik motif classification system using the Vision Transformer (ViT), a self-attention-based deep learning architecture capable of capturing global spatial relationships in images. The dataset comprises 800 images spanning 20 batik motif classes from various regions, divided into training and testing subsets. The ViT model was fine-tuned from weights pretrained on ImageNet-21k, with standard preprocessing and data augmentation applied to the training set. Model performance was evaluated using accuracy, precision, recall, F1-score, the confusion matrix, and prediction visualization. Results indicate that ViT achieved an overall accuracy of 96%, with most classes recording F1-scores above 0.90. Evaluation on previously unseen batik images demonstrated strong generalization, with the model assigning 99.94% confidence to its prediction. These findings suggest that ViT is an effective and efficient architecture for batik motif classification and can contribute meaningfully to cultural preservation through artificial intelligence.
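To illustrate the fine-tuning setup described above, the following is a minimal sketch (not the authors' code) of adapting a ViT pretrained on ImageNet-21k to 20 batik motif classes. The checkpoint name, dataset path, augmentations, and hyperparameters are assumptions for illustration only.

```python
# Minimal sketch of fine-tuning an ImageNet-21k-pretrained ViT for 20 batik classes.
# Paths, checkpoint, and hyperparameters are illustrative assumptions.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from transformers import ViTForImageClassification

# Standard preprocessing plus light augmentation on the training set.
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

train_set = datasets.ImageFolder("batik/train", transform=train_tf)  # hypothetical path
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)

# ViT-Base pretrained on ImageNet-21k, with a new 20-class classification head.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k", num_labels=20
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(5):  # illustrative epoch count
    for images, labels in train_loader:
        outputs = model(pixel_values=images, labels=labels)  # cross-entropy loss computed internally
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

After training, per-class precision, recall, and F1-scores as reported in the study can be computed from the test-set predictions with a standard classification report (e.g., scikit-learn's `classification_report`).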