Unsupervised Clustering of Handwritten Essay Answer Images Using Vision Transformer
Mohamad Asyqari Anugrah; Yaya Wihardi; Rani Megasari
Jurnal Komputer Teknologi Informasi Sistem Informasi (JUKTISI) Vol. 4 No. 2 (2025): September 2025
Publisher: LKP KARYA PRIMA KURSUS

DOI: 10.62712/juktisi.v4i2.517

Abstract

This study explores deep clustering methods for automatically grouping handwritten essay answer sheets by their visual patterns. Features were extracted with three backbone models: ResNet-50, Vision Transformer (ViT-base), and Tr-OCR. The features were then clustered with two unsupervised algorithms: K-means (k = 5) and HDBSCAN (minimum cluster size = 10). To improve clustering performance, a deep clustering approach was implemented in which K-means is applied iteratively to refine the feature representations. Evaluation was both quantitative, using the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Score, and qualitative, through t-SNE visualizations and inspection of cluster contents. The ViT and Tr-OCR backbones outperformed the CNN-based ResNet-50, achieving higher cluster cohesion and separation. Notably, the final clustering result using ViT with HDBSCAN reached a Silhouette Score of 0.772, a Davies-Bouldin Index of 0.369, and a Calinski-Harabasz Score of 408.006. The findings indicate that vision-transformer-based models are more effective than the CNN baseline for unsupervised grouping of handwritten visual data. This approach can help educators grade faster and more objectively, and may serve as a foundation for future automated essay evaluation systems that integrate OCR and NLP techniques.
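
The clustering and evaluation steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation. It assumes synthetic 2-D points in place of the ResNet-50, ViT, or Tr-OCR embeddings, uses a plain Lloyd's K-means with farthest-point seeding rather than the iterative deep clustering procedure, and omits HDBSCAN entirely. All function names are hypothetical. The three metrics follow their standard textbook definitions: a higher Silhouette Score and Calinski-Harabasz Score, and a lower Davies-Bouldin Index, indicate tighter and better-separated clusters.

```python
import math
import random
from collections import defaultdict


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    return clusters


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j, members in group_by_label(points, labels).items():
            centers[j] = centroid(members)
    return labels


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points; needs at least 2 clusters."""
    clusters = group_by_label(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin_index(points, labels):
    """Average, over clusters, of the worst (scatter_i + scatter_j) / centroid distance."""
    clusters = group_by_label(points, labels)
    cents = {lab: centroid(ps) for lab, ps in clusters.items()}
    scatter = {lab: sum(math.dist(p, cents[lab]) for p in ps) / len(ps)
               for lab, ps in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in cents if j != i) for i in cents]
    return sum(worst) / len(worst)


def calinski_harabasz_score(points, labels):
    """Ratio of between-cluster to within-cluster dispersion, each per degree of freedom."""
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    cents = {lab: centroid(ps) for lab, ps in clusters.items()}
    between = sum(len(ps) * math.dist(cents[lab], overall) ** 2 for lab, ps in clusters.items())
    within = sum(math.dist(p, cents[lab]) ** 2 for lab, ps in clusters.items() for p in ps)
    return (between / (k - 1)) / (within / (n - k))


if __name__ == "__main__":
    # Synthetic stand-in for image embeddings: five well-separated 2-D blobs.
    rng = random.Random(0)
    blob_centers = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 20)]
    feats = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
             for cx, cy in blob_centers for _ in range(20)]
    labels = kmeans(feats, k=5)  # k = 5 mirrors the setting in the abstract
    print("silhouette:", silhouette_score(feats, labels))
    print("davies-bouldin:", davies_bouldin_index(feats, labels))
    print("calinski-harabasz:", calinski_harabasz_score(feats, labels))
```

In practice the feature vectors would come from a pretrained backbone, and a library such as scikit-learn would typically supply both the clustering algorithms and the metrics. The pure-Python versions here only make the definitions explicit.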