Math word problems (MWPs) are a fundamental component of elementary mathematics education, integrating linguistic comprehension with quantitative reasoning. In practice, collections of MWPs are commonly organized by teacher intuition or broad curriculum categories, which are subjective and often fail to reflect the mathematical similarity between problems. This study groups Indonesian arithmetic word problems by their underlying relational structures using Hierarchical Agglomerative Clustering (HAC). The dataset consists of 897 elementary-level arithmetic word problems, each represented by 143 binary features encoding five relational dimensions: combine, change, compare, equal groups, and fair division. Hamming distance is used as the dissimilarity metric, and clustering is performed with the complete linkage method. Candidate numbers of clusters are evaluated using three internal validity indices: the Calinski–Harabasz Index, the Silhouette Score, and the Davies–Bouldin Index. Although these statistical indices favor smaller cluster configurations, four clusters are selected on the basis of domain-specific interpretability, because they align with established theoretical categories of arithmetic relational structures. The approach identifies latent structural patterns within the dataset and demonstrates the potential of feature-based binary representation combined with HAC for systematic MWP classification. The findings offer practical support for adaptive problem bank development, automated curriculum analysis, and intelligent tutoring system design.
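The pipeline described above (binary feature vectors, Hamming distance, complete-linkage HAC, internal validity index) can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration: the toy data (40 synthetic items with 20 features in four groups), the function names `hamming`, `complete_linkage`, and `silhouette`, and the use of the normalized Hamming distance. It is not the study's code or data, and it computes only the Silhouette Score, not the Calinski–Harabasz or Davies–Bouldin indices.

```python
from itertools import combinations

def hamming(a, b):
    """Normalized Hamming distance: fraction of positions that differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def complete_linkage(X, k):
    """Naive agglomerative clustering; merges clusters until k remain."""
    n = len(X)
    d = [[hamming(X[i], X[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(pair):
        # Complete linkage: cluster distance is the largest pairwise distance.
        p, q = pair
        return max(d[i][j] for i in clusters[p] for j in clusters[q])

    while len(clusters) > k:
        p, q = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[p].extend(clusters[q])  # p < q, so deleting q is safe
        del clusters[q]

    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels, d

def silhouette(d, labels):
    """Mean silhouette over all points from a precomputed distance matrix."""
    n = len(labels)
    total = 0.0
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            continue  # singleton cluster contributes a silhouette of 0
        a = sum(d[i][j] for j in own) / len(own)
        b = min(
            sum(d[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n

# Hypothetical toy data: 4 groups of 10 items, 20 binary features.
# Each group switches on its own block of 5 features; one bit per item is
# flipped so that items within a group are similar but not identical.
N_FEATURES, PER_GROUP, N_GROUPS = 20, 10, 4
X = []
for i in range(N_GROUPS * PER_GROUP):
    g = i // PER_GROUP
    row = [1 if 5 * g <= f < 5 * (g + 1) else 0 for f in range(N_FEATURES)]
    row[i % N_FEATURES] ^= 1
    X.append(row)

# Scan candidate cluster counts and record the silhouette for each.
scores = {}
for k in range(2, 7):
    labels_k, dist = complete_linkage(X, k)
    scores[k] = silhouette(dist, labels_k)

labels, dist = complete_linkage(X, 4)
for k, s in sorted(scores.items()):
    print(k, round(s, 3))
```

The naive merge loop is chosen for readability and would be slow on hundreds of items; on a dataset of the size reported here, a library routine such as SciPy's `linkage` with `method="complete"` on `pdist(..., metric="hamming")` distances would be the more practical choice. The synthetic groups are deliberately well separated, so the scores printed by this sketch say nothing about the index behavior reported for the real dataset.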
Copyright © 2026