This research is motivated by the importance of high-quality evaluation instruments in elementary school mathematics learning, particularly for calculations involving plane figures, which demand conceptual accuracy and numeracy skills from students. Instruments that do not meet quality standards can produce inaccurate measurements, which in turn affect instructional decisions. The purpose of this study was to analyze the quality of multiple-choice test items administered to third-grade elementary school students in terms of validity, reliability, difficulty index, and discriminating power. The study used a quantitative descriptive approach, with third-grade students at SD Negeri Inpres Kotabaru in Nabire as the subjects. Data were collected using an objective multiple-choice test with five answer options. Validity was analyzed with the Pearson product-moment correlation, reliability with Cronbach's alpha coefficient, and item quality with the difficulty index and discriminating power. The results showed that all test items were valid: each calculated r value exceeded the critical (table) r value, with significance below 0.05. The instrument also had very high reliability, with a Cronbach's alpha of 0.946, indicating strong internal consistency. The difficulty index analysis showed that the majority of items fell into the moderate category, with the remainder in the easy and very easy categories, so the test accommodates diverse student ability levels. The discriminating power analysis showed that the majority of items fell into the good and excellent categories, meaning they distinguish effectively between students of differing ability. These findings contribute to the development of high-quality, empirically analyzed evaluation instruments in elementary school mathematics learning. The study concludes that the developed instrument meets high quality standards and is suitable for use as a learning evaluation tool.
The implications of this study include strengthening data-driven evaluation practices, increasing the accuracy of learning outcome assessments, and developing instruments more systematically. Recommendations for further research focus on testing the instrument with broader samples and on developing technology-based evaluation models to enhance learning effectiveness.
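As an illustration of the four item statistics named above, the following is a minimal Python sketch, not the authors' actual procedure. It assumes items are scored dichotomously (1 = correct, 0 = incorrect), uses the uncorrected item-total correlation for validity, and splits students into upper and lower halves for discriminating power; the abstract does not state these choices. The function names and the small `scores` matrix are hypothetical and are not data from the study. The comparison of each r value against the table r (the significance check) is not shown.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def item_total_correlation(scores, j):
    """Validity estimate: correlation of item j with each student's total score."""
    item = [row[j] for row in scores]
    totals = [sum(row) for row in scores]
    return pearson(item, totals)


def cronbach_alpha(scores):
    """Cronbach's alpha from item variances and the variance of total scores."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def difficulty_index(scores, j):
    """Proportion of students who answered item j correctly."""
    return mean([row[j] for row in scores])


def discriminating_power(scores, j, fraction=0.5):
    """Difference in item-j success rate between top and bottom groups.

    The group size (here one half each) is an assumption; 27% is another
    common convention.
    """
    ranked = sorted(scores, key=sum, reverse=True)
    n = max(1, int(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return mean([r[j] for r in upper]) - mean([r[j] for r in lower])


# Hypothetical responses: 6 students (rows) x 4 items (columns).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    for j in range(len(scores[0])):
        print(
            j,
            round(item_total_correlation(scores, j), 3),
            round(difficulty_index(scores, j), 3),
            round(discriminating_power(scores, j), 3),
        )
    print("alpha:", round(cronbach_alpha(scores), 3))
```

In practice each statistic would then be mapped to a category (for example easy, moderate, or difficult) using cutoffs from the measurement literature; the abstract does not list the cutoffs used, so none are assumed here.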
Copyrights © 2026