Effective learning evaluation depends on high-quality assessment instruments. This study analyzes the quality of the essay test items in the Grade VIII Javanese Language Final Semester Examination in Sukoharjo Regency for the 2023/2024 academic year in terms of validity, reliability, difficulty level, and discrimination index. The research used a quantitative descriptive method. The data comprised the test questions, test blueprints, answer keys, scoring rubrics, and the answer sheets of 64 students selected through simple random sampling. The analysis followed the Classical Test Theory approach and covered validity testing (Product Moment correlation), reliability testing (Cronbach’s Alpha), item difficulty analysis, and discrimination index analysis. The results showed that all test items were invalid (r-count < r-table at the 0.05 significance level), although the instrument demonstrated very high reliability (0.990). All items were categorized as easy, and most had low or even negative discrimination indices. These findings indicate that the instrument neither measures students’ competencies accurately nor differentiates effectively between students’ ability levels. A comprehensive revision of the test is therefore necessary before it is used as an official evaluation instrument.
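The four Classical Test Theory statistics named above can be illustrated with a minimal sketch. It assumes the textbook formulas for polytomous (essay) items: item-total Product Moment correlation for validity, Cronbach's Alpha for reliability, mean score divided by maximum score for difficulty, and the upper-minus-lower group mean difference divided by maximum score for discrimination. The 27% group split, the function names, and the `scores` matrix are hypothetical choices made for illustration; they are not the study's data or a confirmed description of its exact procedure.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson_r(x, y):
    """Product Moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sx * sy)


def cronbach_alpha(scores):
    """Cronbach's Alpha for a students-by-items score matrix.

    Population variance is used throughout; the result equals the
    sample-variance version because the correction factor cancels.
    """
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def difficulty_index(scores, item, max_score):
    """Mean item score as a proportion of the maximum (higher = easier)."""
    return mean(row[item] for row in scores) / max_score


def discrimination_index(scores, item, max_score, fraction=0.27):
    """Upper-group mean minus lower-group mean, scaled by the maximum score.

    Groups are the top and bottom `fraction` of students by total score
    (the 27% split is an assumed convention).
    """
    ranked = sorted(scores, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper = mean(row[item] for row in ranked[:n])
    lower = mean(row[item] for row in ranked[-n:])
    return (upper - lower) / max_score


# Hypothetical data: 6 students x 3 essay items, each scored out of 10.
scores = [
    [9, 8, 10],
    [8, 8, 9],
    [7, 6, 8],
    [6, 7, 7],
    [5, 4, 6],
    [3, 4, 5],
]

if __name__ == "__main__":
    totals = [sum(row) for row in scores]
    for i in range(len(scores[0])):
        item = [row[i] for row in scores]
        print(
            f"item {i + 1}:",
            f"r = {pearson_r(item, totals):.3f},",
            f"difficulty = {difficulty_index(scores, i, 10):.3f},",
            f"discrimination = {discrimination_index(scores, i, 10):.3f}",
        )
    print(f"Cronbach's Alpha = {cronbach_alpha(scores):.3f}")
```

In an analysis of this kind, each item's r would then be compared with the critical r-table value for the sample size at the 0.05 level, and the difficulty and discrimination values would be mapped to categories using whichever cut-offs the study adopts.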
Copyright © 2026