This study analyzes the reliability, discriminatory power, difficulty level, and overall quality of the year-end assessment (PAT) items for the Islamic Cultural History (SKI) subject at MI Supervisor Surabaya. The research used a descriptive qualitative method, with 4th-grade students as the subjects. The instrument analyzed consisted of 25 multiple-choice questions from the SKI PAT. The 4th-grade level was chosen because few previous studies have analyzed test items at this level and because teachers at the school were available to collaborate. Data were analyzed in three stages: data reduction, data presentation, and conclusion drawing. The reliability coefficient of the instrument was 0.88, indicating very high reliability; that is, the test would be expected to produce consistent results if administered repeatedly. The discriminatory power analysis showed that 11 items were of sufficient quality and 3 items were categorized as good, while item number 12 showed the lowest quality and was unsuitable for use. Based on the item analysis recap, 21 items (52.5%) were considered valid and usable, 5 items (12.5%) required revision, and 14 items (35%) were deemed unsuitable for inclusion in the assessment.
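The three item statistics named above can be illustrated with a minimal sketch. The abstract does not state which formulas were used, so this sketch assumes KR-20 for reliability, the proportion of correct answers for difficulty, and the upper/lower 27% group method for discriminatory power. The function names and the small 0/1 score matrix are hypothetical and are not the study's data.

```python
from statistics import pvariance

def difficulty_index(scores, item):
    """Proportion of students who answered the item correctly (0 to 1)."""
    return sum(row[item] for row in scores) / len(scores)

def discrimination_index(scores, item, fraction=0.27):
    """Upper-group minus lower-group proportion correct.

    Assumption: groups are the top and bottom `fraction` of students
    ranked by total score (27% is a common convention, not taken from
    the study).
    """
    ranked = sorted(scores, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / k

def kr20(scores):
    """Kuder-Richardson 20 reliability for dichotomous (0/1) items.

    Assumption: population variance of total scores is used.
    """
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    var_total = pvariance(totals)
    pq_sum = 0.0
    for i in range(n_items):
        p = difficulty_index(scores, i)
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)

# Hypothetical data: 6 students (rows) x 4 items (columns), 1 = correct.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print("KR-20:", round(kr20(scores), 3))
    for i in range(len(scores[0])):
        print(i + 1, difficulty_index(scores, i), discrimination_index(scores, i))
```

Each item's indices would then be compared against whatever category thresholds the study adopts (for example, the cut-offs separating "sufficient" from "good" discriminatory power), which the abstract does not specify.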
Copyright © 2025