Learning evaluation is central to measuring students' attainment of competencies. In Islamic Education (PAI), evaluation instruments are often weakened by an unbalanced distribution of difficulty levels and by bias in item quality. This article examines the basic concepts, classification, implementation, and follow-up of item difficulty analysis in PAI subjects. Using a conceptual literature review, it analyzes psychometric theories relevant to the broad range of PAI material, which spans textual cognitive content through Higher Order Thinking Skills (HOTS). The review indicates that difficulty analysis serves a diagnostic function: it helps assess both the fairness of an exam and the effectiveness of the teacher's pedagogy. An ideal application of the difficulty index (P) aims for a balanced proportion of easy, medium, and difficult questions. The findings point to the need for a Standard Operating Procedure (SOP) for following up on evaluation results, for example by revising item wording from a multicultural perspective and by storing high-quality items in a question bank, so that PAI evaluation follows a cycle of continuous quality improvement.
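
As a brief illustration of the difficulty index (P) mentioned above, the sketch below computes P in its classical form, as the proportion of test takers who answer an item correctly, and then sorts the item into one of the three categories. The cutoffs of 0.30 and 0.70 are a commonly cited convention and are assumed here, not taken from this article; the function names and the response data are hypothetical.

```python
from typing import Sequence


def difficulty_index(responses: Sequence[int]) -> float:
    """Return P, the proportion of correct answers (1 = correct, 0 = incorrect)."""
    if not responses:
        raise ValueError("responses must not be empty")
    return sum(responses) / len(responses)


def classify(p: float, hard_below: float = 0.30, easy_above: float = 0.70) -> str:
    """Label an item by its P value.

    The default cutoffs are an assumed convention, not values from the article.
    """
    if p < hard_below:
        return "difficult"
    if p > easy_above:
        return "easy"
    return "medium"


# Hypothetical responses of ten students to a single PAI item.
item_responses = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
p = difficulty_index(item_responses)
print(f"P = {p:.2f} ({classify(p)})")
```

A higher P means an easier item, so an instrument with a balanced proportion would contain items from all three categories; the exact target mix is left open here because the abstract does not state one.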
Copyrights © 2026