This evaluative study analyzes the quality of Fiqh exam items for 12th-grade students at Madrasah Aliyah Negeri 1 Surakarta using the Rasch model. The model was chosen because it evaluates test items and test-taker ability simultaneously and addresses limitations of classical test theory. The research subjects were 332 students, a sample that meets the minimum size for Rasch analysis according to Linacre (1994). Data were collected by documenting final exam results and analyzed with the Winsteps software. The main component evaluated is item difficulty, and the analysis also considers the validity and reliability of the instrument. Validity was assessed through item fit (Outfit MNSQ, Outfit ZSTD, and Pt Measure Corr), unidimensionality, and bias detection using differential item functioning (DIF) and differential person functioning (DPF). The results showed that two items (S17 and S27) were outliers in terms of difficulty, two items (S17 and S39) were classified as misfits, and 19 items were identified as biased. This indicates that some items were unfair and inaccurate in measuring students' abilities. In terms of reliability, the person reliability of 0.79 indicated fair consistency in student responses, while the item reliability of 0.99 suggested that the items were highly reliable in distinguishing levels of ability. The Cronbach's alpha of 0.89 indicated very good internal consistency. The study recommends revising the problematic items to improve the validity and fairness of the evaluation instrument.
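For readers unfamiliar with the item-fit statistics named above, the following is a minimal sketch of how an Outfit MNSQ value can be computed for one item under the dichotomous Rasch model. It is an illustration only, not the procedure used in the study: the function names `rasch_probability` and `outfit_mnsq` are hypothetical, and the sketch assumes that person abilities and the item difficulty (in logits) are already known, whereas dedicated Rasch software estimates them from the response data.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct answer under the dichotomous Rasch model.

    theta: person ability in logits (assumed known here)
    b:     item difficulty in logits (assumed known here)
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def outfit_mnsq(responses, thetas, b):
    """Outfit mean-square for one item.

    This is the unweighted mean of squared standardized residuals
    over all persons who answered the item.
    responses: list of 0/1 scores, one per person
    thetas:    list of person abilities, same order as responses
    """
    total = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)   # expected score
        variance = p * (1.0 - p)          # model variance of the score
        total += (x - p) ** 2 / variance  # squared standardized residual
    return total / len(responses)


# Hypothetical data: four persons whose ability equals the item difficulty.
print(outfit_mnsq([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0))
```

When the data match the model, the statistic is expected to be close to 1.0. An unexpected response, such as a high-ability student failing an easy item, produces a large standardized residual and pushes the value upward, which is why Outfit MNSQ is sensitive to outlying responses.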
Copyright © 2025