This study analyzes the quality of test items in a psycholinguistics course using Classical Test Theory. The data consisted of 30 students’ responses to 20 multiple-choice test items, which were analyzed against several indicators: difficulty level, discriminating power, item-total correlation, and instrument reliability. The results showed that most items fell in the moderate difficulty category (p = 0.46–0.56), with one item categorized as easy (p = 0.73) and two items as difficult (p = 0.26). The discriminating power of the majority of items was in the very good category (87.5%–100%), while three items showed lower discriminating power and required revision. The item-total correlation was generally very high (r = 0.88–0.99), indicating consistency among items, but several items with lower correlations (r < 0.70) suggested possible wording inaccuracies or content inconsistencies. The test’s reliability reached 0.99, indicating very high internal consistency, although this value was influenced by the rather extreme difference in response patterns between the upper and lower groups. Overall, the test instrument was judged good, but several items need revision, particularly in their distractors, difficulty level, and item functionality, to ensure more accurate and representative learning evaluation.
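The four indicators named above have standard Classical Test Theory formulas: the difficulty index is the proportion of correct responses, the discrimination index contrasts upper- and lower-scoring groups, item-total consistency is a point-biserial (Pearson) correlation, and reliability for dichotomous items is commonly estimated with KR-20. The following is a minimal sketch of how these could be computed from a 0/1 response matrix; it is not the authors' analysis code, and the 27% upper/lower grouping is a common convention assumed here rather than stated in the abstract.

```python
from math import sqrt

def item_stats(responses, group_frac=0.27):
    """Classical Test Theory item analysis on a 0/1 response matrix.

    responses: one list per student, 1 = correct, 0 = incorrect.
    Returns (difficulty, discrimination, item_total_r, kr20).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]

    # Difficulty index p: proportion of students answering each item correctly.
    difficulty = [sum(row[j] for row in responses) / n_students
                  for j in range(n_items)]

    # Discrimination index D: p(upper group) - p(lower group), where the
    # groups are the top and bottom `group_frac` of students by total score.
    k = max(1, round(group_frac * n_students))
    ranked = sorted(range(n_students), key=lambda i: totals[i], reverse=True)
    upper, lower = ranked[:k], ranked[-k:]
    discrimination = [
        sum(responses[i][j] for i in upper) / k
        - sum(responses[i][j] for i in lower) / k
        for j in range(n_items)
    ]

    # Item-total (point-biserial) correlation: Pearson r between each
    # item's 0/1 scores and the students' total scores.
    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        return cov / sqrt(vx * vy) if vx and vy else 0.0

    item_total_r = [pearson([row[j] for row in responses], totals)
                    for j in range(n_items)]

    # KR-20 reliability: (k/(k-1)) * (1 - sum(p*q) / variance of totals).
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    pq = sum(p * (1 - p) for p in difficulty)
    kr20 = (n_items / (n_items - 1)) * (1 - pq / var_total) if var_total else 0.0

    return difficulty, discrimination, item_total_r, kr20
```

For example, on a small 4-student, 3-item matrix, `item_stats` returns the per-item difficulty and discrimination indices together with a single KR-20 value, mirroring the indicators the study reports for its 30-student, 20-item data set.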
Copyright © 2025