Evaluating students’ abilities in educational settings is crucial for assessing learning outcomes and instructional effectiveness. In Indonesia, many schools have developed local English language assessments, yet these tests often lack psychometric validation. This study aims to evaluate the quality of a teacher-developed English language test instrument using the Item Response Theory (IRT) approach. A total of 25 multiple-choice items created by the English teacher group in Muna Regency were administered to 162 students from five randomly selected schools. A descriptive quantitative method was employed, with RStudio used for data analysis. Initial sampling adequacy was confirmed using the Kaiser-Meyer-Olkin measure (KMO = 0.686) and Bartlett’s Test of Sphericity (p < .001). Model fit was compared across the 1-PL, 2-PL, and 3-PL logistic models, and the 2-PL model emerged as the most appropriate, with 16 items demonstrating good fit. Further analysis of item characteristics under the 2-PL model revealed that only 11 items had acceptable difficulty and discrimination indices, while the remaining 14 items were too easy, too difficult, or poorly discriminating. These results indicate that a substantial portion of the test requires revision. The study highlights the importance of psychometric evaluation in teacher-made assessments and recommends capacity-building for teachers in test development and validation practices.
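For readers who wish to reproduce this kind of analysis, the sketch below outlines one possible workflow in R; it assumes dichotomously scored (0/1) responses stored in a data frame named responses with one column per item, and the psych and ltm packages and all object names are illustrative choices, not necessarily those used in the study. Under the 2-PL model, the probability that an examinee with ability θ answers item i correctly is P_i(θ) = 1 / (1 + exp(−a_i(θ − b_i))), where a_i is the item's discrimination index and b_i its difficulty index.

```r
# Minimal sketch of the analysis pipeline, assuming dichotomous (0/1)
# responses in a data frame `responses`; names are illustrative.
library(psych)  # KMO() and cortest.bartlett()
library(ltm)    # rasch(), ltm(), tpm() for 1-PL, 2-PL, 3-PL models

# Sampling adequacy and sphericity checks on the inter-item correlations
KMO(responses)                                         # study reports KMO = 0.686
cortest.bartlett(cor(responses), n = nrow(responses))  # study reports p < .001

# Fit the three logistic IRT models
fit_1pl <- rasch(responses)
fit_2pl <- ltm(responses ~ z1)
fit_3pl <- tpm(responses)

# Likelihood-ratio comparisons between nested models;
# the study retained the 2-PL model
anova(fit_1pl, fit_2pl)
anova(fit_2pl, fit_3pl)

# Item parameters under the 2-PL model:
# difficulty (Dffclt, b) and discrimination (Dscrmn, a)
coef(fit_2pl)

# Item-fit statistics to flag misfitting items
item.fit(fit_2pl)
```

Likelihood-ratio tests via anova() are one common way to choose among nested 1-PL, 2-PL, and 3-PL fits; information criteria such as AIC or BIC are an alternative when the models are not compared as nested pairs.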