The development of multiple-choice test items is often carried out without an item analysis stage, resulting in instruments that measure students’ reasoning abilities suboptimally. This study aims to examine the feasibility and quality of multiple-choice statistical test items for measuring reasoning skills effectively. The research employed a descriptive method with a quantitative approach. The subjects consisted of content experts, learning evaluation experts, language experts, individual trial participants, small group trial participants, and 20 students who completed 12 test items. Data were collected through tests, observations, interviews, and questionnaires, and were analyzed using Microsoft Excel 2010. Instrument development followed the ADDIE model and incorporated analyses of item difficulty, discrimination power, distractor effectiveness, validity, and reliability. Validation results indicated very high feasibility ratings from the content experts (96%), learning evaluation experts (96%), and language experts (94%). The individual and small group trials also demonstrated very high feasibility, scoring 81% and 82% respectively. Item analysis revealed that all items had moderate difficulty, good to very good discrimination power, effective distractors, high to very high validity, and a reliability coefficient of 0.902. The study concludes that the developed multiple-choice statistical test items are highly feasible for measuring students’ reasoning skills and can serve as a reference for developing similar instruments in education.
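For reference, the item statistics reported above are conventionally computed with the classical test theory formulas below; the abstract does not state the exact formulas used, so this is a sketch of the standard definitions rather than the study’s own derivations.

Difficulty index: $P = \frac{B}{JS}$, where $B$ is the number of examinees answering the item correctly and $JS$ is the total number of examinees; values near 0.5 indicate moderate difficulty.

Discrimination power: $D = \frac{B_A}{J_A} - \frac{B_B}{J_B}$, the difference between the proportions correct in the upper group ($A$) and lower group ($B$) of examinees.

Item validity (point-biserial correlation): $r_{pbis} = \frac{M_p - M_t}{S_t}\sqrt{\frac{p}{q}}$, where $M_p$ is the mean total score of examinees who answered the item correctly, $M_t$ the overall mean, $S_t$ the standard deviation of total scores, $p$ the difficulty index, and $q = 1 - p$.

Reliability (KR-20, one common choice for dichotomous items): $r_{11} = \frac{k}{k-1}\left(1 - \frac{\sum p_i q_i}{S_t^2}\right)$, where $k$ is the number of items; coefficients above 0.80, such as the reported 0.902, are conventionally interpreted as very high. A distractor is usually judged effective when at least about 5% of examinees select it.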