The quality of measurement in educational research depends heavily on the validity and reliability of the instruments used. In practice, however, many numeracy literacy and problem-solving instruments have not undergone adequate psychometric testing, particularly those that use cultural contexts as question stimuli. This study aims to develop and evaluate instruments for numeracy literacy and problem-solving on the topic of sequences and series, grounded in cultural context and local wisdom, using a research and development approach and Rasch modeling. Content validity was assessed through review by learning evaluation experts, mathematics education lecturers, and cultural experts (traditional councils). The reviews were analyzed using the average validator score, Aiken's V coefficient, and percentage agreement to judge the appropriateness of the substance, the clarity of the language, the appearance of the instrument, and the integration of cultural context. A small-group empirical trial was conducted with 71 purposively selected grade X high school students, aged 15-17 years, from three regions in Indonesia: Palu City, Donggala Regency, and Sigi Regency. The test data were analyzed using Rasch modeling to evaluate item fit, difficulty level, reliability, separation index, unidimensionality, and potential bias through differential item functioning (DIF) analysis. The results showed that the developed instrument had very high content validity and strong person and item reliability, and that most items fit the Rasch model. However, the distribution of item difficulty did not fully cover the extremes of the respondents' ability range, and indications of DIF were found for some items in one area, indicating the need for further refinement.
Thus, based on initial findings at the development stage, the developed numeracy literacy and problem-solving instrument was deemed suitable; however, it still requires further revision and testing before it can be used widely in a more diverse population.

Keywords: numeracy literacy, problem-solving, Rasch model, reliability, validity.
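As an illustration of two quantities named in the abstract, the following minimal sketch computes Aiken's V for a single item and the response probability under the dichotomous Rasch model. It is not the authors' analysis code: the abstract does not state the rating scale used by the validators or which Rasch variant was fitted, so the 1-5 scale, the example ratings, and the function names `aikens_v` and `rasch_probability` are assumptions made here for illustration only.

```python
import math


def aikens_v(ratings, lo, hi):
    """Aiken's V for one item: sum(r - lo) / (n * (hi - lo)).

    ratings: validator scores for the item
    lo, hi:  lowest and highest category of the rating scale (assumed)
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if any(r < lo or r > hi for r in ratings):
        raise ValueError("every rating must lie within [lo, hi]")
    n = len(ratings)
    return sum(r - lo for r in ratings) / (n * (hi - lo))


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(correct) for person ability theta
    and item difficulty b, both expressed in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical ratings from three validators on an assumed 1-5 scale.
example_ratings = [5, 4, 5]
v = aikens_v(example_ratings, lo=1, hi=5)

# A person whose ability equals the item difficulty has an even chance.
p = rasch_probability(theta=0.0, b=0.0)

print(f"Aiken's V = {v:.3f}, Rasch P(correct) = {p:.2f}")
```

Aiken's V ranges from 0 to 1, with values near 1 indicating that validators rated the item near the top of the scale. In the Rasch model, the probability of a correct response depends only on the difference between person ability and item difficulty, which is why an item set whose difficulties do not span the ability range measures the most and least able respondents less precisely.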
Copyrights © 2026