Location
Kab. Sleman,
Daerah Istimewa Yogyakarta
INDONESIA
REiD (Research and Evaluation in Education)
EISSN : 24606995
Core Subject : Education
Articles 173 Documents
Developing and analyzing items of a physics conceptual understanding test on wave topics for high school students using the Rasch Model
Rasyid, Fauziah; Istiyono, Edi; Gunawan, Cahya Widya; Kijambu, John Baptist
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.75575

Abstract

This study aims to develop, validate, and analyze test items for assessing the understanding of mechanical wave concepts among high school students. The test development process followed the Mardapi instrument development model, which includes: (1) constructing test specifications, (2) writing test items, (3) reviewing test items, (4) piloting the test, and (5) analyzing the items. The developed instrument consists of 12 multiple-choice items, covering three aspects of conceptual understanding: translation, interpretation, and interpolation. Content validity was assessed by three validators, and the results were analyzed using the Aiken V method. The instrument was then administered to 257 high school students in South Sulawesi Province. The results were analyzed using Item Response Theory (IRT) with the Rasch model through the Quest program. Item analysis included item fit estimation, reliability, and item difficulty. The content validity test results indicate that the instrument is valid. All items fit the Rasch model, with a reliability coefficient of 0.95, categorized as high reliability. Item difficulty analysis revealed that 8.3% of items were categorized as easy, 8.3% as difficult, and 83.3% as moderate. Overall, the results indicate that the test instrument is of good quality and can be used to assess high school students’ understanding of mechanical wave concepts.
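The content validity step can be illustrated with a minimal sketch of Aiken's V for a single item. The abstract does not report the rating scale or the individual ratings, so the 1 to 5 scale and the three example ratings here are assumptions chosen only to show the arithmetic.

```python
def aiken_v(ratings, lowest, highest):
    """Aiken's V for one item: sum(r - lowest) / (n * (highest - lowest)).

    ratings: one rating per validator.
    lowest, highest: end points of the rating scale (assumed, not from the study).
    """
    n = len(ratings)
    if n == 0 or highest <= lowest:
        raise ValueError("need at least one rating and highest > lowest")
    total = sum(r - lowest for r in ratings)
    return total / (n * (highest - lowest))


# Hypothetical ratings from three validators on an assumed 1-5 scale.
example_ratings = [4, 5, 4]
print(round(aiken_v(example_ratings, lowest=1, highest=5), 3))
```

V ranges from 0 (every validator gives the lowest rating) to 1 (every validator gives the highest rating); the cut-off for calling an item valid depends on the number of raters and scale points.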
Exploring problem-solving competence in Indonesian language learning: An EFA study using ecological image stimuli
Sudaryanto, Memet; Riyadi, Slamet; Hariyadi, Bagus Reza
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.85466

Abstract

Problem-solving has become a core competency that university students must possess, exercised in particular through appropriate and accurate use of the Indonesian language. This study aims to construct a theoretical framework of problem-solving abilities by analyzing the composition of opinion texts in Indonesian language learning. The research draws on Polya’s theoretical approach, integrated with recent studies, and uses a quantitative methodology based on Exploratory Factor Analysis (EFA) to examine the validity of the theoretical construct of problem-solving skills in the context of writing opinion texts. The problem-solving theory derived from opinion-based learning was developed to produce a valid measurement instrument. The study began with the development of indicators drawn from various studies on problem-solving competencies. The resulting instrument consists of 19 items administered to students from both science and social studies tracks. A total of 298 first-semester students from Central Java participated. The reliability estimate yielded a standardized alpha of 0.71. The findings are as follows: (1) sample adequacy was confirmed by a KMO-MSA value of 0.71 (above the 0.5 threshold) and a significance level of 0.001 on Bartlett’s test; (2) all items were found to measure problem-solving skills, as indicated by anti-image correlation values > 0.5; and (3) four dimensions of problem-solving skills were identified from the opinion text analysis: initial problem identification, problem resolution, taking tangible action, and evaluation of implemented solutions (solution reflection).
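As a rough illustration of what a standardized alpha of 0.71 on 19 items implies, the sketch below uses the usual standardized-alpha formula, alpha = k * r / (1 + (k - 1) * r), where r is the mean inter-item correlation. The abstract does not report r; the value computed here is only what the formula implies, and the function names are hypothetical.

```python
def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha from item count k and mean inter-item correlation."""
    if k < 2:
        raise ValueError("need at least two items")
    return k * mean_r / (1 + (k - 1) * mean_r)


def mean_correlation_from_alpha(k, alpha):
    """Invert the formula: the mean inter-item correlation implied by a given alpha."""
    if k < 2:
        raise ValueError("need at least two items")
    return alpha / (k - alpha * (k - 1))


# 19 items and alpha = 0.71 are taken from the abstract; r is derived, not reported.
implied_r = mean_correlation_from_alpha(19, 0.71)
print(round(implied_r, 3))  # roughly 0.11
```

A modest mean inter-item correlation can still give an acceptable alpha when the test has many items, which is why alpha is usually read alongside the factor structure.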
Rasch model analysis of essay questions to measure literacy and numeracy skills in plant and animal bioprocess topics based on AKM
Wardani, Rianti Tri; Nuraeni, Eni; Diana, Sariwulan
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.85614

Abstract

Reading literacy and numeracy skills in Indonesia are classified as low, so the government has introduced new policies, one of which is the use of questions based on the Minimum Competency Assessment (AKM) in the National Assessment. Observations at several high schools in Pekanbaru, Riau, found that AKM questions had not yet been applied to biology learning. This research aimed to produce AKM-based reading literacy and numeracy instruments on high school plant and animal bioprocess materials using a Research and Development (R&D) design; the subjects were grade XII students at three high schools in Pekanbaru city. Data were collected with the test instruments that had been developed and analyzed using Rasch modelling with Winsteps software, including Wright map analysis, person capability analysis, item capability analysis, scalogram analysis, and question item analysis. The results show that most students have a logit score below 0.0, indicating that reading literacy, numeracy skills, and conceptual understanding are still low. Four out of 63 students have inappropriate answer patterns, indicating that these students did not work on the questions seriously. The analysis also shows 14 fit questions, a Cronbach alpha value of 0.84 (very high), a person reliability value of 0.77 (sufficient), and an item reliability value of 0.92 (very high). The difficulty level of the question items is in line with the rules of test instrument development, since item difficulty is spread across very difficult, hard, medium, easy, and very easy. Thus, it is concluded that the test instrument can be used to measure AKM-based reading literacy and numeracy skills on plant and animal bioprocess materials.
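To make the logit scale concrete, here is a minimal sketch of the dichotomous Rasch model, in which the probability of success depends only on the difference between person ability and item difficulty, both in logits. The study analyzed essay questions, which would normally call for a polytomous extension, so the dichotomous form and the example values are simplifying assumptions.

```python
import math


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(correct) = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# A person at 0.0 logits facing an item of average difficulty (0.0 logits).
print(rasch_probability(0.0, 0.0))   # 0.5
# A hypothetical person at -1.0 logits facing the same item.
print(round(rasch_probability(-1.0, 0.0), 3))
```

Under this reading, a person measure below 0.0 logits corresponds to a less than even chance of success on an item of average difficulty.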
Differential item functioning analysis of Arabic language exams across gender, study specialization, and geographic region in senior high schools
Bakti, Anugrah Arya; Marzuki, Marzuki; Ibrahim, Zulfa Safina; Tuanaya, Rugaya; binti Yacob, Nur Yusra
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.85961

Abstract

This study aims to examine the fairness of Arabic language assessment instruments used in Muhammadiyah senior high schools by detecting the presence of Differential Item Functioning (DIF) in the Final Semester Summative Test (UAS) for 12th-grade students in the Special Region of Yogyakarta during the 2023/2024 academic year. Using a descriptive quantitative design, the research analyzed student response data from 1,157 participants across 25 schools. Data collection was conducted through documentation of test blueprints, item sheets, answer keys, and student responses. Analysis was performed using the Lord and Generalized Lord methods within the framework of Item Response Theory (IRT), focusing on three demographic variables: gender, study specialization (science vs. social studies), and school region (Yogyakarta City, Sleman, Bantul, and Kulon Progo). The Rasch model was identified as the best model owing to its superior fit and fulfillment of key psychometric assumptions, including unidimensionality and parameter invariance. The findings indicate that several items exhibit significant DIF across all examined variables. Eleven items showed gender-based DIF, with a higher number favoring male students. Twenty-three items demonstrated DIF by study specialization, and thirty-seven items displayed DIF based on school region, with students from Yogyakarta City benefiting the most. These results suggest that the test is not fully equitable and highlight the need for item revision to ensure fairness. The study contributes theoretically to the field of educational measurement and practically to the development of fairer evaluation practices in Islamic and language education settings.
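For a two-group comparison under the Rasch model, Lord's chi-square reduces to the squared difference in item difficulty divided by the sum of the squared standard errors, with one degree of freedom. The sketch below shows that single-parameter case only; the difficulty estimates and standard errors are hypothetical, the two calibrations are assumed to be on a common scale already, and the generalized (multi-group) version used for school region is not covered.

```python
# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRIT_DF1_05 = 3.841


def lord_chi_square_rasch(b_ref, se_ref, b_focal, se_focal):
    """Lord's chi-square for one Rasch item difficulty compared across two groups."""
    return (b_ref - b_focal) ** 2 / (se_ref ** 2 + se_focal ** 2)


def flags_dif(b_ref, se_ref, b_focal, se_focal, critical=CHI2_CRIT_DF1_05):
    """True when the statistic exceeds the chosen critical value."""
    return lord_chi_square_rasch(b_ref, se_ref, b_focal, se_focal) > critical


# Hypothetical estimates for one item in a reference and a focal group.
stat = lord_chi_square_rasch(b_ref=0.20, se_ref=0.10, b_focal=0.60, se_focal=0.12)
print(round(stat, 2), flags_dif(0.20, 0.10, 0.60, 0.12))
```

An item flagged this way is harder for one group than for the other at the same ability level, which is the sense in which it may favor a group.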
Enhancing the five-tier diagnostic test on cell concepts through Rasch model analysis
Silaban, Oky Rizkiana; Kusnadi, Kusnadi; Wulan, Ana Ratna
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.87212

Abstract

Misconceptions in biology can prevent students from gaining a deeper understanding of biological concepts. A five-tier diagnostic test can be used to explore the misconceptions students hold. This research aims to develop a five-tier diagnostic test that is feasible for identifying student misconceptions, using Rasch model analysis. The research method was descriptive quantitative, with a sample of 103 participants selected by purposive sampling: the school chosen as the research location was one where students experience misconceptions about the concept of cells. The results show that the five-tier diagnostic test developed is very feasible as an instrument for identifying students' misconceptions about the concept of cells. Each indicator is represented by several items that were tested for validity, reliability, difficulty level, and discrimination using Rasch model analysis with the Winsteps program. Of the 36 items that were externally validated, 23 met the eligibility criteria and were declared valid for implementation.
Enhancing active citizenship: Developing assessment tools for digital literacy and critical thinking in Indonesian Civics Education
Tirza, Juliana; Tambunan, Aripin; Parwati, Ni Nyoman; Cendana, Wiputra
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.83859

Abstract

This study aims to develop instruments to measure digital literacy and critical thinking skills within the Civics Education course, addressing the challenge of assessing these essential competencies in the context of modern education. As digital literacy and critical thinking become increasingly crucial for active citizenship, there is a lack of comprehensive and reliable tools to evaluate these skills effectively. The research employs a development method using the 4D model, consisting of four phases: Define, Design, Develop, and Disseminate. In the Define phase, the competencies to be assessed were clearly identified. In the Design phase, the instruments were crafted based on specific indicators of digital literacy and critical thinking. The Develop phase involved testing the reliability and validity of the instruments, while the Disseminate phase prepared the instruments for broader use. The critical thinking instrument was found to have excellent internal consistency, with a Cronbach’s Alpha of 0.908. However, certain items exhibited low item-total correlations, indicating that revisions were necessary. This study contributes to filling the gap in Civics Education by providing a reliable and valid tool for evaluating digital literacy and critical thinking, ultimately supporting the enhancement of students' competencies in these crucial areas for active and informed citizenship.
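The internal consistency figure reported above is a Cronbach's alpha. A minimal sketch of how alpha is computed from raw item scores follows; the response matrix is hypothetical and far smaller than any real data set, and sample variances are used throughout.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha: k / (k - 1) * (1 - sum(item variances) / variance(total scores)).

    rows: one list of item scores per respondent, all of equal length.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three Likert-type items.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha summarizes the scale as a whole, so a high value can coexist with individual items whose item-total correlations are low, as the abstract notes.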
Critical thinking in math: 10th-grade analysis using cognitive diagnostic modeling
Gunawan, Muhammad Ali; Amalia, Fitri; Setiawan, Ari; Ab Ghani, Hawa Husna
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.88074

Abstract

Critical thinking is widely recognized as an essential competency in mathematics education, yet assessments often fail to capture its multidimensional nature. This study applied a Bayesian Cognitive Diagnostic Modeling (G-DINA) approach to identify the mastery profiles of tenth-grade students in Indonesia across four attributes: interpretation, analysis, evaluation, and inference. Data from 60 students revealed that most learners demonstrated partial rather than full mastery, with consistent challenges in evaluative reasoning and inference. These diagnostic profiles provide actionable insights for teachers, enabling more targeted instructional strategies that go beyond total test scores. The findings highlight the potential of Bayesian CDMs to enhance classroom assessment by offering fine-grained evidence of students’ reasoning patterns. This study contributes novelty by being among the first to implement Bayesian cognitive diagnosis in mathematics education within the Indonesian context, bridging methodological innovation with practical implications for teaching and assessment.
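With four binary attributes there are 2^4 = 16 possible mastery profiles. The sketch below enumerates them and shows the item response rule of the DINA model, a constrained special case of G-DINA, rather than the Bayesian G-DINA estimation the study reports. The Q-vector, slip, and guess values are hypothetical.

```python
from itertools import product

ATTRIBUTES = ("interpretation", "analysis", "evaluation", "inference")


def all_profiles(n_attributes):
    """Every mastery profile as a tuple of 0/1 flags, one flag per attribute."""
    return list(product((0, 1), repeat=n_attributes))


def dina_probability(profile, q_vector, slip, guess):
    """DINA rule: 1 - slip if every required attribute is mastered, else guess."""
    has_all_required = all(a >= q for a, q in zip(profile, q_vector))
    return 1 - slip if has_all_required else guess


profiles = all_profiles(len(ATTRIBUTES))
print(len(profiles))  # 16

# Hypothetical item requiring interpretation and analysis only.
q_vector = (1, 1, 0, 0)
print(dina_probability((1, 1, 0, 0), q_vector, slip=0.1, guess=0.2))  # 0.9
print(dina_probability((1, 0, 0, 0), q_vector, slip=0.1, guess=0.2))  # 0.2
```

G-DINA relaxes this all-or-nothing rule by letting each combination of required attributes have its own success probability, which is what allows partial mastery to show up in the diagnostic profiles.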
A comparison of the stability of ability parameter estimation based on the maximum likelihood and Bayesian estimation: A case study of dichotomous scoring test results
Putri, Faradila Ilena; Retnawati, Heri; Kardanova, Elena
REID (Research and Evaluation in Education) Vol. 11 No. 1 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i1.89463

Abstract

This research applies Item Response Theory (IRT) to determine the best method for estimating participants' abilities on a test of English listening ability. The study aims to (1) determine the characteristics of the test instrument measuring English listening ability, (2) determine the effect of test length on the stability of ability estimation using the maximum likelihood (ML) method, (3) determine the effect of test length on the stability of ability estimation using the Bayes method, and (4) compare the stability of the ability estimates between ML and Bayes. This is an exploratory descriptive study using a simulation approach. The best-fitting model is selected to generate data. The generated data consist of the actual abilities (θ) and the participants' responses; abilities are then estimated from the responses with the maximum likelihood and Bayes methods over 10 replications, and the estimates are compared with the actual abilities by calculating the mean squared error (MSE). The method with the smaller MSE is considered more stable and therefore the better estimation method. The results show that (1) the 2PL model is the best, (2) test length affects the stability of ability estimation with the ML method, which is most stable when the test contains 46 items, (3) test length affects the stability of ability estimation with the Bayes method, which is also most stable when the test contains 46 items, and (4) the Bayes method is better and more accurate for estimating ability.
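The two building blocks of such a simulation are the 2PL response function used to generate data and the MSE used to compare estimates with the generating abilities. The sketch below shows both; the ability values and the two sets of estimates are hypothetical and are labeled method A and method B rather than ML and Bayes, since the abstract reports no numeric results.

```python
import math


def prob_2pl(theta, a, b):
    """2PL model: P(correct) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def mean_squared_error(true_values, estimates):
    """Average squared difference between generating and estimated abilities."""
    if len(true_values) != len(estimates) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - e) ** 2 for t, e in zip(true_values, estimates)) / len(true_values)


# Hypothetical generating abilities and two sets of estimates.
true_theta = [-1.0, 0.0, 1.0]
estimates_method_a = [-0.8, 0.1, 1.3]
estimates_method_b = [-0.9, 0.0, 1.1]

mse_a = mean_squared_error(true_theta, estimates_method_a)
mse_b = mean_squared_error(true_theta, estimates_method_b)
print(mse_b < mse_a)  # True: method B recovers these abilities more closely
```

In a full study the MSE would be averaged over replications and repeated for each test length, so that stability can be compared across lengths as well as across methods.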
Assessment and measurement bias in madrasa performance evaluation: Evidence from underdeveloped areas in Indonesia
Prihono, Eko Wahyunanto; Latuapo, Ridhwan; Lapele, Fitria; Dwiningrum, Siti Irene Astuti; Arlinwibowo, Janu; Reyes Jr., Margarito Surbito
REID (Research and Evaluation in Education) Vol. 11 No. 2 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i2.71444

Abstract

Inequality is a condition characterized by an unbalanced assessment process. Physical and psychological factors, on the side of both the assessors and those being assessed, may contribute to assessment inequality. The purpose of this research was to highlight the potential inequity in the performance assessment of madrasas in underdeveloped areas. A quantitative research design was employed. The data were collected using a questionnaire that had been shown to be valid and reliable. Path analysis was used to determine both direct and indirect effects. The findings showed that measurement errors related to the instruments used have a direct positive effect on inequality in the performance assessment of madrasas in underdeveloped areas, as well as an indirect effect mediated through teacher quality. One way to reduce the imbalance in assessing the performance of madrasas in underdeveloped areas is through policy at the macro, meso, and micro dimensions.
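In a simple mediation path model, the indirect effect is the product of the two path coefficients that pass through the mediator, and the total effect is the direct effect plus the indirect effect. The sketch below shows that decomposition; the abstract reports no coefficients, so all three values are hypothetical.

```python
def indirect_effect(a_path, b_path):
    """Indirect effect through one mediator: product of the two path coefficients."""
    return a_path * b_path


def total_effect(direct, a_path, b_path):
    """Total effect: direct effect plus the indirect effect through the mediator."""
    return direct + indirect_effect(a_path, b_path)


# Hypothetical standardized coefficients:
# a_path: measurement error -> teacher quality
# b_path: teacher quality -> assessment inequality
# direct: measurement error -> assessment inequality
print(round(indirect_effect(0.5, 0.4), 3))      # 0.2
print(round(total_effect(0.3, 0.5, 0.4), 3))    # 0.5
```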
Enhancing student achievement: Developing a Differentiated Instruction Formative Assessment Model (DIFAM)
Asriadi AM, Muh.; Hadi, Samsul; Istiyono, Edi; Sanam, Anna Isabela; Sultan, Jumriani; Kassymova, Gulzhaina K.
REID (Research and Evaluation in Education) Vol. 11 No. 2 (2025)
Publisher : Graduate School of Universitas Negeri Yogyakarta & Himpunan Evaluasi Pendidikan Indonesia (HEPI)

DOI: 10.21831/reid.v11i2.87869

Abstract

An ideal assessment should offer constructive feedback and insights into students' strengths and weaknesses in learning. This study aims to develop the Differentiated Instruction Formative Assessment Model (DIFAM), an assessment model that integrates formative assessment with differentiated instruction to assess learning achievements proportionally. DIFAM was developed using the ADDIE development framework. The research sample consisted of 99 students from four high schools in Bandung Regency. Student learning profiles were analyzed using N-Gain and a paired-samples t-test, with RStudio and JASP software. The N-Gain analysis revealed an average improvement in student learning achievement of 25% with the implementation of DIFAM. The formative tests conducted over eight sessions showed that students grasped the material more effectively than with conventional teaching methods. Feedback from students and teachers indicated that DIFAM facilitated more structured learning and provided constructive feedback, contributing significantly to enhanced student performance. DIFAM demonstrates its ability to cater to diverse student needs and achieve significant learning improvements, and it has the potential for broader application to ensure more inclusive and equitable learning outcomes.
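N-Gain is commonly computed as the normalized gain, (post - pre) / (max - pre), which expresses improvement as a fraction of the improvement that was still possible. The sketch below assumes that definition and a 100-point scale; the pre-test and post-test scores are hypothetical and chosen only so that the result matches the 25% figure in form, not to reproduce the study's data.

```python
def n_gain(pre, post, max_score=100.0):
    """Normalized gain: (post - pre) / (max_score - pre)."""
    if pre >= max_score:
        raise ValueError("pre-test score must be below the maximum score")
    return (post - pre) / (max_score - pre)


# Hypothetical student: 40 on the pre-test, 55 on the post-test.
print(n_gain(40, 55))  # 0.25
```

Because the denominator shrinks as the pre-test score rises, the same raw improvement counts for more when a student starts closer to the maximum.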