Contact Name
Natris Idriyani
Contact Email
natrisidriyani@uinjkt.ac.id
Phone
-
Journal Mail Official
natrisidriyani@uinjkt.ac.id
Editorial Address
-
Location
Kota Tangerang Selatan,
Banten
INDONESIA
Jurnal Pengukuran Psikologi dan Pendidikan Indonesia (JP3I)
ISSN : 20896247     EISSN : 26545713     DOI : -
Core Subject : Education, Social,
Jurnal Pengukuran Psikologi dan Pendidikan Indonesia (JP3I) is a scientific journal published by the Faculty of Psychology, Universitas Islam Negeri Syarif Hidayatullah Jakarta. The journal aims to facilitate interaction, discussion, and the exchange of ideas among Indonesian psychology scholars. It focuses on psychological measurement.
Arjuna Subject : -
Articles 267 Documents
Investigating Differential Item Functioning (DIF) in Geometry Test Scores: Holistic vs. Analytical Scoring Rubrics Ismail, Raoda; Imawan, Okky Riswandha; Retnawati, Heri; Haryanto
JP3I (Jurnal Pengukuran Psikologi dan Pendidikan Indonesia) Vol. 15 No. 1 (2026): JP3I
Publisher : FAKULTAS PSIKOLOGI UIN SYARIF HIDAYATULLAH JAKARTA

DOI: 10.15408/jp3i.v15i1.40842

Abstract

The use of polytomous data in test instruments enables more detailed assessment of test-takers' abilities, but group differences, such as gender, class, and ethnicity, are often overlooked. Differential Item Functioning (DIF) analysis helps determine whether these group memberships influence test performance. This descriptive quantitative study examines DIF in a geometry test scored with Holistic and Analytical Scoring Rubrics across gender, class, and ethnic groups. The study involved 102 undergraduate students from Cenderawasih University, Papua, who completed a geometry test with 10 descriptive questions. Two scoring rubrics were used: the Holistic Scoring Rubric with three categories and the Analytical Scoring Rubric with five. Results were analyzed with the difR package in R, and an item was flagged for DIF when its p-value was below 0.05. Findings show that some test items exhibit DIF with respect to gender, class, and ethnicity. Under the Holistic Scoring Rubric, DIF was detected in items 1, 6, 7, and 10 for gender; items 1, 4, 5, and 8 for class; and items 1, 2, and 6 for ethnicity. Under the Analytical Scoring Rubric, DIF was detected in item 10 for gender; items 9 and 10 for class; and item 10 for ethnicity. However, not all DIF items are flawed; some assess fundamental skills. The Analytical Rubric demonstrated slightly higher reliability (alpha 0.903) than the Holistic Rubric (alpha 0.804). These insights support the development of more equitable and sustainable assessment practices, ensuring fairness and inclusivity in educational evaluations.
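For readers unfamiliar with the reliability figures quoted above, the following is a minimal sketch of how coefficient alpha is conventionally computed from an item-score matrix. The function name and the toy scores are hypothetical illustrations; they are not the study's data or its analysis code, which the abstract does not describe beyond naming the difR package.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a score matrix.

    scores: list of rows, one row per test-taker, one column per item.
    (Hypothetical helper for illustration only.)
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two test-takers")

    # Sample variance of each item column.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of each test-taker's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): three test-takers, two polytomous items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

With polytomous rubric scores, each column would hold the category score awarded on one item, so the same formula applies to both the three-category and the five-category rubric.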
A Meta-analysis of Coefficient Alpha for the Revised Life Orientation Test (LOT-R) Syahputra, Wahyu; Hayat, Bahrul
JP3I (Jurnal Pengukuran Psikologi dan Pendidikan Indonesia) Vol. 15 No. 1 (2026): JP3I
Publisher : FAKULTAS PSIKOLOGI UIN SYARIF HIDAYATULLAH JAKARTA

DOI: 10.15408/jp3i.v15i1.43272

Abstract

The Revised Life Orientation Test (LOT-R) is one of the most popular instruments for measuring optimism. The authors conducted a meta-analysis, also known as a reliability generalization study, to estimate the population Cronbach's alpha for the LOT-R. This study also examined heterogeneity in alpha and identified factors that might influence it. Using a random-effects model, the estimated population Cronbach's alpha based on 211 studies was 0.768 (95% CI: 0.757, 0.778), indicating a good level of reliability. The heterogeneity analysis yielded an I² value of 95.84%, indicating substantial differences among the studies. These differences may stem from factors beyond methodological variations or population samples. To understand the possible causes of this heterogeneity, a meta-regression analysis was performed, focusing on moderator variables including gender, age, study setting, study type, citation, and language. Among these moderator variables, only language showed a significant effect on the variation in Cronbach's alpha. In conclusion, the findings suggest that the LOT-R is a reliable instrument for measuring optimism. However, the study also reveals that linguistic factors may influence its reliability, potentially limiting the generalizability of findings across different language groups and in cross-cultural contexts. Therefore, researchers should consider language-related variations when interpreting LOT-R results. Future studies are encouraged to examine further how linguistic differences may affect the instrument's validity and to explore ways to optimize its use across diverse cultural settings.
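The pooled estimate and I² reported above can be illustrated with a minimal sketch of random-effects pooling. The sketch assumes the DerSimonian-Laird estimator of between-study variance and works on untransformed estimates; the abstract does not state which estimator or alpha transformation the authors used, and the function name and input values below are hypothetical.

```python
import math


def random_effects_pool(estimates, variances):
    """DerSimonian-Laird random-effects pooling (illustrative sketch).

    Returns (pooled estimate, 95% CI, tau^2, I^2 in percent).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need matching lists with at least two studies")

    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # Between-study variance (truncated at zero) and I^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights, pooled mean, and normal-theory 95% CI.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2


# Hypothetical alpha estimates and sampling variances from three studies.
pooled, ci, tau2, i2 = random_effects_pool([0.70, 0.80, 0.75], [0.001, 0.001, 0.001])
```

An I² near 96%, as in the abstract, means that almost all of the observed variation in alpha is attributed to between-study differences rather than sampling error, which is what motivates the follow-up meta-regression on moderators.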
Exploring Gender Bias in the Dual Continua of Mental Health Measurement: Differential Item Functioning Analysis Aziz, Rahmat; Mangestuti, Retno; Alribdi, Nada Ibrahim; Ali, Maqdisi Firdaus
JP3I (Jurnal Pengukuran Psikologi dan Pendidikan Indonesia) Vol. 15 No. 1 (2026): JP3I
Publisher : FAKULTAS PSIKOLOGI UIN SYARIF HIDAYATULLAH JAKARTA

DOI: 10.15408/jp3i.v15i1.45690

Abstract

Gender differences in mental health are widely reported, yet few studies have examined whether commonly used assessment instruments function equivalently across gender at the item level, particularly within the Dual Continua Model of Mental Health. The present study addresses this gap by evaluating gender-related measurement bias using regression-based Differential Item Functioning (DIF) analysis. A total of 1,674 university students from 32 Indonesian institutions completed the Azira Mental Health Scale (AMHS-24), which measures psychological well-being and psychological distress. DIF was assessed by controlling for latent trait levels to determine whether males and females respond differently to items at equivalent levels of psychological functioning. Results indicate that most well-being items exhibited no DIF, suggesting structural stability across gender for positive emotion, social relationship, and life satisfaction domains. In contrast, several distress items demonstrated uniform and nonuniform DIF, with one item showing strong magnitude. These findings suggest that the distress dimension may be more sensitive to gender-related response tendencies than the well-being dimension. By integrating the dual-continua framework with item-level psychometric analysis, this study clarifies whether observed gender differences reflect construct-relevant variance or differential item functioning, thereby contributing to the theoretical refinement and measurement fairness of mental health assessment.
Psychometric Properties of Indonesian Job-related Affective Well-being Scale: Examining Factor Structure Through CFA and ESEM Silvia Kristanti Tri Febriana; Muhammad Abdan Shadiqi; Ahmad Helmi Nugraha; Meydisa Utami Tanau; Rusdi Rusli; Rahmiyati
JP3I (Jurnal Pengukuran Psikologi dan Pendidikan Indonesia) Vol. 15 No. 1 (2026): JP3I
Publisher : FAKULTAS PSIKOLOGI UIN SYARIF HIDAYATULLAH JAKARTA

DOI: 10.15408/jp3i.v15i1.47083

Abstract

The Job-related Affective Well-being Scale (JAWS) is a psychological instrument used to measure emotional conditions in the context of work. However, research on the psychometric properties of the JAWS in the Indonesian context remains limited. This study aims to test the construct validity of the JAWS among 410 Indonesian workers (mean = 34.98; SD = 9.51) using a competing models strategy. This study compares three approaches: Confirmatory Factor Analysis (CFA), Exploratory Structural Equation Modelling (ESEM), and Set-ESEM. The model comparison indicated that the Modified 4-Factor Set-ESEM model was the best, balancing statistical accuracy and parsimony and accommodating natural cross-loadings among items within the same valence. The evaluation of measurement invariance across genders confirmed the instrument's stability at the configural and metric levels, as well as partial scalar invariance. The final Set-ESEM model of the Indonesian JAWS indicates that the instrument is valid, reliable, and fair in measuring affective well-being at work, and supports its use for research and organisational intervention purposes in Indonesia.
Applying the Rasch Model as a Diagnostic Tool for Rating Scale Refinement Asrijanty
JP3I (Jurnal Pengukuran Psikologi dan Pendidikan Indonesia) Vol. 15 No. 1 (2026): JP3I
Publisher : FAKULTAS PSIKOLOGI UIN SYARIF HIDAYATULLAH JAKARTA

DOI: 10.15408/jp3i.v15i1.50616

Abstract

The Rasch model has been widely used in educational, psychological, and health research to evaluate the measurement quality of instruments. In many applications, however, Rasch analysis is primarily reported to support validation or confirm the adequacy of a scale. Although diagnostic analyses may be conducted during instrument development, their role in informing substantive instrument refinement is less explicitly documented and therefore less visible in the literature. This study aims to demonstrate how the Rasch model can be applied as a diagnostic tool to support the refinement of rating scales. Using empirical data from an attitude scale, the study illustrates how detailed Rasch outputs—such as item fit, response category functioning, and threshold ordering—can be interpreted to identify specific sources of measurement problems. These insights provide a basis for targeted revisions, demonstrating how Rasch analysis can contribute not only to validation but also to iterative instrument refinement. This study contributes to the methodological literature by highlighting a more comprehensive use of the Rasch model that integrates validation and diagnostic purposes. It also provides practical guidance for researchers, particularly those less familiar with Rasch analysis, on how to use model outputs to improve measurement instruments, especially rating scales.
A Rasch Analysis of The Indonesian Version of Proactive Personality Scale Prihatsanti, Unika; Prabowo, Dito Aryo; Prasetyo, Anggun Resdasari; Dewi, Kartika Sari
JP3I (Jurnal Pengukuran Psikologi dan Pendidikan Indonesia) Vol. 15 No. 1 (2026): JP3I
Publisher : FAKULTAS PSIKOLOGI UIN SYARIF HIDAYATULLAH JAKARTA

DOI: 10.15408/jp3i.v15i1.51040

Abstract

Proactive personality reflects an individual’s tendency to initiate change and influence the environment, particularly in organizational contexts. This study aimed to examine the psychometric properties of the Indonesian version of the Proactive Personality Scale (PPS) using Rasch measurement modeling. A cross-sectional survey design was employed involving 307 employees from various organizations in Indonesia (40.4% male, 59.6% female; mean age = 32.46, SD = 8.38), recruited through convenience sampling. The instrument was translated and culturally adapted in accordance with International Test Commission (ITC) guidelines. Data were analyzed using the Rasch Rating Scale Model with Winsteps, applying Joint Maximum Likelihood Estimation to evaluate dimensionality, item fit, reliability, and rating scale functioning. The Rasch model explained 44.1% of the variance, supporting the unidimensionality assumption. The results suggest adequate person separation and stable item calibration, and indicate good internal consistency (Cronbach’s α = 0.81), acceptable person reliability (0.77), and excellent item reliability (0.98). However, rating scale analysis revealed suboptimal functioning in lower response categories, indicating that collapsing categories may improve measurement precision. Overall, the Indonesian PPS demonstrates acceptable psychometric properties for research and organizational assessment purposes. Nevertheless, refinement of the response format and further validation using more diverse samples are recommended to enhance measurement precision and generalizability.
Adaptation and Validation of the Indonesian Version of Behavioral Activation for Depression Scale – Short Form (BADS-SF) Using Confirmatory Factor Analysis and Rasch Modeling Salma, Salma; Handoyo, Restu Tri; Puspitasari, Ajeng J.; Hidayat, Rahmat; Leonard, Rachel; Kanter, Jonathan
JP3I (Jurnal Pengukuran Psikologi dan Pendidikan Indonesia) Vol. 15 No. 1 (2026): JP3I
Publisher : FAKULTAS PSIKOLOGI UIN SYARIF HIDAYATULLAH JAKARTA

DOI: 10.15408/jp3i.v15i1.51044

Abstract

The Behavioral Activation for Depression Scale – Short Form (BADS-SF) is a brief instrument designed to measure activation levels in individuals undergoing behavioral activation (BA) therapy for depression. With the broad potential for implementing BA therapy, including in Indonesia, the BADS-SF needs to be cross-culturally adapted. While the original validation study supported a two-factor model, subsequent adaptations, particularly in non-Western settings, have yielded inconsistent results. In addition, given the practical application of the BADS-SF total score as an overall indicator of behavioral activation, it is essential to evaluate its psychometric properties as a unidimensional measure. The objectives of this study are: (1) to culturally adapt the BADS-SF into the Indonesian language (Bahasa Indonesia); (2) to examine its underlying factor structure and convergent validity; and (3) to evaluate its psychometric properties using item response theory (IRT), specifically Rasch modeling. The cultural adaptation process demonstrated good content validity based on Aiken’s V (.75 to 1 for all items) and confirmed the scale's readability. The confirmatory factor analysis (CFA) results revealed a different factor structure from the original, with a 6-item one-factor model providing the best fit and confirming convergent validity. Rasch analysis further showed that the Indonesian version of the BADS-SF had good psychometric properties, including excellent item reliability, acceptable person reliability, and supported unidimensionality following the exclusion of a misfitting item (Item 8). Based on the overall results, the 6-item Bahasa Indonesia version of the BADS-SF is recommended for both research and clinical practice. Further implications of the findings are discussed.
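As a brief illustration of the content validity index mentioned above, Aiken's V for a single item is V = Σ(r − lo) / (n(c − 1)), where r is each rater's score, lo is the lowest rating category, c is the number of categories, and n is the number of raters. The sketch below assumes a 1 to 4 rating scale and made-up ratings; the abstract reports neither the scale nor the number of raters, so these values are hypothetical.

```python
def aikens_v(ratings, lowest, highest):
    """Aiken's V for one item, given expert ratings on a lowest..highest scale.

    (Hypothetical helper; the rating scale is an assumption, not from the study.)
    """
    n = len(ratings)
    if n == 0 or highest <= lowest:
        raise ValueError("need at least one rating and a valid scale")
    if any(r < lowest or r > highest for r in ratings):
        raise ValueError("rating outside the scale")

    # Sum of each rating's distance from the lowest category,
    # divided by the maximum possible sum n * (c - 1).
    return sum(r - lowest for r in ratings) / (n * (highest - lowest))


# Hypothetical ratings from four experts on a 1-4 relevance scale.
v_item = aikens_v([4, 3, 3, 3], lowest=1, highest=4)
```

V ranges from 0 to 1, with 1 meaning every rater gave the highest category, so the reported range of .75 to 1 means each item's ratings reached at least three quarters of the maximum possible agreement on relevance.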