Automatic Classification of Artificial Intelligence Generated Question Difficulty Levels: Klasifikasi Otomatis Tingkat Kesulitan Soal Hasil Kecerdasan Buatan
Najahah, Vina; Pujianto, Utomo
Indonesian Journal of Innovation Studies Vol. 27 No. 1 (2026): January
Publisher : Universitas Muhammadiyah Sidoarjo

DOI: 10.21070/ijins.v27i1.1880

Abstract

General Background: Determining question difficulty is a fundamental requirement in educational assessment to support valid evaluation and systematic question curation.
Specific Background: The increasing use of artificial intelligence for automatic question generation produces large volumes of linguistically diverse items, making manual difficulty labeling time-consuming and subjective.
Knowledge Gap: Despite extensive research on text-based difficulty prediction, lightweight and reproducible pipelines for multi-level difficulty classification of AI-generated questions remain limited.
Aims: This study aims to develop and evaluate an automatic classification pipeline for three difficulty levels of AI-generated multiple-choice questions using TF-IDF text representation and a Random Forest classifier.
Results: The proposed pipeline achieved a test accuracy of 70.98%, exceeding the random guessing baseline, with the highest F1-score observed in the easy class (78.45%) and the lowest in the medium class (65.32%), indicating greater ambiguity in intermediate difficulty questions.
Novelty: This study presents a reproducible and interpretable classification workflow specifically applied to expert-labeled AI-generated questions with high inter-rater reliability.
Implications: The findings support the use of lexical feature-based classification as an initial pre-curation and difficulty filtering tool in AI-assisted educational assessment systems.

Highlights
• The classification pipeline distinguishes three difficulty levels using only textual features
• Medium difficulty questions exhibit the highest classification ambiguity
• Lexical patterns contribute consistently to difficulty level separation

Keywords: Question Difficulty Classification; AI Generated Questions; TF-IDF Representation; Random Forest Classifier; Educational Assessment
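
The workflow named in the abstract (TF-IDF text features fed to a Random Forest, evaluated by accuracy and per-class F1 over three difficulty levels) can be sketched roughly as below. This is a minimal illustration using scikit-learn, not the authors' implementation: the library choice, the hyperparameters (`ngram_range`, `n_estimators`, `random_state`), the helper names `build_pipeline` and `evaluate`, and the toy questions are all assumptions made for the example.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.pipeline import Pipeline

# The three difficulty levels mentioned in the abstract.
LABELS = ["easy", "medium", "hard"]


def build_pipeline(seed=42):
    """TF-IDF features followed by a Random Forest (hyperparameters are assumed)."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=seed)),
    ])


def evaluate(model, texts, labels):
    """Return overall accuracy and one F1 score per difficulty level."""
    predicted = model.predict(texts)
    per_class = f1_score(labels, predicted, labels=LABELS,
                         average=None, zero_division=0)
    return {
        "accuracy": float(accuracy_score(labels, predicted)),
        "f1": {name: float(score) for name, score in zip(LABELS, per_class)},
    }


# Invented toy questions, only to make the sketch runnable.
train_texts = [
    "What is the capital of France?",
    "Which planet is known as the Red Planet?",
    "Which factor best explains the decline in enzyme activity at high temperature?",
    "Which statement correctly compares mitosis and meiosis?",
    "Given the reaction data, which mechanism is most consistent with the observed rate law?",
    "Which inference follows from combining both experimental results and the stated constraints?",
]
train_labels = ["easy", "easy", "medium", "medium", "hard", "hard"]

test_texts = [
    "What is the largest ocean on Earth?",
    "Which statement correctly compares arteries and veins?",
    "Given the rate data, which mechanism is most consistent with the results?",
]
test_labels = ["easy", "medium", "hard"]

model = build_pipeline().fit(train_texts, train_labels)
report = evaluate(model, test_texts, test_labels)
print(report)
```

With a toy set this small the printed scores carry no meaning and do not reproduce the figures reported in the abstract; the sketch only shows how the vectorizer, classifier, and per-class metrics fit together.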