Advances in artificial intelligence and machine learning have created new opportunities to improve Quran memorization methods. Such innovation requires structured, representative datasets designed specifically for Quran memorizers. This study presents the design and development of a dataset that captures demographic characteristics, daily learning behavior, and temporal memorization patterns of 350 students from three Islamic boarding schools in Indonesia. The dataset was preprocessed through normalization, discretization, and feature engineering to prepare it for Hidden Markov Model (HMM)-based analysis. In experiments, the HMM predicted memorization states with an accuracy of 30.57%, a precision of 71.46%, a recall of 30.57%, and an F1-score of 37.70%. These findings indicate that the proposed dataset provides a useful foundation for modeling memorization progress and supporting adaptive learning path recommendations. However, the study is limited by the relatively small dataset and the modest predictive performance of the initial HMM. Overall, this work is a first step toward an intelligent, personalized, data-driven Quran memorization system.
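The reported accuracy and recall are identical (30.57%) while precision is much higher. This pattern is consistent with support-weighted averaging over the memorization states, because weighted recall always equals accuracy. The abstract does not state the averaging scheme, so this is an assumption. The sketch below illustrates the identity; the state names, the labels, and the function `weighted_metrics` are hypothetical and are not taken from the study's dataset or code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true count per state
    predicted = Counter(y_pred)    # predicted count per state
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # share of samples in this state
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Hypothetical, imbalanced memorization states (illustration only).
y_true = ["weak"] * 6 + ["stable"] * 3 + ["strong"] * 1
y_pred = ["weak", "weak", "stable", "strong", "strong", "strong",
          "stable", "strong", "strong",
          "strong"]

accuracy, precision, recall, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

In this toy example the model over-predicts one state, so weighted precision stays well above accuracy while weighted recall matches accuracy. That is the same qualitative pattern as in the reported figures.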
Copyright © 2026