Automatic personality recognition (APR) utilizes machine learning to predict personality traits from various data sources. This study aims to predict the big five personality traits from modern standard Arabic (MSA) texts, using both textual and demographic features. The “MSAPersonality” dataset is employed to conduct a comprehensive analysis of features and feature selection methods to evaluate their impact on APR model performance. We compared feature selection algorithms from the filter, wrapper, and embedded-based categories through a systematic experimental design that consisted of feature engineering, feature selection, and regression. This study showed that each trait was more accurately predicted using a distinct set of features. However, age and study level were the most common features among the five traits. Moreover, although there were no statistically significant differences in performance between the feature selection techniques, embedded-based methods offered the best compromise between performance, time, and interpretability. These findings contribute to the understanding of APR in general and among Arabic speakers.
Copyrights © 2025