MSAPersonality: a modern standard Arabic dataset for personality recognition
Chraibi, Khaoula; Chaker, Ilham; Dhassi, Younes; Zahi, Azeddine
International Journal of Electrical and Computer Engineering (IJECE) Vol 14, No 4: August 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v14i4.pp4498-4507

Abstract

Automatic personality recognition is a task that attempts to infer personality traits automatically from a variety of data sources, including text. Our words, whether spoken or written, reveal a lot about who we are. Because people speak different languages, each with its own characteristics and level of complexity, identifying personality automatically might be language-dependent. The task requires a text corpus annotated with personality traits. However, the lack of corpora for languages other than English makes the task extremely challenging. In this paper, we concentrated on Arabic in particular because it is understudied and lacks such a corpus, despite being one of the most widely spoken languages in the world. Our primary goal was to construct the “MSAPersonality” dataset, which consists of 267 texts in Modern Standard Arabic annotated with the Big Five personality traits. To evaluate the dataset and its potential for classification and regression, we used text preprocessing techniques, feature extraction, and machine learning algorithms. The experimental results were promising, indicating that further research into predicting personality from Arabic text can be conducted.

A study of feature extraction for Arabic calligraphy characters recognition
Zoizou, Abdelhay; Errebiai, Chaimae; Zarghili, Arsalane; Chaker, Ilham
International Journal of Electrical and Computer Engineering (IJECE) Vol 14, No 1: February 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v14i1.pp870-877

Abstract

Optical character recognition (OCR) is one of the most widely used pattern recognition systems. However, research on ancient Arabic writing recognition has suffered from a lack of interest for decades, despite the availability of thousands of historical documents. One reason is the absence of a standard dataset, which is fundamental for building and evaluating an OCR system. In 2022, we published a database of ancient Arabic words as the only public dataset of characters written in Al-Mojawhar Moroccan calligraphy. Such a database needs to be studied and evaluated. In this paper, we explored the proposed database and investigated the recognition of Al-Mojawhar Arabic characters. We studied feature extraction using the most popular descriptors in Arabic OCR. The descriptors were paired with different machine learning classifiers to build recognition models and verify their performance. To compare learned and handcrafted features on the proposed dataset, we proposed a deep convolutional neural network for character recognition. Given the complexity of the character shapes, the results were very promising, especially with the convolutional neural network model, which gave the highest accuracy score.

Predicting personality traits from Arabic text: an investigation of textual and demographic features with feature selection analysis
Chraibi, Khaoula; Chaker, Ilham; Zahi, Azeddine
International Journal of Electrical and Computer Engineering (IJECE) Vol 15, No 1: February 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v15i1.pp970-979

Abstract

Automatic personality recognition (APR) uses machine learning to predict personality traits from various data sources. This study aims to predict the Big Five personality traits from Modern Standard Arabic (MSA) texts, using both textual and demographic features. The “MSAPersonality” dataset is used to conduct a comprehensive analysis of features and feature selection methods and to evaluate their impact on APR model performance. We compared feature selection algorithms from the filter, wrapper, and embedded categories through a systematic experimental design consisting of feature engineering, feature selection, and regression. The study showed that each trait was predicted more accurately using a distinct set of features. However, age and study level were the features most often shared across the five traits. Moreover, although there were no statistically significant differences in performance between the feature selection techniques, embedded methods offered the best compromise between performance, time, and interpretability. These findings contribute to the understanding of APR in general and among Arabic speakers.