BERT-based models for classifying multi-dialect Arabic texts Fouadi, Hassan; El Moubtahij, Hicham; Lamtougui, Hicham; Yahyaouy, Ali
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 3: September 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i3.pp3437-3446

Abstract

Natural language processing (NLP) is a rapidly developing field characterized by innovation and research. Despite this progress, several dialects of Arabic (DA) are classified as low-resource languages, making it challenging for NLP systems to process DA data. One approach to address this issue is to train NLP models on social media datasets containing DA texts. The open-access social media datasets outlined in our paper can therefore serve as a valuable resource for developers and researchers involved in the processing of DA. To create our multilingual corpus, we gathered data from various datasets containing different versions of DA. These datasets are used for three text classification tasks: sentiment classification, topic classification, and dialect identification. Our study contributes to the automated classification of Arabic dialects. We investigate and assess various machine learning and deep learning techniques, with a specific focus on the BERT model. Our experiments on these datasets show that DarijaBERT and DziriBERT, which were trained on a similar DA, outperform traditional machine learning methods and earlier, more general pre-trained models that were trained on multiple dialects or languages.
Improving Arabic handwritten text recognition through transfer learning with convolutional neural network-based models Lamtougui, Hicham; El Moubtahij, Hicham; Fouadi, Hassan; Satori, Khalid
Bulletin of Electrical Engineering and Informatics Vol 13, No 6: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i6.8178

Abstract

Arabic handwritten text recognition is a complex and challenging research domain. This study proposes an offline Arabic handwritten word recognition system based on transfer learning. The system exploits four pre-trained convolutional neural network (CNN) architectures, namely VGG16, ResNet50, AlexNet, and InceptionV3. In addition, a specialized image recognition model derived from the ImageNet dataset is incorporated. The proposed strategy combines transfer learning with specific fine-tuning techniques to improve recognition accuracy. The study is conducted on the IFN/ENIT dataset, which includes images of Tunisian city and village names. The results show that the proposed system achieves a recognition accuracy of 94.73%, which is significantly higher than the accuracy rates achieved by previous approaches. These results suggest that the proposed system is a promising approach for Arabic handwritten text recognition.