Pre-Trained Transformer-Based Approach for Arabic Question Answering: A Comparative Study
Jamal, Amani; Alsubhi, Kholoud
Journal of Applied Science, Engineering, Technology, and Education Vol. 7 No. 1 (2025)
Publisher : PT Mattawang Mediatama Solution

DOI: 10.35877/454RI.asci4209

Abstract

Question answering (QA) is one of the most challenging yet widely investigated problems in Natural Language Processing (NLP). QA systems try to produce answers to given questions. These answers can be generated from unstructured or structured text. Hence, QA is considered an important research area that can be used in evaluating text understanding systems. A large volume of QA studies has been devoted to the English language, investigating the most advanced techniques and achieving state-of-the-art results. However, Arabic question answering has progressed at a considerably slower pace because of limited research effort and the lack of large benchmark datasets. Recently, many pre-trained language models have achieved high performance on many Arabic NLP problems. In this work, we evaluate state-of-the-art pre-trained transformer models for Arabic QA using four reading comprehension datasets: Arabic-SQuAD, ARCD, AQAD, and TyDiQA-GoldP. We fine-tuned and compared the performance of the AraBERTv2-base, AraBERTv0.2-large, and AraELECTRA models. Finally, we provide an analysis to understand and interpret the low performance obtained by some models.

Developing a Bilingual English-Arabic Dataset for Textbook Question Answering: A Hybrid Translation and Validation Approach
Jamal, Amani
Journal of Applied Science, Engineering, Technology, and Education Vol. 8 No. 1 (2026)
Publisher : PT Mattawang Mediatama Solution

DOI: 10.35877/454RI.asci4628

Abstract

Textbook Question Answering (TQA) has been a central feature of educational artificial intelligence, enabling curriculum-aligned machine reading to support personalized learning and diagnostic testing. While English-language TQA datasets have advanced significantly, Arabic still lags because of a lack of sufficient high-quality domain-specific resources. This paper presents a new bilingual English-Arabic TQA dataset, created using a hybrid translation and validation method. It combines machine translation of the CK12-QA dataset with the Google Sheets translator. Semantic consistency was evaluated using automated metrics based on multilingual sentence embeddings and translation quality scores. A cosine similarity of 0.87 and a BLEU score of 38.5 confirmed strong semantic equivalence and translation reliability across the bilingual dataset. These results demonstrate robust linguistic alignment and completeness. Compared with previous efforts that relied on machine translation or mini-batch annotation alone, this approach balances the competing demands of scalability and accuracy while addressing long-standing issues of semantic drift, morphological variation, and in-context misalignment in Arabic education datasets. The resulting dataset has a parallel structure of English-Arabic question-answer pairs that facilitates cross-lingual research in multiple-choice and textbook settings. By focusing on the K-12 science curriculum in specific subject areas, this contribution can enable improved training and testing of models for monolingual and cross-lingual educational QA applications. This not only makes AI-based learning more inclusive for Arabic-speaking students but also provides impetus for cross-lingual transfer learning and benchmarking in TQA. The sources and information are openly published to increase reproducibility, enable verifiable peer collaboration, and further promote the development of AI in multilingual education.
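
As an illustration of the embedding-based consistency check mentioned in the abstract, the following is a minimal sketch of how cosine similarity between paired sentence embeddings might be averaged over a parallel dataset. The function names (`cosine_similarity`, `mean_pair_similarity`) and the toy vectors are assumptions made for this example only; the abstract does not say which embedding model or tooling the authors used, and real embeddings would come from a multilingual sentence encoder rather than hand-written lists.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_pair_similarity(pairs):
    """Average cosine similarity over (english_vec, arabic_vec) pairs."""
    if not pairs:
        raise ValueError("need at least one pair")
    scores = [cosine_similarity(en, ar) for en, ar in pairs]
    return sum(scores) / len(scores)


# Toy vectors standing in for sentence embeddings (hypothetical data,
# not taken from the paper).
toy_pairs = [
    ([1.0, 0.0, 0.0], [2.0, 0.0, 0.0]),  # same direction
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # orthogonal
]

print(mean_pair_similarity(toy_pairs))
```

A dataset-level score such as the reported 0.87 would correspond to an average of this kind taken over all English-Arabic pairs, assuming the authors aggregated by a simple mean.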