Studies in English Language and Education
Vol 11, No 3 (2024)

Computational linguistics and natural language processing techniques for semantic field extraction in Arabic online news

Ahmad, Maulana Ihsan
Anwari, Moh. Kanif



Article Info

Publish Date
30 Sep 2024

Abstract

The research aimed to extract semantic fields from Arabic online news and to advance Natural Language Processing (NLP) applications for understanding and managing news information effectively. It provided a comprehensive approach to processing and analyzing large volumes of Arabic news data by integrating semantic field analysis, NLP, and computational linguistics. Using quantitative methods, the researchers collected Arabic news articles, processed them with Python, a popular programming language in data analysis, and applied various NLP techniques and machine learning models to extract semantic fields accurately. The primary objective was to evaluate the effectiveness of different classification models in categorizing Arabic news and to identify the most suitable model for semantic field extraction. The research evaluated five classification models: Naive Bayes, Support Vector Machine (SVM), Logistic Regression, Random Forest, and Gradient Boosting. Among these, SVM achieved the highest overall accuracy, at 90%. Specifically, SVM demonstrated exceptional performance in categorizing sports-related news, with a 99% probability and an F1-Score of 98%. However, it faced challenges in categorizing health and science news, where it achieved a lower F1-Score of 79%. Overall, the study demonstrated the effectiveness of computational methods, particularly SVM, in classifying Arabic news and extracting semantic fields, thereby advancing NLP and computational linguistics. The findings highlighted the potential of SVM for accurate news analysis and the need for further enhancement of NLP techniques to address multilingual and domain-specific challenges.
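The abstract reports both an overall accuracy and per-category F1-Scores, and the two can diverge: a classifier may score well overall while doing noticeably worse on one category. The following is a minimal Python sketch of how such figures are commonly computed from true and predicted labels. It is not the authors' code; the function name `per_class_scores` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return overall accuracy and a {label: (precision, recall, f1)} dict."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = Counter()  # true positives per label
    fp = Counter()  # false positives per label
    fn = Counter()  # false negatives per label
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this label, but it was wrong
            fn[truth] += 1  # missed the correct label

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return accuracy, scores


# Hypothetical toy labels for illustration only, not results from the study.
y_true = ["sports", "sports", "sports", "health", "health", "politics"]
y_pred = ["sports", "sports", "sports", "health", "politics", "politics"]

accuracy, scores = per_class_scores(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for label, (precision, recall, f1) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy example the sports category is classified perfectly while one health article is mislabeled, so the health F1 falls below the overall accuracy. The abstract describes the same kind of gap between categories. In practice, a library routine such as scikit-learn's `classification_report` produces the same per-class figures.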

Copyrights © 2024






Journal Info

Abbrev

SiELE

Publisher

Subject

Education Language, Linguistic, Communication & Media

Description

Studies in English Language and Education (SiELE) is a peer-reviewed academic journal published by the Department of English Education, Faculty of Teacher Training and Education, Universitas Syiah Kuala, Banda Aceh, Indonesia. The journal presents research and development in the field of teaching ...