Enhancing Multi-Label News Text Classification for an Understudied Language: A Comprehensive Study on CNN Performance and Pre-Trained Word Embeddings
Rundasa, Diriba Gichile; Ramu, Arulmurugan
International Journal of Engineering, Science and Information Technology Vol 5, No 4 (2025)
Publisher : Malikussaleh University, Aceh, Indonesia

DOI: 10.52088/ijesty.v5i4.987

Abstract

News texts today are often classified with a multi-label scheme, which allows a potentially large number of labels to be assigned to a single instance. Most earlier studies considered only mutually exclusive categories at a single level; the primary goal of this study, by contrast, was to categorise news material using multiple labels. Large numbers of text documents are now created from a variety of offline and internet sources. The resulting news text is unorganised, so timely access to the needed content is challenging. Compared with traditional text classification, multi-label classification is more difficult because of its multi-dimensional label space. Convolutional neural networks (CNNs) are used in this study's experiments on Afaan Oromo multi-label news text classification because they integrate pre-trained word embeddings easily. With pre-trained word embeddings and a train-test ratio of 10/90, the proposed model showed improved performance. The proposed CNN models may be helpful for labelling Afaan Oromo news articles. Many researchers developing Afaan Oromo classifiers aim to use various learning algorithms to boost classification accuracy as the number of categories or labels increases, and they have applied basic machine learning methods in various ways to address the computation time issue. However, prior work on low-resource languages has focused on flat, hierarchical, and multi-class classification, whereas we created a model for multi-label text classification and applied it using a deep learning algorithm. A dataset of over 5640 Afaan Oromo news items spanning eight main news categories is analysed experimentally. Python served as our experimental platform for both text classification and word embedding.
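The difference between multi-label and multi-class classification described above can be illustrated with a minimal sketch. It assumes a multi-hot target vector and an independent sigmoid decision per category, which is a common setup for multi-label models; the abstract does not state that the authors used exactly this formulation. The category names, the example logits, and the 0.5 threshold are all hypothetical, since the abstract mentions eight categories without listing them.

```python
import math

# Hypothetical category names: the abstract says there are eight main
# news categories but does not list them.
CATEGORIES = ["politics", "sport", "business", "health",
              "education", "technology", "culture", "agriculture"]


def multi_hot(labels, categories=CATEGORIES):
    """Encode a set of labels as a 0/1 vector with one slot per category."""
    label_set = set(labels)
    unknown = label_set - set(categories)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in label_set else 0 for c in categories]


def sigmoid(x):
    """Logistic function mapping a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, categories=CATEGORIES, threshold=0.5):
    """Score each category independently and keep those at or above threshold.

    Unlike a softmax over classes, several categories (or none) can be
    selected for the same article.
    """
    return [c for c, z in zip(categories, logits) if sigmoid(z) >= threshold]


# One article tagged with two categories at once.
target = multi_hot(["sport", "health"])

# Made-up raw scores, standing in for the output layer of a classifier.
predicted = predict_labels([-2.0, 3.1, -0.4, 1.2, -3.0, -1.5, -0.2, -2.2])

print(target)     # [0, 1, 0, 1, 0, 0, 0, 0]
print(predicted)  # ['sport', 'health']
```

Because every category gets its own yes/no decision, the number of labels per article is not fixed in advance, which is what distinguishes this setting from the multi-class case.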
After the model is fully implemented, the best precision, recall, F1 score, and accuracy obtained with pre-trained word embeddings at a train-test ratio of 10/90 are 89.7%, 88.6%, 93.3%, and 96.5%, respectively.
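As a rough illustration of how the four reported metrics can be computed for multi-label predictions, the sketch below uses micro-averaging for precision, recall, and F1, and per-label accuracy (the fraction of individual label decisions that are correct). These averaging choices are assumptions: the abstract does not say which variant the authors used. The toy label matrices are invented for the example and are unrelated to the reported figures.

```python
def micro_metrics(y_true, y_pred):
    """Micro-averaged precision, recall, F1, and per-label accuracy.

    y_true and y_pred are lists of equal-length 0/1 rows, one row per
    document and one column per label.
    """
    tp = fp = fn = tn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1   # label present and predicted
            elif p:
                fp += 1   # label predicted but absent
            elif t:
                fn += 1   # label present but missed
            else:
                tn += 1   # label absent and not predicted

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy data: three documents, three labels (invented for illustration).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 1]]

scores = micro_metrics(y_true, y_pred)
print(scores)
```

On this toy data there are 4 true positives, 1 false positive, 1 false negative, and 3 true negatives, so precision, recall, and F1 are each 0.8 and per-label accuracy is 7/9. Other conventions, such as macro-averaging or exact-match (subset) accuracy, would generally give different numbers for the same predictions.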