A. Adeleke
Universiti Tun Hussein Onn Malaysia

Published: 2 Documents
Articles


A two-step feature selection method for quranic text classification
A. Adeleke; N. A. Samsudin; Z. A. Othman; S. K. Ahmad Khalid
Indonesian Journal of Electrical Engineering and Computer Science Vol 16, No 2: November 2019
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v16.i2.pp730-736

Abstract

Feature selection (FS) is an integral phase in text classification problems. It is primarily applied when preprocessing text data prior to labeling. However, existing FS techniques have limitations: filter-based techniques tend to yield lower accuracy, while wrapper-based techniques are computationally expensive. In this paper, a two-step FS method is presented. In the first step, a chi-square (CH) filter-based technique is used to reduce the dimensionality of the feature set; in the second step, a wrapper correlation-based (CFS) technique is employed to select the most relevant features from the reduced feature set. The aim is to reduce computational runtime while achieving high classification accuracy. The proposed method was then applied to labeling instances of the input data (Quranic verses) using standard classifiers: naïve Bayes (NB), support vector machine (SVM), and decision trees (J48). The results show that the proposed method achieved an accuracy of 93.6% in 4.17 s.
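
The two-step idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy binary term-presence matrix, and the greedy forward search are all assumptions made for illustration. Step 1 ranks features by a chi-square statistic and keeps the top `top_k`; step 2 searches the reduced set using the commonly cited CFS merit heuristic, which rewards correlation with the class and penalizes correlation between features.

```python
import math

def chi2_score(column, labels):
    """Chi-square statistic between a binary feature column and the class labels."""
    n = len(labels)
    score = 0.0
    for value in (0, 1):
        for cls in set(labels):
            observed = sum(1 for x, y in zip(column, labels) if x == value and y == cls)
            expected = column.count(value) * labels.count(cls) / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
    return score

def abs_pearson(a, b):
    """Absolute Pearson correlation; returns 0.0 for a constant vector."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return abs(cov / math.sqrt(var_a * var_b))

def cfs_merit(subset, columns, labels):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    r_cf = sum(abs_pearson(columns[f], labels) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(f, g) for i, f in enumerate(subset) for g in subset[i + 1:]]
    r_ff = sum(abs_pearson(columns[f], columns[g]) for f, g in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def two_step_select(columns, labels, top_k):
    # Step 1: chi-square filter keeps the top_k highest-scoring features.
    ranked = sorted(columns, key=lambda f: chi2_score(columns[f], labels), reverse=True)
    candidates = ranked[:top_k]
    # Step 2: greedy forward search; add a feature only if it raises the merit.
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for f in candidates:
            if f in selected:
                continue
            merit = cfs_merit(selected + [f], columns, labels)
            if merit > best + 1e-12:
                best, choice, improved = merit, f, True
        if improved:
            selected.append(choice)
    return candidates, selected

# Hypothetical toy data: 6 documents, binary class, 4 binary term features.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "term_a": [1, 1, 1, 0, 0, 0],  # perfectly predictive
    "term_b": [1, 1, 1, 0, 0, 0],  # redundant copy of term_a
    "term_c": [1, 1, 0, 0, 0, 1],  # weakly predictive
    "term_d": [1, 0, 0, 1, 0, 0],  # independent of the class
}
candidates, selected = two_step_select(columns, labels, top_k=3)
print(candidates, selected)
```

On this toy input the filter discards `term_d`, and the second step keeps only `term_a`, since adding the redundant `term_b` or the weak `term_c` does not increase the merit. The second step only evaluates the features that survive the filter, which is where the runtime saving of a two-step scheme would come from.
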

Automating quranic verses labeling using machine learning approach
A. Adeleke; N. Samsudin; A. Mustapha; S. Ahmad Khalid
Indonesian Journal of Electrical Engineering and Computer Science Vol 16, No 2: November 2019
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v16.i2.pp925-931

Abstract

Classification of Quranic verses into predefined categories is an essential task in Quranic studies. In recent times, with advances in information technology and machine learning, several classification algorithms have been developed for text classification tasks. Automated text classification (ATC) is a well-known technique in machine learning: it is the task of developing models that can be trained to automatically assign each text instance a known label from a predefined set. In this paper, four conventional ML classifiers, support vector machine (SVM), naïve Bayes (NB), decision trees (J48), and nearest neighbor (k-NN), are used to classify selected Quranic verses into three predefined class labels: faith (iman), worship (ibadah), and etiquettes (akhlak). The Quranic data comprises verses from chapter two (al-Baqara) of the holy scripture. In the results, the classifiers achieved above 80% accuracy, with the naïve Bayes (NB) algorithm recording the highest overall scores of 93.9% accuracy and 0.964 AUC.
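
To make the labeling task concrete, here is a minimal sketch of a multinomial naïve Bayes text classifier with add-one (Laplace) smoothing, assigning one of the three class labels named in the abstract. It is an illustration only: the token lists are invented placeholders rather than text from the dataset, the helper names are hypothetical, and the paper's actual preprocessing and tooling are not reproduced here.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Count documents per class and token occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in class_counts.items():
        logp = math.log(n_docs / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            logp += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Invented placeholder tokens (NOT real dataset content), two documents per class.
train_docs = [
    ["believe", "unseen"], ["believe", "angels"],
    ["prayer", "fasting"], ["prayer", "charity"],
    ["kindness", "parents"], ["kindness", "speech"],
]
train_labels = ["iman", "iman", "ibadah", "ibadah", "akhlak", "akhlak"]

model = train_nb(train_docs, train_labels)
predictions = [predict_nb(model, doc) for doc in train_docs]
accuracy = sum(p == y for p, y in zip(predictions, train_labels)) / len(train_labels)
print(predictions, accuracy)
```

Accuracy here is measured on the training documents purely to show the mechanics; a realistic evaluation would use held-out data or cross-validation.
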