JURNAL MEDIA INFORMATIKA BUDIDARMA
Vol 8, No 2 (2024): April 2024

Multilabel Classification in Indonesian Translation of Religious Text using Word Centrality Term Weighting

Dewantara, Muhammad Pascal (Unknown)
Lhaksmana, Kemas Muslim (Unknown)



Article Info

Publish Date
30 Apr 2024

Abstract

This research focuses on enhancing understanding of the Quran, using an Indonesian translation dataset, by employing word centrality as a term weighting that feeds into a classifier model. The primary goal is to compare the Hamming Loss scores of the TF-IDF and TW-IDF feature extraction methods in the Indonesian translation case study. TF-IDF, which is commonly used in prior research, yields a higher Hamming Loss (that is, worse performance) than TW-IDF with centrality measurement, specifically degree and closeness centrality. This research adds eigenvector centrality as a new point of comparison with the other methods. We used SVM, Random Forest (bagging), and AdaBoost (boosting) as the classifier models, with Mutual Information as the feature selection method. Hamming Loss is used to evaluate the classifiers because it is suitable for multilabel classification. Results indicate that using centrality measurements for term weighting offers a significant improvement over regular TF-IDF. Each centrality method gives the best Hamming Loss score in one of the classifier models: degree centrality achieves 0.1275 with SVM, closeness centrality achieves 0.1367 with AdaBoost, and eigenvector centrality achieves 0.1204 with Random Forest. Eigenvector centrality can still be a strong measurement method for lowering the Hamming Loss score. Random Forest and AdaBoost perform significantly better than SVM.
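The abstract does not spell out how the term weights or the evaluation metric are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes an undirected co-occurrence graph built per document with a sliding window, uses normalized degree centrality as the term weight (TW), multiplies it by a smoothed IDF, and computes Hamming Loss as the fraction of mismatched label positions. The function names, the `window` parameter, and the IDF smoothing are illustrative assumptions.

```python
import math
from collections import defaultdict


def degree_centrality_weights(tokens, window=2):
    """Return normalized degree centrality per term for one document.

    Assumption: two terms are linked if one appears within `window`
    positions after the other (undirected, unweighted edges).
    """
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        neighbors[term]  # register the term as a node even if isolated
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    n = len(neighbors)
    if n <= 1:
        return {term: 0.0 for term in neighbors}
    return {term: len(adj) / (n - 1) for term, adj in neighbors.items()}


def inverse_document_frequency(docs):
    """Smoothed IDF, log((N + 1) / df); the smoothing is an assumption."""
    df = defaultdict(int)
    for tokens in docs:
        for term in set(tokens):
            df[term] += 1
    n_docs = len(docs)
    return {term: math.log((n_docs + 1) / count) for term, count in df.items()}


def tw_idf(docs, window=2):
    """Return one {term: TW * IDF} mapping per document."""
    idf = inverse_document_frequency(docs)
    vectors = []
    for tokens in docs:
        tw = degree_centrality_weights(tokens, window)
        vectors.append({term: weight * idf[term] for term, weight in tw.items()})
    return vectors


def hamming_loss(y_true, y_pred):
    """Fraction of label positions where prediction and truth differ."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total
```

As a usage example, `tw_idf([["a", "b", "c"], ["a", "b"]])` returns one weight dictionary per document, and `hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 1, 1], [0, 1, 1]])` counts two wrong labels out of six. Closeness or eigenvector centrality could replace the degree-based weight in `degree_centrality_weights` without changing the rest of the sketch.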

Copyrights © 2024






Journal Info

Abbrev

mib

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering

Description

Decision Support System, Expert System, Informatics technique, Information System, Cryptography, Networking, Security, Computer Science, Image Processing, Artificial Intelligence, Steganography etc (related to informatics and computer ...