Multi-Label Topic Classification on the Qur'an using the K-Nearest Neighbor and Latent Semantic Analysis Methods
Shabrina, Ghina Annisa; Lhaksmana, Kemas Muslim
Jurnal Indonesia Sosial Teknologi, Vol. 5 No. 12 (2024)
Publisher: Publikasi Indonesia

DOI: 10.59141/jist.v5i12.1340

Abstract

The Qur'an, comprising over 80,000 words, 6,236 verses, and 114 surahs, is a multifaceted and deeply significant text whose interpretation demands a nuanced understanding of historical context, classical Arabic, and exegesis. Various methods have been used to analyze and classify its content, including K-Nearest Neighbor (KNN) and Latent Semantic Analysis (LSA). This research investigates the effectiveness of combining KNN with LSA for multi-label topic classification of Qur'anic verses. KNN alone achieved a micro-average F1-score of 0.49, performing most reliably on topics such as "aqidah" (creed) and "worldly matters." Applying LSA with 100 components reduced performance: the micro-average F1-score dropped to 0.43 and Hamming loss increased to 0.1657. As the number of LSA components increased to 200 and 300, performance improved, with micro-average F1-scores rising to 0.45 and 0.47 and Hamming loss values decreasing to 0.1507 and 0.1466, respectively. This indicates that LSA combined with KNN performs better with a higher number of components, although within the reported settings its best micro-average F1-score (0.47) remained below that of KNN alone (0.49).
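
The two metrics reported in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the verses, the predictions, and the third label ("worship") are hypothetical toy data, and the functions simply follow the standard definitions of micro-average F1 (pooled over all verse-label pairs) and Hamming loss (fraction of verse-label pairs predicted incorrectly).

```python
from typing import List, Set


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Micro-average F1: pool TP, FP, FN over every verse and label."""
    tp = fp = fn = 0
    for true, pred in zip(y_true, y_pred):
        tp += len(true & pred)   # labels correctly predicted
        fp += len(pred - true)   # labels predicted but not true
        fn += len(true - pred)   # true labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hamming_loss(y_true: List[Set[str]], y_pred: List[Set[str]],
                 n_labels: int) -> float:
    """Fraction of (verse, label) pairs whose prediction is wrong."""
    if not y_true or n_labels == 0:
        return 0.0
    wrong = sum(len(true ^ pred) for true, pred in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)


# Hypothetical toy data: four verses, three possible topic labels.
y_true = [{"aqidah"}, {"aqidah", "worldly"}, {"worship"}, {"worldly"}]
y_pred = [{"aqidah"}, {"aqidah"}, {"worldly"}, {"worldly", "worship"}]

f1 = micro_f1(y_true, y_pred)
hl = hamming_loss(y_true, y_pred, n_labels=3)
print(f"micro F1 = {f1:.2f}, Hamming loss = {hl:.4f}")
```

A higher micro-average F1 and a lower Hamming loss both indicate better multi-label predictions, which is why the abstract reports the two moving in opposite directions as the number of LSA components grows.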