Journal: Sistemasi: Jurnal Sistem Informasi

Information Retrieval Method for the Qur’an based on FastText and Latent Semantic Indexing
Ramadhan, Aziz; Utomo, Fandy Setyo
SISTEMASI Vol 14, No 3 (2025): Sistemasi: Jurnal Sistem Informasi
Publisher : Program Studi Sistem Informasi Fakultas Teknik dan Ilmu Komputer

DOI: 10.32520/stmsi.v14i3.4446

Abstract

Retrieving contextually relevant verses from a dataset of Al-Qur'an translations is challenging because of the linguistic richness and semantic variation of the text. This study aims to improve the accuracy and relevance of information retrieval on such a dataset by combining Latent Semantic Indexing (LSI) with FastText word embeddings. The proposed method consists of text preprocessing (lowercasing, punctuation removal, stopword elimination, and stemming), tokenization and vocabulary creation, Bag-of-Words (BoW) representation, construction of the LSI model, conversion to FastText vectors, and combination of the LSI and FastText vectors. A similarity index built from the combined vectors is then used to process user queries and rank documents by cosine similarity. Testing on the dataset, which consists of 6,236 translated verses from 114 surahs, showed promising results: the combined approach captures both broader semantic structure and detailed word meanings, giving more accurate and contextually relevant search results. Key findings include high similarity scores, with 90% of retrieved verses being highly relevant to the user query, accuracy improved to 85%, and enhanced handling of synonyms and morphological variations at 88%. Recommended further development includes parameter optimization, advanced preprocessing techniques, real-time search optimization, integration of contextual embeddings, and multilingual support to improve search performance and accuracy.
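
The final two steps of the method, combining the LSI and FastText vectors and ranking by cosine similarity, can be illustrated with a minimal sketch. The abstract does not state how the two vectors are combined, so concatenating the unit-normalized vectors is an assumption made here for illustration. The function names, the verse identifiers, and the toy vector values are all hypothetical; in practice the vectors would come from a trained LSI model and a FastText model.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def combine(lsi_vec, fasttext_vec):
    """Concatenate the two unit-normalized vectors.

    Assumption: concatenation is only one plausible combination rule;
    the source does not specify which rule the authors used.
    """
    return l2_normalize(lsi_vec) + l2_normalize(fasttext_vec)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by descending cosine similarity."""
    scores = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors: 2 LSI topic weights and 3 embedding dimensions.
docs = {
    "verse_a": combine([0.9, 0.1], [0.8, 0.2, 0.0]),
    "verse_b": combine([0.1, 0.9], [0.0, 0.3, 0.9]),
}
query = combine([0.8, 0.2], [0.7, 0.3, 0.1])

results = rank(query, docs)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Normalizing each part before concatenation keeps either representation from dominating the cosine score purely because of its scale; a weighted combination would be an equally reasonable alternative.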