J-KOMA : Jurnal Ilmu Komputer dan Aplikasi
Vol 7 No 2 (2024): J-KOMA : Jurnal Ilmu Komputer dan Aplikasi

Application of Information Retrieval in News Document Search Using Syntax and Semantic Orientation

Kurniawati, Anggi
Pradani, Winangsari



Article Info

Publish Date
20 Dec 2024

Abstract

This study presents an information retrieval system for news document search that combines syntactic and semantic approaches. A Word2Vec model with the skip-gram architecture captures semantic relationships between words and transforms news articles into vector representations. Semantic similarity is measured with Word Mover’s Distance (WMD) and Cosine Similarity, while the syntax-based method uses regular expressions for keyword matching. The dataset comprises 2,813 news articles from Liputan6.com and Tempo.co, collected between 25 and 31 August 2019 and containing 25,951 unique words. Preprocessing consists of case folding, filtering, tokenization, stopword removal, and stemming to improve data quality. The system was evaluated on six user queries using Precision@k and Mean Average Precision (MAP). Word2Vec with Cosine Similarity achieved the highest MAP of 76.86%, outperforming WMD (75.65%) and regular expressions (72.06%), which indicates that semantic-based retrieval is effective for news documents. Future work should consider larger datasets and more advanced models such as Doc2Vec to improve retrieval accuracy and contextual understanding.
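The ranking and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two-dimensional document vectors, the document IDs, and the relevance judgments are made-up toy values standing in for Word2Vec-derived vectors, and average precision is normalized by the total number of relevant documents, which is one common convention that the abstract does not confirm.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def average_precision(ranked_ids, relevant_ids):
    """Sum of Precision@k at each rank holding a relevant document,
    divided by the total number of relevant documents (assumed convention)."""
    if not relevant_ids:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: in the study these would be vectors built from Word2Vec.
doc_vectors = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
query_vector = [1.0, 0.0]
relevant = {"d1", "d2"}  # made-up relevance judgments for this query

# Rank documents by descending cosine similarity to the query.
ranking = sorted(
    doc_vectors,
    key=lambda doc_id: cosine_similarity(query_vector, doc_vectors[doc_id]),
    reverse=True,
)

print(ranking)
print(precision_at_k(ranking, relevant, 2))
print(average_precision(ranking, relevant))
```

To obtain a MAP figure comparable in form to those reported, `mean_average_precision` would be called with one `(ranking, relevant)` pair per user query, six in the study's setup.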

Copyright © 2024






Journal Info

Abbrev

jkoma

Subject

Computer Science & IT

Description

J-KOMA is an open access journal with a core focus on two areas: general computer science and information technology. All copyrights are retained by the respective authors, while the journal holds the publishing rights. The journal currently has E-ISSN 2620-4827, published by LIPI, which made it a national ...