Jurnal Ilmiah Matrik
Vol. 27 No. 2 (2025): Jurnal Ilmiah Matrik

Contextualized Word Embedding for Indonesian News Quote Extraction (original title: Contextualized Word Embedding Untuk Ekstraksi Kutipan Berita Indonesia)

Khairina, Syifa
Saffa, Nayara
Lieharyani, Djoko Cahyo Utomo
Hutahaean, Jonner



Article Info

Publish Date
11 Aug 2025

Abstract

This study develops a Named Entity Recognition (NER) model based on Recurrent Neural Networks (RNN) to extract direct quotes from Indonesian news articles. The goal is to enhance Medmon, a system by Kabayan Group that monitors the public image of public figures and brands. The study is limited to Indonesian news articles and does not cover other languages or news sources. Two models are compared: one using static word embeddings from Word2Vec and the other using contextual word embeddings from BERT. The experiments were conducted on the PFSA-ID corpus, which consists of 1,018 Indonesian news articles annotated for direct quotes using the BILOU scheme. Both models were trained and evaluated in Python using libraries such as PyTorch and Hugging Face Transformers. The results show that the BERT model outperforms the Word2Vec model, with an F1-score difference of 14.03 points. The BERT model achieved a highest F1-score of 92.28%, while the Word2Vec model reached only 78.05%. This research contributes to online media monitoring by improving the efficiency and accuracy of direct quote extraction from Indonesian news, offering practical value for media analysts and organizations that rely on automated media analysis.
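To make the BILOU annotation mentioned in the abstract concrete, the following is a minimal Python sketch of how one direct-quote span could be encoded as token-level tags (B = begin, I = inside, L = last, O = outside, U = unit-length span). The label name `QUOTE`, the helper `bilou_tags`, and the example sentence are illustrative assumptions, not taken from the PFSA-ID corpus, and the corpus may treat quotation marks differently.

```python
from typing import List


def bilou_tags(num_tokens: int, start: int, end: int, label: str = "QUOTE") -> List[str]:
    """Tag tokens in [start, end) as one span in BILOU; all other tokens get 'O'.

    The label name "QUOTE" is a hypothetical choice for illustration.
    """
    if not (0 <= start < end <= num_tokens):
        raise ValueError("span must satisfy 0 <= start < end <= num_tokens")
    tags = ["O"] * num_tokens
    if end - start == 1:
        # A single-token span is marked as a Unit.
        tags[start] = f"U-{label}"
    else:
        tags[start] = f"B-{label}"          # first token of the span
        for i in range(start + 1, end - 1):
            tags[i] = f"I-{label}"          # tokens strictly inside the span
        tags[end - 1] = f"L-{label}"        # last token of the span
    return tags


# Invented example sentence; the quote content covers token indices 4 to 6.
tokens = ["Menteri", "mengatakan", ",", '"', "Harga", "akan", "stabil", '"', "."]
tags = bilou_tags(len(tokens), start=4, end=7)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence tagger such as the RNN described above would then be trained to predict one such tag per token, whichever embedding (Word2Vec or BERT) supplies the token representations.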

Copyrights © 2025






Journal Info

Abbrev

jurnalmatrik

Subject

Computer Science & IT

Description

Scientific Journal Accreditation Rankings, Period III of 2022. Decree of the Director General of Higher Education, Research, and Technology, Ministry of Education, Culture, Research, and Technology of the Republic of Indonesia, Number 225/E/KPT/2022, concerning the Scientific Journal Accreditation Rankings for Period III of 2022. Jurnal ...