This study develops a Named Entity Recognition (NER) model based on a Recurrent Neural Network (RNN) to extract direct quotes from Indonesian news articles, with a focus on enhancing Medmon, the system Kabayan Group uses to monitor the public image of public figures and brands. The study is limited to Indonesian news articles and does not cover other languages or news sources. Two models are compared: one using static Word2Vec word embeddings and the other using contextual BERT word embeddings. The experiments were conducted on the PFSA-ID corpus, which consists of 1,018 Indonesian news articles annotated for direct quotes using the BILOU scheme. Both models were trained and evaluated in Python with libraries such as PyTorch and Hugging Face Transformers. The results show that the BERT model outperforms the Word2Vec model by about 14 percentage points in F1-score: the BERT model achieved a best F1-score of 92.28%, while the Word2Vec model reached only 78.05%. This research contributes to the field of online media monitoring by improving the efficiency and accuracy of direct quote extraction in Indonesian news, offering practical value for media analysts and organizations that rely on automated media analysis.
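To illustrate the BILOU annotation scheme mentioned above, the following is a minimal sketch of how a direct-quote span could be converted into per-token tags (Begin, Inside, Last, Outside, Unit). The function name `bilou_tags`, the label `QUOTE`, the example sentence, and the choice to exclude the quotation marks from the span are all assumptions made for illustration; they are not taken from the PFSA-ID corpus or its annotation guidelines.

```python
def bilou_tags(num_tokens, spans):
    """Convert token-index spans into BILOU tags.

    spans: list of (start, end, label) with `end` exclusive.
    Hypothetical helper for illustration only.
    """
    tags = ["O"] * num_tokens  # every token starts Outside any entity
    for start, end, label in spans:
        if end - start == 1:
            # single-token entity: Unit tag
            tags[start] = f"U-{label}"
        else:
            tags[start] = f"B-{label}"        # Begin
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"        # Inside
            tags[end - 1] = f"L-{label}"      # Last
    return tags


# Invented example sentence: 'Budi berkata, "Kami sudah siap".'
tokens = ["Budi", "berkata", ",", '"', "Kami", "sudah", "siap", '"', "."]
# Assumed quote span: tokens 4..6 ("Kami sudah siap"), label name is hypothetical
tags = bilou_tags(len(tokens), [(4, 7, "QUOTE")])

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence tagger such as the RNN described here would then be trained to predict one of these tags per token, and quote spans would be recovered by reading off each B...L or U sequence.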
Copyrights © 2025