Journal : Jurnal Ilmiah Matrik

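Contextualized Word Embedding for Quote Extraction from Indonesian News (original title: Contextualized Word Embedding Untuk Ekstraksi Kutipan Berita Indonesia)
Authors : Khairina, Syifa; Saffa, Nayara; Lieharyani, Djoko Cahyo Utomo; Hutahaean, Jonner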
Jurnal Ilmiah Matrik Vol. 27 No. 2 (2025): Jurnal Ilmiah Matrik
Publisher : Direktorat Riset dan Pengabdian Pada Masyarakat (DRPM) Universitas Bina Darma

DOI: 10.33557/2ayqqa48

Abstract

This study develops a Named Entity Recognition (NER) model based on Recurrent Neural Networks (RNN) to extract direct quotes from Indonesian news articles. The goal is to enhance Medmon, a system by Kabayan Group that is used to monitor the public image of public figures and brands. The study is limited to Indonesian news articles and does not cover other languages or news sources. Two models are compared: one using static Word2Vec word embeddings and one using contextual BERT word embeddings. The experiments use the PFSA-ID corpus, which consists of 1,018 Indonesian news articles annotated for direct quotes with the BILOU scheme. Both models were trained and evaluated with Python libraries such as PyTorch and Hugging Face Transformers. The results show that the BERT model outperforms the Word2Vec model, with a reported F1-score difference of 14.03 points. The BERT model achieved a highest F1-score of 92.28%, while the Word2Vec model reached only 78.05%. This research contributes to online media monitoring by improving the efficiency and accuracy of direct quote extraction from Indonesian news, offering practical value for media analysts and organizations that rely on automated media analysis.
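
The BILOU scheme mentioned in the abstract marks each token as Beginning, Inside, or Last of a multi-token span, as a Unit-length span, or as Outside any span. The following is a minimal sketch of that encoding, not the authors' code: the `bilou_tags` helper, the `QUOTE` label name, and the example sentence are all assumptions made for illustration, since the abstract does not give the corpus's actual label set.

```python
def bilou_tags(n_tokens, spans):
    """Return one BILOU tag per token.

    spans: non-overlapping (start, end, label) triples, end exclusive.
    The label names are hypothetical; the abstract does not list them.
    """
    tags = ["O"] * n_tokens  # Outside by default
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"U-{label}"  # Unit: single-token span
        else:
            tags[start] = f"B-{label}"  # Beginning of the span
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"  # Inside the span
            tags[end - 1] = f"L-{label}"  # Last token of the span
    return tags


# Made-up sentence: 'Budi said, "We are ready".'
tokens = ["Budi", "berkata", ",", '"', "Kami", "siap", '"', "."]
tags = bilou_tags(len(tokens), [(4, 6, "QUOTE")])
# tags == ["O", "O", "O", "O", "B-QUOTE", "L-QUOTE", "O", "O"]
```

A sequence model such as the RNN described above would then be trained to predict one such tag per token.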
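
Comparative Analysis of Lexical Similarity Algorithms for Plagiarism Detection on E-Learning Platforms (original title: Analisis Komparatif Algoritma Kemiripan Leksikal Untuk Deteksi Plagiarisme Pada Platform E-Learning)
Authors : Djoko Cahyo Utomo Lieharyani; Transmissia Semiawan; Jonner Hutahaean; Ade Chandra Nugraha; Suprihanto Suprihanto; Didik Suwito Pribadi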
Jurnal Ilmiah Matrik Vol. 28 No. 1 (2026): Jurnal Ilmiah Matrik
Publisher : Direktorat Riset dan Pengabdian Pada Masyarakat (DRPM) Universitas Bina Darma

DOI: 10.33557/dj4q2224

Abstract

The growing adoption of online learning systems (e-learning) in higher education makes it harder to maintain the originality of students' academic work (assignments, reports, theses, and others). One form of violation that is difficult to monitor is intra-class collusion. This study evaluates the effectiveness of three lexical similarity algorithms, namely the Jaccard Index, Levenshtein Distance, and Cosine Similarity, with the aim of building an efficient automatic plagiarism detection instrument. Lexical methods were chosen because they consume fewer computational resources than complex meaning-based (word embedding) methods, which makes them well suited to real-time use on an LMS platform. The research dataset consists of 854 student discussion responses taken from two different courses at an institution that runs fully on e-learning. The research stages are text pre-processing, similarity score calculation, and threshold optimization to balance false positive and false negative rates. Experimental results show that Levenshtein Distance performs best, with an F2-Score of 0.81044 at a threshold of 0.45. This value indicates high sensitivity to variations in text manipulation in digital learning environments. This research provides a theoretical and practical foundation for institutions developing lightweight yet accurate academic integrity monitoring features.
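
The three similarity measures and the F2-based threshold search described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the abstract does not say how texts are tokenized or how Levenshtein distance is turned into a similarity score, so whitespace tokenization and normalization by the longer string's length are assumptions, and all function names are hypothetical.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard index over word sets (whitespace tokens are an assumption)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine similarity over raw word-count vectors."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def levenshtein(a, b):
    """Character-level edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                     # deletion
                            curr[j - 1] + 1,                 # insertion
                            prev[j - 1] + (ch_a != ch_b)))   # substitution
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Map distance to [0, 1]; this normalization is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score; beta=2 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def best_threshold(scores, labels, thresholds, beta=2.0):
    """Return (threshold, score) maximizing F-beta; ties keep the first."""
    best_t, best_f = None, -1.0
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        f = f_beta(tp, fp, fn, beta)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f
```

In use, each pair of responses would be scored with one of the similarity functions, and `best_threshold` would be run over labeled pairs to pick the cut-off above which a pair is flagged. Because F2 penalizes false negatives more than false positives, the chosen threshold leans toward catching more suspected cases.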