Comparison of Word2Vec and GloVe performance in Bi-LSTM models for Indonesian news classification
Muhammad Faris Wafda; Husni; Ika Oktavia Suzanti; Firdaus Solihin; Mula'ab; Army Justitia
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control Vol. 11, No. 3, August 2026 (Article in Progress)
Publisher: Universitas Muhammadiyah Malang

DOI: 10.22219/kinetik.v11i3.2608

Abstract

The rapid growth of textual data from digital news makes it challenging to classify content automatically and efficiently. For the task of classifying Indonesian-language news, this study compares the performance of several word embeddings, specifically Word2Vec (with the CBOW and Skip-Gram architectures) and GloVe, when applied to a Bidirectional Long Short-Term Memory (Bi-LSTM) model. The dataset consists of 6,715 pre-processed news articles from an Indonesian news portal, divided into five categories. The model was trained on 80% of the data using K-Fold Cross Validation (K=5), while the remaining 20% was held out for testing. The Bi-LSTM model combined with CBOW embeddings performed best, achieving 95.16% accuracy and a 95.15% F1-Score. The Skip-Gram model followed, with 93.30% accuracy and the fastest computation time. The model that used pre-trained GloVe embeddings performed worst, with 88.98% accuracy. This result suggests that embeddings trained on the target domain capture local context more effectively than general pre-trained embeddings. The study concludes that choosing a word embedding trained on a local dataset is an important step toward optimal accuracy in Indonesian news text classification.
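
The data split described in the abstract (a 20% hold-out test set, with 5-fold cross-validation on the remaining 80%) can be illustrated with a minimal index-only sketch. It is not the authors' code: the function name `holdout_then_kfold`, the fixed seed, and the plain shuffled (non-stratified) split are all assumptions made for illustration, since the abstract does not say how the split was drawn.

```python
import random


def holdout_then_kfold(n_samples, test_fraction=0.2, k=5, seed=0):
    """Hypothetical helper: hold out a test set, then build k CV folds.

    Returns (test_idx, folds), where folds is a list of
    (train_idx, val_idx) pairs drawn from the non-test portion.
    """
    indices = list(range(n_samples))
    # Assumption: a simple seeded shuffle, not a stratified split.
    random.Random(seed).shuffle(indices)

    n_test = round(n_samples * test_fraction)
    test_idx = indices[:n_test]
    dev_idx = indices[n_test:]  # the 80% used for cross-validation

    # Split the development portion into k nearly equal validation folds.
    base, extra = divmod(len(dev_idx), k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        val_idx = dev_idx[start:start + size]
        train_idx = dev_idx[:start] + dev_idx[start + size:]
        folds.append((train_idx, val_idx))
        start += size
    return test_idx, folds


# 6,715 articles, as reported in the abstract.
test_idx, folds = holdout_then_kfold(6715)
print(len(test_idx))                    # size of the held-out test set
print([len(val) for _, val in folds])   # validation fold sizes
```

With 6,715 articles this holds out 1,343 for testing and leaves 5,372 for cross-validation, so each of the five folds validates on 1,074 or 1,075 articles. A stratified split that preserves the five category proportions would be a reasonable alternative, but the abstract does not indicate which was used.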