Edu Komputika Journal
Vol. 12 No. 1 (2025): Edu Komputika Journal

Weakly Supervised Sentiment Analysis of Indonesian Rural Tourism Reviews: A TF-IDF Baseline for Melung Tourism Village

Rifa’i, Zanuar
Mukti, Bayu Priya

Article Info

Publish Date
30 Aug 2025

Abstract

This study investigates sentiment classification of Indonesian-language tourist reviews from the rural destination of Melung Tourism Village. A total of 724 user-generated reviews from 546 unique users are preprocessed using Indonesian-specific text cleaning, stopword filtering, and stemming, then weakly labeled through a stemmed positive–negative lexicon. TF-IDF unigram–bigram features are extracted from the preprocessed texts and used to train three classical classifiers: Naive Bayes, linear Support Vector Machine (SVM), and Logistic Regression. To address class imbalance, RandomOverSampler is applied only to the training data, and model evaluation combines stratified 5-fold cross-validation with a held-out test set, using weighted F1-score as the primary metric. Logistic Regression achieves the best performance on the test set (weighted F1 = 0.8799, accuracy = 0.8828), closely followed by SVM, while Naive Bayes lags behind. The results show that, even with a modest, weakly supervised dataset, a carefully designed classical pipeline can yield reliable sentiment indicators to support data-driven management of rural tourism destinations.
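Three steps named in the abstract (lexicon-based weak labeling, random oversampling restricted to the training split, and the weighted F1-score) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The lexicon entries, the "neutral" fallback for ties, and the function names `weak_label`, `random_oversample`, and `weighted_f1` are all hypothetical. The study itself reports using RandomOverSampler, which the second function only approximates.

```python
import random
from collections import Counter

# Hypothetical stemmed lexicon entries; the study's actual lexicon is not shown here.
POSITIVE = {"bagus", "indah", "sejuk", "ramah"}
NEGATIVE = {"kotor", "mahal", "rusak", "kecewa"}


def weak_label(tokens):
    """Label a preprocessed (stemmed) token list by counting lexicon hits.

    Assumption: ties and reviews with no lexicon hits fall back to "neutral".
    """
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class matches the majority size.

    Intended for the training split only, so the test set keeps its natural distribution.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_x, out_y = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_x.extend(items + extra)
        out_y.extend([label] * target)
    return out_x, out_y


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Toy training split of already-stemmed tokens (illustrative data, not from the study).
train = [
    ["pemandangan", "indah", "sejuk"],
    ["toilet", "kotor"],
    ["warga", "ramah"],
    ["udara", "sejuk", "bagus"],
]
train_labels = [weak_label(tokens) for tokens in train]
balanced_x, balanced_y = random_oversample(train, train_labels)
```

Weighting F1 by class support keeps the majority class from being the only one that counts while still reflecting the real class proportions in the test set, which is presumably why the abstract pairs it with oversampling on the training side only.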

Copyrights © 2025

Journal Info

Abbrev

edukom

Subject

Education

Description

Edu Komputika Journal uses Open Journal Systems (OJS) for online journal management in submission, review, copyediting, and publication. Submitted manuscripts are written in English and should follow the style of the Edu Komputika Journal. Manuscripts are original research results, or ...