Hybrid Relevance and Sentiment Classification of Indonesian Gold Tweets Using Machine Learning for Market Risk Signal Extraction
Kamalia, Antika Zahrotul; Indra, Indra; Wibowo, Arief; Riwurohi, Jan Everhard; Hassan, Shiza
International Journal of Advances in Data and Information Systems, Vol. 7 No. 1 (2026): April 2026
Publisher : Indonesian Scientific Journal

Abstract

This study proposes a hybrid relevance–sentiment classification framework to analyze public opinion on physical Antam gold from Indonesian Twitter data and to support exploratory market-risk signal extraction. Tweets were collected during February–November 2025; after preprocessing and text-normalized deduplication, 1,271 unique tweets were retained. The approach combines weak supervision (rule-/lexicon-based silver labels) with TF-IDF-based machine learning in two stages: (1) relevance classification to separate tweets genuinely discussing physical Antam gold from non-relevant contexts (e.g., ANTM stock/capital-market discussions), and (2) two-class sentiment classification (positive vs. negative) applied to relevance-filtered tweets. Random Forest achieved the strongest relevance performance (Accuracy = 0.984; macro-F1 = 0.943; 5-fold CV macro-F1 = 0.928 ± 0.033). For sentiment classification, performance was moderate and close across models; the most stable model under cross-validation (Logistic Regression/Naive Bayes) was used for downstream aggregation. Sentiment outputs were aggregated into a monthly sentiment index for descriptive comparison with gold prices. The observed association was weak, indicating that the index is better interpreted as a risk-perception proxy than as a direct price predictor.
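
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the monthly sentiment index is computed, so the net-sentiment formula used here, (positive − negative) / (positive + negative), is an assumption, as are the function name `monthly_sentiment_index`, the `'YYYY-MM'` month keys, and the toy predictions, which are not data from the study.

```python
from collections import defaultdict


def monthly_sentiment_index(predictions):
    """Aggregate per-tweet sentiment labels into one score per month.

    `predictions` is an iterable of (month, label) pairs, where month is a
    'YYYY-MM' string and label is 'positive' or 'negative'.
    ASSUMPTION: the index is net sentiment, (pos - neg) / (pos + neg),
    which ranges from -1 (all negative) to +1 (all positive). The paper's
    actual formula is not given in the abstract.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for month, label in predictions:
        if label not in ("positive", "negative"):
            raise ValueError(f"unexpected label: {label!r}")
        counts[month][label] += 1

    index = {}
    for month in sorted(counts):  # 'YYYY-MM' strings sort chronologically
        pos = counts[month]["positive"]
        neg = counts[month]["negative"]
        index[month] = (pos - neg) / (pos + neg)
    return index


# Toy, made-up predictions for relevance-filtered tweets (illustration only).
toy_predictions = [
    ("2025-02", "positive"), ("2025-02", "positive"), ("2025-02", "negative"),
    ("2025-03", "negative"), ("2025-03", "negative"),
    ("2025-03", "positive"), ("2025-03", "negative"),
]

index = monthly_sentiment_index(toy_predictions)
for month, score in index.items():
    print(month, round(score, 3))
```

A bounded index of this kind is easy to line up against a monthly price series for descriptive comparison, which is the use the abstract describes.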