
Journal : Building of Informatics, Technology and Science

Comparison of the Naive Bayes, Random Forest, and Support Vector Machine Algorithms on Public Views Regarding the Revision of the TNI Law on Instagram
Nasrul, Royhan; Yudhistira, Aditia
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.8164

Abstract

The revision of the Indonesian National Army Law (TNI Law), enacted in 2025, sparked widespread controversy, particularly over civilian supremacy and potential military dominance. As social media has grown as a venue for public expression, platforms such as Instagram have become a primary medium for the public to voice opinions on this issue. This study analyzes public sentiment toward the revision of the TNI Law using three text classification algorithms: Naive Bayes, Random Forest, and Support Vector Machine (SVM). A total of 28,669 Instagram comments were collected by crawling, then preprocessed and labeled. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied. Classification was then performed with the three algorithms and evaluated using accuracy, precision, recall, and F1-score. After SMOTE, SVM delivered the best performance with an accuracy of 92%, followed by Random Forest at 88% and Naive Bayes at 76%. SVM was therefore deemed the most effective at capturing patterns of public sentiment. This research is expected to contribute to digital public opinion studies and to support the evaluation of national defense policies.
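The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed on the line segment between a minority-class point and one of its nearest minority-class neighbours. The abstract does not say which implementation or feature representation the authors used, so the function name `smote_sketch`, the parameter `k`, and the toy two-dimensional points below are assumptions for illustration only.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours.
    Hypothetical helper, not the implementation used in the paper."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Toy minority-class feature vectors (invented for illustration).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point is an interpolation, it stays inside the region already spanned by the minority class. In practice, oversampling of this kind is normally applied to the training split only, so that the test set keeps its original class distribution.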
Classification of Poverty Levels of Districts/Cities in Indonesia in 2023 Using Logistic Regression
Hafizhah, Hafizhah; Yudhistira, Aditia
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.8343

Abstract

Poverty remains a major challenge in Indonesia: the national rate reached 9.36 percent in 2023, with significant disparities between rural (12.22 percent) and urban (7.29 percent) areas, and outliers can distort classification analysis at the district/city level. This study classifies poverty levels in 514 districts/cities into high (above 9.36 percent) and low (at or below 9.36 percent) categories using logistic regression, and compares model performance on the original data with outlier-adjusted data produced by the Z-score and interquartile range (IQR) methods. The methods include collection of secondary data from the Central Statistics Agency and the Ministry of Home Affairs; exploratory data analysis to identify patterns and correlations (such as the negative correlation between per capita expenditure and poverty); pre-processing by capping outliers; logistic regression training with hyperparameter tuning through grid search and cross-validation; and evaluation using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (ROC-AUC). The predictor variables are gross domestic product (GDP), life expectancy, average length of schooling, and per capita expenditure. The results show consistent performance across techniques, with test accuracy reaching 77.67 percent, ROC-AUC of 0.8566, macro precision of 77.90 percent, macro recall of 77.79 percent, and macro F1-score of 77.66 percent. Outlier handling reduced the standard deviation of the poverty rate from 6.45 to 5.99 (Z-score) and 5.57 (IQR), without changing the distribution of binary labels (266 low, 248 high). The model coefficients confirm the dominant negative influence of per capita expenditure (-1.067), supporting targeted policies to reduce regional disparities.
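The two outlier-capping approaches named in the abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: the helper names `cap_zscore` and `cap_iqr`, the cut-offs (2 standard deviations, 1.5 times the IQR), and the ten toy poverty rates are all hypothetical. Only the 9.36 percent labeling threshold comes from the abstract.

```python
import statistics

def cap_zscore(values, threshold=2.0):
    """Clip values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    lo, hi = mean - threshold * sd, mean + threshold * sd
    return [min(max(v, lo), hi) for v in values]

def cap_iqr(values, factor=1.5):
    """Clip values lying more than `factor` * IQR beyond the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - factor * iqr, q3 + factor * iqr
    return [min(max(v, lo), hi) for v in values]

# Toy poverty rates in percent (invented), with one extreme district.
rates = [4.1, 5.3, 6.0, 7.2, 8.8, 9.4, 10.1, 11.5, 12.0, 41.0]
NATIONAL_RATE = 9.36  # labeling threshold taken from the abstract

capped_z = cap_zscore(rates)
capped_iqr = cap_iqr(rates)

# Binary labels: True = high poverty (above the national rate).
labels_before = [r > NATIONAL_RATE for r in rates]
labels_after_z = [r > NATIONAL_RATE for r in capped_z]
labels_after_iqr = [r > NATIONAL_RATE for r in capped_iqr]

spread_before = statistics.pstdev(rates)
spread_after_iqr = statistics.pstdev(capped_iqr)
```

In this toy data the capping bounds sit well above the 9.36 percent threshold, so the extreme value is pulled in and the spread shrinks while every district keeps its label. That is one way a reduced standard deviation and an unchanged label distribution can occur together.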
Comparison of Naïve Bayes and Support Vector Machine Based on Term Frequency–Inverse Document Frequency for Sentiment Analysis of Cross-Platform Affiliate Product Reviews on TikTok and Shopee
Putri, Clara Indriani; Yudhistira, Aditia
Building of Informatics, Technology and Science (BITS) Vol 7 No 4 (2026): March 2026
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i4.9454

Abstract

The growth of affiliate marketing on digital platforms, particularly TikTok and Shopee, has led to a rapid increase in consumer reviews that businesses can use as actionable insights. However, reviews differ linguistically across platforms: Shopee reviews tend to be more repetitive and transactional, whereas TikTok reviews are more informal, rich in slang, and noisier. This difference creates a research gap, because sentiment classification performance may vary across platforms while comparative studies on cross-platform affiliate reviews remain limited. This study analyzes and compares the performance of Multinomial Naïve Bayes and Support Vector Machine in identifying positive and negative sentiment polarity in TikTok and Shopee affiliate product reviews. Data were collected via web scraping during December 2025–January 2026, yielding 5,502 raw reviews. After text preprocessing (case folding, regex-based cleaning, normalization, stopword removal, and stemming using Sastrawi), 4,593 clean reviews were obtained. Lexicon-based automatic labeling with negation handling produced a binary dataset of 3,314 reviews (2,729 positive and 585 negative), indicating class imbalance. No data balancing was applied; instead, evaluation emphasized precision, recall, and F1-score in addition to accuracy. Feature representation used Term Frequency–Inverse Document Frequency, and the dataset was split using an 80:20 hold-out scheme (2,651 training and 663 testing instances). Experimental results show that the Support Vector Machine achieved higher performance (95.93% accuracy; 0.81 negative-class F1) than Multinomial Naïve Bayes (89.14% accuracy; 0.12 negative-class F1). This superiority is attributed to the ability of the Support Vector Machine to learn a maximum-margin hyperplane in the high-dimensional, sparse Term Frequency–Inverse Document Frequency feature space, which makes it more robust to linguistic variation and noise than the probabilistic Naïve Bayes approach, which is more sensitive to majority-class dominance.
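The gap between accuracy and negative-class F1 reported above is typical of imbalanced data, and a small worked example may help show why. The counts below are invented for illustration and are not the paper's confusion matrix: 82 positive and 18 negative reviews roughly mirror the reported class ratio, and the hypothetical classifier labels almost everything as positive. The helper `class_metrics` is likewise an assumption.

```python
def class_metrics(y_true, y_pred, target):
    """Precision, recall, and F1 for one class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical test set: 82 positive and 18 negative reviews.
y_true = ["pos"] * 82 + ["neg"] * 18
# Hypothetical majority-leaning classifier: finds only 1 of the 18 negatives.
y_pred = ["pos"] * 82 + ["neg"] * 1 + ["pos"] * 17

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
neg_precision, neg_recall, neg_f1 = class_metrics(y_true, y_pred, "neg")

print(f"accuracy={accuracy:.2f}  negative-class F1={neg_f1:.2f}")
```

Accuracy stays high here because the majority class dominates the count of correct predictions, while the negative-class F1 collapses with the low recall. This is why per-class metrics are reported alongside accuracy when no balancing is applied.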