Road infrastructure is a vital aspect of regional development and frequently draws public attention in online media, especially in North Sumatra. Manually monitoring public opinion on this issue is inefficient because the volume of data is large and sentiment is imbalanced, with complaints dominating. This study develops an automatic sentiment analysis model using a weak supervision approach: a lexicon-based method labels the data automatically, and a Multinomial Naive Bayes classifier then assigns public opinion to one of three categories (positive, negative, or neutral). Data were collected by web scraping from various online news portals. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to the training data. On the test data, the model achieved an accuracy of 70.93%. It detected negative sentiment with a precision of 0.86 and recognized positive sentiment with a recall of 0.70, the latter aided by the application of SMOTE. Based on these results, the Multinomial Naive Bayes model can be used to classify public sentiment towards road damage. These findings can also serve as a strategic reference and a source of recommendations for stakeholders, such as the Inspectorate, in formulating relevant, data-driven policies for infrastructure improvement and regional development.
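The weak supervision pipeline described above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the study's implementation: the word lists, the toy English corpus, and the names `lexicon_label` and `MultinomialNB` are hypothetical, the classifier is a plain add-one-smoothed Multinomial Naive Bayes over word counts, and the SMOTE step is omitted because the toy classes are already balanced.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the study's actual lexicon is not given in the abstract.
POSITIVE = {"repaired", "smooth", "improved"}
NEGATIVE = {"damaged", "potholes", "broken"}


def lexicon_label(text):
    """Assign a weak label from the balance of positive and negative words."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


class MultinomialNB:
    """Multinomial Naive Bayes over whitespace tokens with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total_docs)
            score += sum(math.log((counts[t] + self.alpha) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy corpus standing in for scraped news text (illustrative only).
corpus = [
    "road damaged with potholes near the market",
    "broken bridge road damaged again",
    "the highway was repaired and is smooth now",
    "road improved after being repaired",
    "the governor visited the road project today",
    "officials held a meeting about the road budget",
]

weak_labels = [lexicon_label(t) for t in corpus]   # step 1: automatic labelling
model = MultinomialNB().fit(corpus, weak_labels)   # step 2: train on weak labels
```

Calling `model.predict("potholes damaged the road")` then classifies unseen text using only the weakly labelled examples. With imbalanced real data, an oversampling step such as SMOTE would be applied to the vectorized training set before fitting, and only to the training split so that the test data stay untouched.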
Copyright © 2025