Online news contains valuable insights into public phenomena that can support statistical analysis by institutions such as BPS Riau. However, news is currently classified manually, a process that is time-consuming and prone to human error. This study proposes an automated news classification system that uses Natural Language Processing (NLP) techniques, with Term Frequency–Inverse Document Frequency (TF-IDF) for feature extraction and the Multinomial Naïve Bayes algorithm for classification. The dataset was collected via web scraping and manually labeled with five statistical categories: poverty, unemployment, democracy, inflation, and economic growth. The system achieved a validation accuracy of 83% and a test accuracy of 90%, with an average precision of 0.85, recall of 0.93, and F1-score of 0.87. These results indicate that the proposed system can substantially reduce the manual workload of news classification and can be deployed in practice by BPS Riau to support accurate and timely statistical reporting.
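The TF-IDF and Multinomial Naïve Bayes combination described above can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation. The class name `TfidfMultinomialNB`, the whitespace tokenizer, the smoothed IDF variant, the Laplace smoothing constant `alpha`, and the ten English toy headlines are all assumptions made for illustration. Only the five category names come from the abstract.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; real news text would need
    # proper cleaning, stop-word removal, and possibly stemming.
    return text.lower().split()


class TfidfMultinomialNB:
    """Hypothetical minimal TF-IDF + Multinomial Naive Bayes classifier."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def _tfidf(self, tokens):
        # Raw term count times IDF; terms unseen during training are ignored.
        counts = Counter(t for t in tokens if t in self.idf)
        return {t: n * self.idf[t] for t, n in counts.items()}

    def fit(self, docs, labels):
        tokenized = [tokenize(d) for d in docs]
        n_docs = len(tokenized)

        # Document frequency: number of documents containing each term.
        df = Counter()
        for toks in tokenized:
            df.update(set(toks))
        # Smoothed IDF (one common variant, chosen here as an assumption).
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0
                    for t, c in df.items()}

        self.classes = sorted(set(labels))
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(class_counts[c] / n_docs)
                          for c in self.classes}

        # Accumulate TF-IDF mass per (class, term).
        weight = {c: defaultdict(float) for c in self.classes}
        for toks, label in zip(tokenized, labels):
            for term, w in self._tfidf(toks).items():
                weight[label][term] += w

        # Smoothed log-likelihood of each term given each class.
        vocab_size = len(self.idf)
        self.log_likelihood = {}
        for c in self.classes:
            total = sum(weight[c].values()) + self.alpha * vocab_size
            self.log_likelihood[c] = {
                t: math.log((weight[c][t] + self.alpha) / total)
                for t in self.idf
            }
        return self

    def predict(self, doc):
        feats = self._tfidf(tokenize(doc))
        scores = {
            c: self.log_prior[c]
            + sum(w * self.log_likelihood[c][t] for t, w in feats.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Hypothetical toy headlines; the study's scraped dataset is not reproduced.
train = [
    ("poverty rate rises among rural households", "poverty"),
    ("poor families receive social assistance", "poverty"),
    ("unemployment rate falls as factories hire workers", "unemployment"),
    ("jobless graduates struggle to find work", "unemployment"),
    ("voters turn out for regional election", "democracy"),
    ("election commission counts ballots", "democracy"),
    ("inflation pushes food prices higher", "inflation"),
    ("consumer prices climb as inflation accelerates", "inflation"),
    ("economic growth driven by exports", "economic growth"),
    ("regional economy records strong growth", "economic growth"),
]
texts, labels = zip(*train)
model = TfidfMultinomialNB().fit(texts, labels)

print(model.predict("food prices and inflation keep rising"))
```

In this sketch each class is scored as its log prior plus the TF-IDF-weighted sum of per-term log-likelihoods, and the highest-scoring class is returned. A production system would more likely rely on an established library and a held-out validation and test split, as the reported accuracy figures imply.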
Copyrights © 2025