Jurnal Informasi dan Teknologi
2025, Vol. 7, No. 2

Machine Learning-Based News Classification: Comparison of KNN Accuracy with Hyperparameter Tuning

Muhamad Nur Gunawan
Nuryasin
Syopiansyah Jaya Putra
Sarah Arhami



Article Info

Publish Date
17 Jul 2025

Abstract

This study develops an automatic news text classification system using the K-Nearest Neighbor (KNN) algorithm with hyperparameter tuning. Manual classification by editors is considered inefficient, so an accurate and lightweight automated approach is needed. The news dataset was obtained by web scraping bbc.com and covers five main categories: business, technology, entertainment, science, and health. The research follows the CRISP-DM methodology, which consists of six stages: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Features are represented with TF-IDF, and preprocessing includes stopword removal and pattern-based noise cleaning. Two experimental scenarios were run: the first used the complete data without balancing; the second used more balanced data obtained by undersampling. Hyperparameter tuning varied k from 1 to 50 and was validated with 5-fold cross-validation. The model trained on the balanced data with k=11 achieved 95% accuracy, precision, recall, and F1-score. The system was also deployed as a Flask-based web application that news editors can use for real-time text classification. The study emphasizes the importance of parameter optimization and preprocessing in text classification, and shows that a simple algorithm such as KNN remains competitive when supported by good data processing.
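The pipeline summarized in the abstract (undersampling, TF-IDF features, KNN, and a search over k scored by 5-fold cross-validation) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the toy corpus, the stopword list, the cosine-similarity vote, the smoothed IDF formula, the interleaved (non-stratified) folds, and all function names are assumptions made for illustration, and the toy search covers k from 1 to 5 rather than 1 to 50.

```python
import math
import random
from collections import Counter, defaultdict

# Toy stopword list (assumption; the paper does not list its stopwords).
STOPWORDS = {"a", "an", "and", "as", "after", "from", "in", "of", "on", "the", "to", "while"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens only, drop stopwords."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]

def undersample(docs, labels, seed=0):
    """Randomly cut every class down to the size of the smallest class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for doc, label in zip(docs, labels):
        by_label[label].append(doc)
    n = min(len(items) for items in by_label.values())
    pairs = [(d, y) for y, items in by_label.items() for d in rng.sample(items, n)]
    rng.shuffle(pairs)
    return [d for d, _ in pairs], [y for _, y in pairs]

def fit_idf(docs):
    """Smoothed inverse document frequency per term (one common variant)."""
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + len(docs)) / (1 + c)) + 1 for t, c in df.items()}

def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are ignored."""
    vec = {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def cosine(a, b):
    """Dot product of two already-normalised sparse vectors."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def knn_predict(train_vecs, train_labels, vec, k):
    """Majority vote among the k most similar training documents."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(pair[0], vec), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def cv_accuracy(docs, labels, k, folds=5):
    """Accuracy over interleaved folds; IDF is refit on each training split."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(docs), folds))
        train_idx = [i for i in range(len(docs)) if i not in test_idx]
        idf = fit_idf([docs[i] for i in train_idx])
        train_vecs = [tfidf(docs[i], idf) for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        for i in test_idx:
            if knn_predict(train_vecs, train_labels, tfidf(docs[i], idf), k) == labels[i]:
                correct += 1
    return correct / len(docs)

def tune_k(docs, labels, k_values, folds=5):
    """Score every candidate k; ties go to the smaller k."""
    scores = {k: cv_accuracy(docs, labels, k, folds) for k in k_values}
    return max(scores, key=lambda k: (scores[k], -k)), scores

# Invented, deliberately imbalanced toy corpus (not the scraped bbc.com data).
raw = {
    "business": ["stocks rally as markets rise", "bank profits rise on strong markets",
                 "markets fall as bank shares drop", "investors buy stocks after profits rise",
                 "bank shares rally as investors return", "stocks drop while investors sell shares"],
    "technology": ["new phone chip boosts software speed", "software update fixes phone bugs",
                   "chip maker unveils faster phone", "robot software learns from sensor data"],
    "health": ["vaccine trial shows strong immune response", "doctors warn of flu season",
               "hospital study links sleep to heart health"],
}
docs = [tokenize(t) for texts in raw.values() for t in texts]
labels = [label for label, texts in raw.items() for _ in texts]

bal_docs, bal_labels = undersample(docs, labels)           # balanced scenario
best_k, scores = tune_k(bal_docs, bal_labels, range(1, 6))  # the paper searches k = 1..50

idf = fit_idf(bal_docs)
train_vecs = [tfidf(d, idf) for d in bal_docs]
query = tfidf(tokenize("faster phone software"), idf)
prediction = knn_predict(train_vecs, bal_labels, query, k=1)
print(best_k, prediction)
```

Refitting the IDF weights inside each fold keeps held-out documents from influencing the features used to classify them. On a corpus this small the cross-validation scores are not meaningful; the sketch only shows how the steps fit together.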

Copyrights © 2025






Journal Info

Abbrev

jidt

Publisher

Subject

Computer Science & IT

Description

Jurnal Informasi & Teknologi is a venue for scholarly work presenting research results, ideas, and critical-analytical studies on Systems Engineering, Informatics Engineering/Information Technology, Informatics Management, and Information Systems. As part of the spirit of disseminating knowledge resulting from ...