Bulletin of Network Engineer and Informatics (BUFNETS)
Vol. 4 No. 1 (2026): BUFNETS (Bulletin of Network Engineer and Informatics) April - September 2026

PERFORMANCE ANALYSIS OF NAÏVE BAYES CLASSIFIERS BASED ON THE INFORMATION GAIN-BASED FEATURE SELECTION WITH MULTICOLLINEARITY ANALYSIS

Luh Putu Risma Noviana Risma (Universitas Pendidikan Ganesha)
I Gede Aris Gunadi (Universitas Pendidikan Ganesha)
I Made Gede Sunarya (Universitas Pendidikan Ganesha)



Article Info

Publish Date
30 May 2026

Abstract

This study analyzes the performance of the Naïve Bayes classifier under three feature selection strategies: Information Gain, multicollinearity analysis, and a combination of the two. The dataset consists of 337 toddler stunting cases obtained from Kintamani I and VI Public Health Centers. Four scenarios were tested: (1) Naïve Bayes without feature selection, (2) Naïve Bayes with Information Gain feature selection, (3) Naïve Bayes with multicollinearity-based feature selection, and (4) Naïve Bayes with combined Information Gain and multicollinearity feature selection. Every scenario used a split of 70% training data and 30% testing data, and model performance was evaluated with a confusion matrix. In the Information Gain stage, the features with the highest gain values were BPJS (1.0), immunization (1.0), age (0.842), maternal pregnancy history (0.791), and smoking habits (0.756). These features were retained in the final combined model because they contributed the most to stunting classification. Beyond improving predictive performance, combining Information Gain with multicollinearity analysis also reduced feature redundancy, yielding a more stable classification model. Accuracy was 90.10% without feature selection, 95.05% with Information Gain, 93.07% with multicollinearity-based selection, and 96.04%, the highest, with the combination of Information Gain and multicollinearity.
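As a quick consistency check on the reported accuracies, the sketch below assumes that the 30% test share of the 337 cases rounds to 101 samples (the abstract does not state the exact test-set size) and recovers the number of correct predictions each percentage would imply. The `accuracy` helper, the dictionary keys, and the example cell counts are illustrative and not taken from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy from the four cells of a binary confusion matrix."""
    return (tp + tn) / (tp + tn + fp + fn)


N_CASES = 337
# Assumption: the 30% test share is rounded to the nearest whole sample.
n_test = round(N_CASES * 0.30)

# Accuracies (in percent) as reported in the abstract.
reported = {
    "no feature selection": 90.10,
    "information gain": 95.05,
    "multicollinearity": 93.07,
    "information gain + multicollinearity": 96.04,
}

# Number of correct test predictions implied by each reported accuracy.
implied_correct = {
    name: round(pct / 100 * n_test) for name, pct in reported.items()
}

for name, correct in implied_correct.items():
    recomputed = 100 * correct / n_test
    print(f"{name}: {correct}/{n_test} correct = {recomputed:.2f}%")

# Hypothetical cell counts (not from the study) that give 97 correct out of 101.
example = accuracy(tp=50, tn=47, fp=2, fn=2)
```

Under that assumption the four percentages correspond to 91, 96, 94, and 97 correct predictions out of 101, so the gap between the baseline and the combined method amounts to six test cases.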
These findings indicate that the combination of Information Gain and multicollinearity analysis performed best among the tested methods. In addition, a coefficient of determination (R Square) test conducted in SPSS yielded a value of 0.577, indicating that independent variables such as age, BPJS, immunization, smoking habits, and maternal pregnancy history explain 57.7% of the variation in stunting classification, while the remaining 42.3% is attributable to factors outside the scope of this study. The results also suggest that Naïve Bayes combined with Information Gain feature selection and multicollinearity testing offers a stable and effective approach to early stunting classification in support of decision-making in public health services.
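The Information Gain ranking referred to in the abstract can be illustrated with a minimal sketch. It assumes categorical features and measures gain in bits as the class entropy minus the weighted class entropy after splitting on a feature. The abstract does not say whether the reported gain values were normalized, and the function names and toy data below are hypothetical rather than drawn from the study's dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Class entropy minus the weighted class entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, not taken from the study.
labels = ["stunted", "stunted", "normal", "normal"]
features = {
    "separating_feature": ["no", "no", "yes", "yes"],     # splits the classes exactly
    "uninformative_feature": ["no", "yes", "no", "yes"],  # carries no class information
}

# Rank features from highest to lowest gain.
gains = {name: information_gain(values, labels) for name, values in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)

for name in ranking:
    print(f"{name}: {gains[name]:.3f}")
```

With a balanced two-class label the class entropy is 1 bit, so a feature that separates the classes perfectly scores 1.0 and an uninformative one scores 0.0.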

Copyrights © 2026





