Classification models often yield suboptimal accuracy, which can be attributed to various factors, including a wide range of attribute values in the dataset. Discretization methods offer one way to address this issue. This study compares the effectiveness of the Equal-Width and Equal-Frequency discretization methods in improving classification accuracy on datasets of varying sizes. Naïve Bayes, Decision Tree, and Support Vector Machine are used as classification models on three datasets: Bandung City Traffic data (3804 records), Bandung City COVID-19 cases data (2718 records), and Bandung City Dengue Fever Disease Index data (150 records). Three experimental scenarios are run to assess the impact of the two discretization methods on accuracy: the first applies no discretization, the second applies Equal-Width discretization, and the third applies Equal-Frequency discretization. The experimental results indicate significant accuracy improvements after discretization. The Naïve Bayes model achieved 94% accuracy on the Traffic dataset, while the Decision Tree achieved 71% accuracy on the COVID-19 dataset and 98% on the Dengue Fever Disease dataset. These outcomes indicate that applying Equal-Width and Equal-Frequency discretization helps address the challenge of wide attribute value ranges in the classification process.
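To make the difference between the two methods concrete, the following is a minimal illustrative sketch in Python, not the implementation used in the study. The function names, the sample values, and the choice of four bins are assumptions made only for this example. Equal-Width splits the value range into intervals of identical length, whereas Equal-Frequency places roughly the same number of records in each bin.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k intervals of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so everything falls in the first bin.
        return [0 for _ in values]
    # The maximum value would land in bin k, so clamp it to the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding about the same number of records.

    Rank-based assignment: tied values may be split across adjacent bins,
    a simplification that quantile-based implementations handle differently.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * k // n
    return bins


# Hypothetical attribute with one extreme value (a wide value range).
sample = [1, 2, 3, 4, 5, 6, 7, 100]
print(equal_width_bins(sample, 4))      # [0, 0, 0, 0, 0, 0, 0, 3]
print(equal_frequency_bins(sample, 4))  # [0, 0, 1, 1, 2, 2, 3, 3]
```

In this toy input the single extreme value pushes almost every record into the first Equal-Width bin, while Equal-Frequency still spreads the records evenly across bins. This is one reason the two methods can affect classifier accuracy differently.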
Copyrights © 2023