INTEGER: Journal of Information Technology
Vol 11, No 1 (2026): March

Comparison of Sequential, Chi-Square, and Embedded Feature Selection for Breast Cancer Classification Using the Random Forest Algorithm

Auliya, Yudha Alif (Unknown)
Furqon, Muhammad ‘Ariful (Unknown)
Wibiyanto, Nico (Unknown)



Article Info

Publish Date
04 May 2026

Abstract

Cancer is typically associated with malignant tumors that can metastasize to other tissues throughout the body. Breast cancer arises from the uncontrolled proliferation of breast cells, which can form benign or malignant tumors. Benign breast conditions and non-cancerous growths usually appear as small, round, soft lumps, whereas malignant breast tumors tend to be asymmetrical, irregular, and painful, among other manifestations. If untreated, a malignant tumor may metastasize and become fatal. This study compares the effectiveness of Sequential Feature Selection (SFS), Chi-Square, and Embedded feature selection methods for breast cancer classification with the random forest algorithm, with hyperparameters optimized via grid search. It uses the Wisconsin Breast Cancer dataset from the UCI Machine Learning Repository, which comprises 569 records, 30 attributes, and 1 class label. Model performance is assessed with a confusion matrix, from which accuracy, precision, recall, and F1-score are computed. The results come from twenty test schemes that combine data splitting, cross-validation, and hyperparameter tuning via grid search. The best performance was obtained by the random forest model with hyperparameter tuning and SFS feature selection. With 20 selected features, it achieved an accuracy of 97.37%, a precision of 95.83%, a recall of 97.87%, and an F1-score of 96.84%. The model performs well on both the positive and the negative class: it correctly predicted the negative class in 66 instances (true negatives) and the positive class in 46 instances (true positives), with one false positive and one false negative. These results indicate that the model is highly accurate, with few prediction errors.
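
As a worked check of how the four reported metrics follow from confusion-matrix counts, here is a minimal Python sketch. The function name `confusion_metrics` is hypothetical and not taken from the study. The first call uses the counts quoted in the abstract. The second uses an assumed alternative (2 false positives and 65 true negatives), which the source does not state but which reproduces the reported 97.37%, 95.83%, 97.87%, and 96.84%. The quoted counts yield slightly higher values, so one of the two sets of figures may contain a transcription slip.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts exactly as quoted in the abstract.
quoted = confusion_metrics(tp=46, tn=66, fp=1, fn=1)

# ASSUMPTION (not from the source): counts that reproduce the reported percentages.
implied = confusion_metrics(tp=46, tn=65, fp=2, fn=1)

for name, result in (("quoted counts", quoted), ("assumed counts", implied)):
    formatted = ", ".join(f"{k}={v * 100:.2f}%" for k, v in result.items())
    print(f"{name}: {formatted}")
```

Both sets of counts sum to 114 test instances, so the difference lies only in how one instance is assigned between the true-negative and false-positive cells.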

Copyrights © 2026






Journal Info

Abbrev

integer

Subject

Computer Science & IT

Description

This journal publishes articles reporting scientific research on problems in the fields of Informatics, Information Systems, Computer Systems, Multimedia, and Networks, as well as other research results related to these ...