Engineering, Mathematics and Computer Science Journal (EMACS)
Vol. 5 No. 1 (2023): EMACS

Breast Cancer Classification Using Outlier Detection and Variance Inflation Factor

Budi Juarto (Bina Nusantara University)



Article Info

Publish Date
31 Jan 2023

Abstract

Breast cancer is one of the most prevalent malignant tumors. It develops in the breast tissue when cells grow uncontrollably and overtake the surrounding healthy tissue. Several features or patient conditions can be used in a machine learning approach to predict breast cancer; here, machine learning is used to determine whether a tumor is malignant or benign. The dataset used in this research is the Wisconsin Breast Cancer (Diagnostic) Data Set, which contains 32 attributes and 569 records. Preprocessing in this study consists of eliminating outliers using the upper and lower quartiles of each feature, followed by feature selection that removes features with a high variance inflation factor. The machine learning methods used are Logistic Regression, Random Forest, KNN, SVC, XGBoost, Gradient Boosting, and Ridge Classifier. These methods were chosen because the target to be predicted has two labels, namely benign and malignant. The results show that feature selection using the variance inflation factor increases the accuracy of the Logistic Regression and Random Forest methods from 98.25% to 99.12%. These two methods achieve the highest accuracy, 99.12%. Future research will explore other optimization techniques for hyperparameter tuning.
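The two preprocessing steps named in the abstract, quartile-based outlier removal and variance-inflation-factor (VIF) feature selection, can be illustrated with a minimal sketch. This is not the author's code: the abstract does not give the outlier rule, the VIF cutoff, or the elimination order, so the 1.5 x IQR fences, the threshold of 10, and the drop-the-worst-feature loop below are all assumptions, and the helper names (`iqr_mask`, `vif`, `drop_high_vif`) are hypothetical. Synthetic data stands in for the Wisconsin dataset.

```python
import numpy as np


def iqr_mask(X, k=1.5):
    """Boolean mask of rows whose every feature lies inside
    [Q1 - k*IQR, Q3 + k*IQR]. k=1.5 is an assumed (conventional) multiplier."""
    q1 = np.percentile(X, 25, axis=0)
    q3 = np.percentile(X, 75, axis=0)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return np.all((X >= lower) & (X <= upper), axis=1)


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all other columns plus an intercept."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


def drop_high_vif(X, names, threshold=10.0):
    """Repeatedly drop the feature with the largest VIF until every
    remaining VIF is <= threshold (threshold=10 is an assumption)."""
    X = X.copy()
    names = list(names)
    while X.shape[1] > 1:
        scores = vif(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names


# Synthetic stand-in data: "c" is nearly collinear with "a".
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.01 * rng.normal(size=200)
X = np.column_stack([a, b, c])
X[0, 1] = 50.0  # inject one obvious outlier into feature "b"

mask = iqr_mask(X)                 # rows that survive the quartile fences
X_clean = X[mask]
X_sel, kept = drop_high_vif(X_clean, ["a", "b", "c"])
```

After these steps, `X_sel` would be passed to a classifier such as logistic regression. Removing one of two nearly collinear features leaves the remaining VIF values close to 1, which is the effect the feature selection step aims for.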

Copyrights © 2023






Journal Info

Abbrev

EMACS

Publisher

Subject

Civil Engineering, Building, Construction & Architecture; Computer Science & IT; Engineering; Industrial & Manufacturing Engineering; Mathematics

Description

Engineering, Mathematics and Computer Science (EMACS) Journal invites academicians and professionals to write their ideas, concepts, new theories, or science development in the field of Information Systems, Architecture, Civil Engineering, Computer Engineering, Industrial Engineering, Food ...