Muhamad Syukron
Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro


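PERBANDINGAN METODE SMOTE RANDOM FOREST DAN SMOTE XGBOOST UNTUK KLASIFIKASI TINGKAT PENYAKIT HEPATITIS C PADA IMBALANCE CLASS DATA (Comparison of the SMOTE Random Forest and SMOTE XGBoost Methods for Classifying Hepatitis C Disease Stage on Imbalanced-Class Data)
Muhamad Syukron; Rukun Santoso; Tatik Widiharih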
Jurnal Gaussian Vol 9, No 3 (2020): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.v9i3.28915

Abstract

Hepatitis kills around 1.4 million people every year, which makes it second only to tuberculosis among infectious diseases in number of deaths. Liver biopsy is still the best method for diagnosing the stage of hepatitis C, but it is invasive, painful, and expensive, and it can cause complications. Non-invasive methods therefore need to be developed, and machine learning is one such approach. Random forest and XGBoost are widely used classification methods because they have many advantages over classical classification methods. The SMOTE algorithm can be used to improve prediction accuracy on imbalanced data. The data in this study have 24 independent variables covering patient personal data, hepatitis C symptoms, and laboratory test results. The dependent variable is binary: the stage of hepatitis C disease (fibrosis or cirrhosis). The results show that random forest and XGBoost had an accuracy of around 74% but a recall of less than 2%. SMOTE random forest and SMOTE XGBoost both have accuracy and recall values of more than 75%. SMOTE random forest is more accurate at predicting the fibrosis class, while SMOTE XGBoost is better on the cirrhosis class. The variables most influential in determining hepatitis C stage are those from laboratory tests.

Keywords: Fibrosis, Cirrhosis, Random Forest, SMOTE, XGBoost
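The abstract gives no implementation details, so the following is only a minimal pure-Python sketch of the two ideas it relies on: why accuracy can look acceptable while recall collapses on imbalanced classes, and how SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function names (`accuracy`, `recall`, `smote`), the parameter `k`, and the toy labels and points in the usage note are assumptions made for illustration; they are not taken from the study or from any library.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """True positives divided by all actual positives (0.0 if there are none)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built from the minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (squared Euclidean distance),
        # excluding x itself.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic
```

As a usage example on made-up labels (not the study's data): with 75 samples of class 0 and 25 of class 1, a classifier that always predicts class 0 gets `accuracy(...) == 0.75` but `recall(...) == 0.0` for class 1, which is the pattern the abstract reports before SMOTE is applied. Calling `smote(minority_points, n_new=50)` on the minority feature vectors would then add synthetic minority rows before a classifier is trained. In practice, a maintained implementation such as the one in the imbalanced-learn package would normally be preferred over this sketch.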