Hepatitis causes around 1.4 million deaths every year, making it the deadliest infectious disease after tuberculosis. Liver biopsy remains the best method for diagnosing the stage of hepatitis C, but it is invasive, painful, and expensive, and it can cause complications. Non-invasive methods therefore need to be developed, and machine learning is one such approach. Random Forest and XGBoost are widely used classification methods because they offer many advantages over classical classification methods. The SMOTE algorithm can be used to improve prediction accuracy on imbalanced data. The data in this study comprise 24 independent variables covering patients' personal data, hepatitis C symptoms, and laboratory test results. The dependent variable is binary, namely the stage of hepatitis C (fibrosis or cirrhosis). The results show that Random Forest and XGBoost achieved an accuracy of around 74%, but their recall was below 2%. SMOTE Random Forest and SMOTE XGBoost achieved accuracy and recall above 75%. SMOTE Random Forest was more accurate at predicting the fibrosis class, while SMOTE XGBoost performed better on the cirrhosis class. The variables most influential in determining the hepatitis C stage were those from laboratory tests.
Keywords: Fibrosis, Cirrhosis, Random Forest, SMOTE, XGBoost
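The gap between accuracy and recall reported above, and the idea behind SMOTE, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the label counts and feature points are invented and are not the study's data, the minority class is treated as the positive class, and `smote_like_sample` is a hypothetical helper that shows only the core interpolation step of SMOTE, not the implementation used in the study.

```python
import random


def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    actual_pos = sum(t == positive for t in y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return correct / len(y_true), recall


def smote_like_sample(minority, k=2, rng=random):
    """Create one synthetic point on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    # Sort the remaining minority points by squared Euclidean distance.
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


# Illustrative labels only: 74 majority-class (0) and 26 minority-class (1).
y_true = [0] * 74 + [1] * 26
# A classifier that always predicts the majority class.
y_pred = [0] * 100
acc, rec = accuracy_and_recall(y_true, y_pred)
print(acc, rec)  # 0.74 0.0

# Illustrative two-feature minority points (hypothetical values).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
synthetic = smote_like_sample(minority, rng=random.Random(0))
print(synthetic)
```

A classifier that ignores the minority class can still score high accuracy on imbalanced labels, which is why recall is reported alongside it. Oversampling with synthetic minority points, generated between existing neighbours, gives the learner more minority examples to fit.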
Copyrights © 2020