Tuberculosis (TB) is an infectious disease that remains a serious problem in Indonesia because of its continued spread, and TB case data are imbalanced across classes. This study compares the performance of the Random Forest and Gradient Boosting algorithms in classifying tuberculosis on imbalanced data. The Synthetic Minority Oversampling Technique (SMOTE) is applied as the data balancing method, and the models are evaluated using accuracy, precision, sensitivity, specificity, and AUC. The results show that Gradient Boosting without SMOTE gives the best performance, with an accuracy of 93% and an AUC of 0.91, whereas applying SMOTE reduces the performance of the model. Random Forest gives stable results in both conditions, with an accuracy of 93% and an AUC of 0.89. It can therefore be concluded that Gradient Boosting without SMOTE provides the best classification results and can serve as a basis for developing classification methods for imbalanced tuberculosis data.
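The evaluation metrics named above can be illustrated with a minimal sketch. It is not the study's code or data: the function names `binary_metrics` and `auc`, and the toy labels, are hypothetical and chosen only to show how the five metrics are computed and why accuracy alone can mislead on imbalanced data.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float  # recall on the positive class
    specificity: float  # recall on the negative class


def binary_metrics(y_true, y_pred, positive=1):
    """Compute threshold-based metrics from true and predicted labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    return BinaryMetrics(
        accuracy=safe_div(tp + tn, tp + fp + tn + fn),
        precision=safe_div(tp, tp + fp),
        sensitivity=safe_div(tp, tp + fn),
        specificity=safe_div(tn, tn + fp),
    )


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 90 negative cases and 10 positive cases.
y_true = [0] * 90 + [1] * 10
# A degenerate classifier that always predicts the majority class.
y_majority = [0] * 100
majority_metrics = binary_metrics(y_true, y_majority)
print(majority_metrics)
```

On this toy data the majority-class predictor reaches high accuracy while its sensitivity is zero, which is why sensitivity, specificity, and AUC are reported alongside accuracy when classes are imbalanced.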
Copyrights © 2025