JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING
Vol. 7 No. 1 (2023): Issues July 2023

Enhancing Unbalanced Data Classification with Cross-Validation and Extreme Gradient Boosting: A Comprehensive Analysis

Muhammad Riki Atsauri (Universitas Sumatera Utara)
Herman Mawengkang (Faculty of Mathematics and Natural Science, Universitas Sumatera Utara, Medan, 20155, Indonesia)
Syahril Efendi (Faculty of Computer Science, Universitas Sumatera Utara, Medan, 20155, Indonesia)



Article Info

Publish Date
28 Jul 2023

Abstract

XGBoost is an efficient ensemble learning algorithm that has been widely applied, but its classification performance on imbalanced data is often poor. To address this problem, this study combines XGBoost with cross-validation. The main idea is to apply cross-validation together with XGBoost when processing unbalanced data, and then to obtain the final XGBoost-based model through training. At the same time, optimization algorithms automatically search for and tune the model parameters to achieve more accurate classification predictions. In the testing phase, the area under the curve (AUC) serves as the evaluation metric for comparing the classification performance of various sampling methods and algorithm models. The AUC results are expected to verify the feasibility and effectiveness of the proposed algorithm.
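
The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It assumes binary labels, stratified k-fold splitting (so that each fold keeps roughly the original class ratio), and AUC computed through the rank-sum identity. The function names, the synthetic data, and the 1:9 class ratio are illustrative assumptions and do not come from the paper. In particular, `train_and_score` is a hypothetical stand-in for the step where an XGBoost model would be trained and used to score the held-out fold.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def auc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # tied scores share the average 1-based rank
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def train_and_score(train_x, train_y, test_x):
    """Hypothetical stand-in for training a boosted model and scoring test data."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    # Higher score: closer to the positive-class mean than to the negative one.
    return [(x - mu_neg) ** 2 - (x - mu_pos) ** 2 for x in test_x]


def cross_validated_auc(xs, ys, k=5, seed=0):
    """Return the mean AUC over k stratified folds and the per-fold values."""
    fold_aucs = []
    for test_idx in stratified_folds(ys, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(ys)) if i not in held_out]
        scores = train_and_score(
            [xs[i] for i in train_idx],
            [ys[i] for i in train_idx],
            [xs[i] for i in test_idx],
        )
        fold_aucs.append(auc([ys[i] for i in test_idx], scores))
    return sum(fold_aucs) / k, fold_aucs


# Synthetic unbalanced data: 50 positives and 450 negatives (assumed ratio).
data_rng = random.Random(42)
ys = [1] * 50 + [0] * 450
xs = [data_rng.gauss(1.5 if y == 1 else 0.0, 1.0) for y in ys]

mean_auc, fold_aucs = cross_validated_auc(xs, ys, k=5)
print(f"mean AUC over 5 folds: {mean_auc:.3f}")
```

Stratification matters here because a plain random split of a heavily unbalanced dataset can leave a fold with very few minority samples, which makes the per-fold AUC unstable or undefined. AUC is a natural metric in this setting because it depends only on how the scores rank positives against negatives, not on a decision threshold.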

Copyright © 2023






Journal Info

Abbrev

jite

Subject

Computer Science & IT Engineering

Description

JURNAL TEKNIK INFORMATIKA, JITE (Journal of Informatics and Telecommunication Engineering) is a journal that publishes articles and research results in the field of Informatics Engineering, such as Software Engineering, Database, Data Mining, ...