As a novel and efficient ensemble learning algorithm, XGBoost has been widely applied owing to its many advantages, but its classification performance on imbalanced data is often unsatisfactory. To address this problem, XGBoost was optimized together with the cross validation algorithm. The main idea is to combine cross validation with XGBoost to process the imbalanced data, and then obtain the final XGBoost-based model through training. At the same time, optimal parameters are searched for and tuned automatically by an optimization algorithm to achieve more accurate classification predictions. In the testing phase, the area under the curve (AUC) is used as the evaluation metric to compare and analyze the classification performance of various sampling methods and algorithm models. The AUC-based analysis is expected to verify the feasibility and effectiveness of the proposed algorithm.
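The pipeline described above can be illustrated with a minimal sketch, which is not the authors' implementation: it assumes scikit-learn's RandomizedSearchCV stands in for the automatic parameter optimization, StratifiedKFold for the cross-validation scheme, and synthetic imbalanced data in place of the paper's datasets; all parameter ranges are illustrative.

```python
# Hedged sketch: cross-validated, AUC-scored hyperparameter search for XGBoost
# on imbalanced data. Names and ranges below are assumptions for illustration.
from sklearn.model_selection import StratifiedKFold, RandomizedSearchCV
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Illustrative imbalanced data (roughly 9:1 majority-to-minority ratio).
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.9, 0.1], random_state=42)

# Candidate hyperparameters; values are placeholders, not the paper's settings.
param_distributions = {
    "max_depth": [3, 4, 5, 6],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "n_estimators": [100, 200, 400],
    "scale_pos_weight": [1, 3, 9],   # reweights the minority class
}

# Stratified folds keep the class ratio stable across the cross-validation splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# AUC ("roc_auc") is the selection criterion, mirroring the evaluation metric above.
search = RandomizedSearchCV(
    XGBClassifier(eval_metric="logloss", random_state=42),
    param_distributions=param_distributions,
    n_iter=20,
    scoring="roc_auc",
    cv=cv,
    random_state=42,
)
search.fit(X, y)
print("best cross-validated AUC:", search.best_score_)
print("best parameters:", search.best_params_)
```

In this sketch, the final model is the best estimator returned by the search, so the cross-validation, parameter tuning, and AUC evaluation steps all occur within a single fit, which is one plausible way to realize the combination sketched in the abstract.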