Credit card fraud detection remains a critical challenge because transaction datasets are inherently imbalanced: fraudulent transactions are far fewer than normal ones. This study investigates the application of the XGBoost classification algorithm to this problem using the publicly available Kaggle Credit Card Fraud Detection dataset. Preprocessing includes normalization of the features, and the Synthetic Minority Over-sampling Technique (SMOTE) is used to handle the dataset's class imbalance. The model's hyperparameters are tuned with GridSearchCV to improve performance. The model achieves an Area Under the Curve (AUC) of 0.97, indicating a strong ability to distinguish between fraudulent and normal transactions. For fraudulent transactions, the F1-score is 0.77, which shows that the model is reasonably effective at detecting fraud. While the model performs exceptionally well in identifying normal transactions, reducing false negatives (fraudulent transactions classified as normal) remains a challenge. This study underscores the potential of combining advanced machine learning techniques with preprocessing and optimization strategies to develop robust fraud detection systems.
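To make the oversampling step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is an illustration under stated assumptions, not the study's implementation: the function name `smote_oversample`, its parameters, and the toy data are hypothetical, and a practical pipeline would more likely rely on an existing library implementation and apply oversampling to the training split only.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length feature vectors (lists of floats).
    Hypothetical helper for illustration; not taken from the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # The synthetic point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: four made-up "fraud" points at the corners of a unit square.
fraud = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra = smote_oversample(fraud, n_new=6, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region the minority class already occupies, which is why SMOTE is often preferred over simply duplicating fraud records.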
Copyrights © 2025