Green Engineering: Journal of Engineering and Applied Science
Vol. 2 No. 3 (2025): July : Green Engineering: International Journal of Engineering and Applied Scie

Machine Learning Implementation for E-commerce Delivery Delay Prediction Using XGBoost Algorithm

Stevanus Putra Lesmana
Dina Hermawati
Maulina Mukaromah
Iqbal Ahmad Bukhari
Norma Puspitasari



Article Info

Publish Date
31 Jul 2025

Abstract

Delivery delays are a major challenge in e-commerce: they reduce customer satisfaction and disrupt business operations. This study applies the XGBoost (Extreme Gradient Boosting) algorithm to predict delivery delays on a dataset of 96,476 records whose features describe the delivery process, including shipping distance, carrier performance, and order characteristics. The model reaches an overall accuracy of 93.24%, and on-time deliveries are predicted with a precision of 93% and a recall of 100%. Delayed deliveries, however, are almost never identified: recall for the delayed class is reported as 0% and its F1-score is only 0.01. Taken together, these figures indicate that the model assigns nearly every order to the on-time class, so the high accuracy mainly reflects the share of on-time records rather than any ability to detect delays. The cause is class imbalance: because the dataset is dominated by on-time deliveries, the model is biased toward that class and fails to learn the patterns associated with delays. To improve performance, the study recommends class balancing techniques such as SMOTE (Synthetic Minority Oversampling Technique), which generates synthetic samples of the minority class. It also recommends evaluating models with per-class precision, recall, and F1-score rather than accuracy alone. Overall, the study offers insight into the difficulty of predicting delivery delays and outlines practical strategies for improving future models in e-commerce logistics analytics.
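The gap between overall accuracy and minority-class recall described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' pipeline and does not use their data: the label encoding (0 = on time, 1 = delayed), the 93/7 class split, and the helper name `per_class_report` are all assumptions chosen for illustration. It scores a trivial predictor that labels every order as on time.

```python
from collections import Counter


def per_class_report(y_true, y_pred, labels=(0, 1)):
    """Return accuracy plus precision, recall, and F1 for each class label."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    total = len(y_true)
    report = {"accuracy": sum(pairs[(c, c)] for c in labels) / total}
    for c in labels:
        tp = pairs[(c, c)]
        predicted = sum(pairs[(t, c)] for t in labels)  # everything predicted as c
        actual = sum(pairs[(c, p)] for p in labels)     # everything truly c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical labels (not the study's data): 93 on-time orders, 7 delayed.
y_true = [0] * 93 + [1] * 7
# A trivial baseline that predicts "on time" for every order.
y_pred = [0] * 100

report = per_class_report(y_true, y_pred)
print(f"accuracy: {report['accuracy']:.2f}")
for label, name in ((0, "on time"), (1, "delayed")):
    m = report[label]
    print(f"{name}: precision={m['precision']:.2f} "
          f"recall={m['recall']:.2f} f1={m['f1']:.2f}")
```

On this toy split the baseline reaches high accuracy and perfect on-time recall while never detecting a delayed order, which is the same pattern the abstract reports. If SMOTE is tried as the study recommends, one common choice is the `SMOTE` class in the imbalanced-learn package, applied through `fit_resample` to the training split only so that the test set keeps the real class distribution.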

Copyrights © 2025






Journal Info

Abbrev

GreenEngineering

Publisher

Subject

Agriculture, Biological Sciences & Forestry; Civil Engineering, Building, Construction & Architecture; Electrical & Electronics Engineering; Engineering

Description

Green Engineering: Journal of Engineering and Applied Science (e-ISSN: 3063-6833, p-ISSN: 3063-6841) is an open-access journal published by IFREL (Forum of Researchers and Lecturers). Green Engineering accepts manuscripts based on empirical research results, new scientific literature ...