The rapid growth of e-commerce has made it increasingly important for online platforms to understand user behavior, particularly when predicting purchase intention. This study examines three machine learning models, Logistic Regression, Random Forest, and Gradient Boosting, for classifying purchase intention from real transaction session data. A primary obstacle in this investigation is class imbalance in the dataset: 10,422 records indicate no purchase, while only 1,908 indicate a completed purchase. This disparity can bias a model toward the dominant class and limit its ability to detect the minority class, which in this case is the actual purchase. To address this, the Synthetic Minority Over-sampling Technique (SMOTE) was applied during the data preprocessing phase. Accuracy, precision, recall, and F1-score were used to evaluate each model. The results indicate that, after SMOTE was applied, the Random Forest model attained the best accuracy at 93%, followed by Gradient Boosting at 90% and Logistic Regression at 84%. These findings indicate that SMOTE improves model sensitivity and balance. This study provides useful insights into designing fairer and more effective predictive systems in the field of e-commerce.
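To make the oversampling step concrete, the following is a minimal, simplified sketch of the interpolation idea behind SMOTE, written in plain Python. It is not the implementation used in the study: the function name `smote`, the parameter `k`, the toy points, and the assumption of a 1:1 target balance are all illustrative. Only the two class counts come from the abstract.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: create points on the line segment between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Class counts reported in the abstract.
N_NO_PURCHASE, N_PURCHASE = 10422, 1908
# Synthetic purchase records needed for a 1:1 balance (assumed target ratio).
n_needed = N_NO_PURCHASE - N_PURCHASE

# Toy 2-D minority samples, made up for illustration only.
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
toy_synthetic = smote(toy_minority, n_synthetic=6, k=2)
```

Because each synthetic point lies between two existing minority samples, the new records stay inside the region already occupied by the purchase class rather than duplicating rows exactly. In practice, oversampling of this kind is commonly applied to the training split only, so that evaluation metrics are computed on data with the original class distribution; whether the study did so is not stated in the abstract.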
Copyright © 2025