This study compares the performance of the Naive Bayes, K-Nearest Neighbors (KNN), and Decision Tree algorithms in predicting the purchase intention of e-commerce visitors. It uses the Online Shoppers Purchasing Intention Dataset, which contains 12,330 records and 18 variables, with the Revenue variable as the classification target. Preprocessing consisted of converting categorical and boolean variables to numerical form, standardizing features with StandardScaler, and splitting the dataset into 80 percent training data and 20 percent testing data. The models were evaluated on accuracy, precision, recall, F1-score, and ROC-AUC, supplemented by 10-fold cross-validation to obtain more stable estimates. KNN achieved the highest accuracy (0.866180), while Naive Bayes produced the highest recall (0.690998) and the highest ROC-AUC (0.821696). Decision Tree showed relatively balanced performance, with an accuracy of 0.857259 and an F1-score of 0.571776. Cross-validation likewise identified KNN as the model with the highest average accuracy (0.8770). These findings suggest that selecting a classification model for purchase intention prediction should not rely on a single evaluation metric, because each algorithm has different strengths. A comparative approach across algorithms can therefore help identify the most suitable model for supporting consumer behavior analysis on e-commerce platforms.
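The point about not relying on a single metric can be illustrated with a minimal sketch that derives accuracy, precision, recall, and F1-score from confusion-matrix counts. The `ConfusionCounts` class, the `summarize` helper, and the two sets of counts are hypothetical and chosen only for illustration; they are not the study's confusion matrices or code. ROC-AUC is omitted because it requires predicted scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with Revenue = True as the positive class."""
    tp: int  # buyers predicted as buyers
    fp: int  # non-buyers predicted as buyers
    fn: int  # buyers predicted as non-buyers
    tn: int  # non-buyers predicted as non-buyers


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def summarize(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall, and F1 from the counts."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": safe_div(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# ASSUMPTION: made-up counts for a 1,000-session test set with 150 buyers.
# "cautious" rarely predicts a purchase; "eager" predicts one more often.
cautious = ConfusionCounts(tp=60, fp=20, fn=90, tn=830)
eager = ConfusionCounts(tp=105, fp=145, fn=45, tn=705)

cautious_scores = summarize(cautious)
eager_scores = summarize(eager)

for name, scores in (("cautious", cautious_scores), ("eager", eager_scores)):
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

With these hypothetical counts, the cautious classifier has the higher accuracy (0.89 versus 0.81) while the eager one has the higher recall (0.70 versus 0.40), and their F1-scores are nearly equal. On an imbalanced target, which model ranks first therefore depends on the metric chosen.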
Copyright © 2025