The rapid growth of digital marketplace platforms such as Shopee, Tokopedia, and Bukalapak has transformed online business competition and increased the importance of data-driven sales analysis. Marketplace data, including product price, ratings, reviews, sales volume, views, and seller location, contain information that can be used to predict a product's market potential. However, the large volume, heterogeneity, and dynamic nature of these data make manual analysis inefficient. This study therefore analyzes and compares the performance of the C4.5 and K-Nearest Neighbor (KNN) algorithms in classifying marketplace sales potential. The dataset was collected in March 2022 by scraping Shopee, Tokopedia, and Bukalapak with the BigSeller application, and consists of 21,750 product records with numerical and categorical attributes. Data preprocessing was conducted in Orange Data Mining and included data cleaning, missing value handling, normalization, feature transformation, and dataset partitioning. Products were classified into three market potential levels: low, medium, and high. Model performance was evaluated using accuracy, precision, recall, and F1-score, all derived from the confusion matrix. The experimental results show that C4.5 outperformed KNN, achieving an accuracy of 0.86 compared with 0.70 for KNN. C4.5 also obtained higher precision, recall, and F1-score values, indicating more consistent and stable classification. The findings suggest that C4.5 is more effective for marketplace sales potential classification because of its ability to identify influential attributes and handle heterogeneous marketplace datasets. This study contributes to marketplace sales prediction and supports data-driven decision-making in e-commerce environments.
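As a minimal sketch of how the four evaluation metrics follow from a three-class confusion matrix, the Python snippet below computes accuracy together with per-class precision, recall, and F1-score. The matrix counts, the `LABELS` and `CONFUSION` names, and the helper functions are hypothetical and chosen only for illustration; they are not the study's data, results, or Orange Data Mining workflow.

```python
# Hypothetical 3x3 confusion matrix: rows = actual class, columns = predicted class.
# The counts are invented for illustration and are NOT the study's results.
LABELS = ["low", "medium", "high"]
CONFUSION = [
    [50, 8, 2],   # actual low
    [6, 40, 4],   # actual medium
    [1, 9, 30],   # actual high
]


def accuracy(matrix):
    """Share of all records that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix):
    """Return {class_index: (precision, recall, f1)} for a square matrix."""
    n = len(matrix)
    metrics = {}
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[k] = (precision, recall, f1)
    return metrics


if __name__ == "__main__":
    print(f"accuracy = {accuracy(CONFUSION):.2f}")
    for k, (p, r, f1) in per_class_metrics(CONFUSION).items():
        print(f"{LABELS[k]:>6}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Averaging the per-class values without weighting (macro-averaging) is one common way to reduce them to a single figure per model; the abstract does not state which averaging scheme was used, so that choice is left open here.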
Copyright © 2026