Media Statistika
Vol 17, No 1 (2024): Media Statistika

MULTICLASS CLASSIFICATION OF MARKETPLACE PRODUCTS WITH MACHINE LEARNING

Aditama, Farhan Satria
Krismawati, Dewi
Pramana, Setia



Article Info

Publish Date
14 Oct 2024

Abstract

Using marketplace data and machine learning to collect commodity data gives Statistics Indonesia an opportunity to complete the commodity directories used in various surveys. This research trains a machine learning product classification model on existing datasets to predict which KBKI category a new product falls into. The dataset contains more than 32,000 products in 26 classes, drawn from the two biggest marketplaces in Indonesia. The classification algorithms compared are Random Forests (RF), Support Vector Machines (SVM), and Multinomial Naive Bayes (MNB). Results indicate that MNB is the most effective algorithm when considering the trade-off between accuracy and processing time. MNB achieved the highest micro-average F1 scores, 91.8% for Tokopedia and 95.4% for Shopee, and had the fastest execution time, approximately 5 seconds.
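
To illustrate the kind of pipeline the abstract describes, the following is a minimal sketch of Multinomial Naive Bayes on product titles, scored with micro-averaged F1. It is not the authors' implementation: the whitespace tokenizer, the Laplace smoothing value, the toy product titles, and the three class names are all assumptions made for illustration (the study itself uses 26 KBKI classes and more than 32,000 products).

```python
import math
from collections import Counter, defaultdict


def tokenize(title):
    """Lowercase a product title and split it on whitespace (assumed preprocessing)."""
    return title.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes over word counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()
        for title, label in zip(titles, labels):
            tokens = tokenize(title)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict_one(self, title):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(title) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / self.total_docs)
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, titles):
        return [self.predict_one(t) for t in titles]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1
            elif p == c:
                fp += 1
            elif t == c:
                fn += 1
    total = 2 * tp + fp + fn
    return 2 * tp / total if total else 0.0


# Hypothetical toy data; the class names are placeholders, not KBKI codes.
train_titles = [
    "kaos polos katun pria",
    "kemeja batik pria lengan panjang",
    "charger hp fast charging",
    "kabel data usb type c",
    "kopi bubuk arabika 200g",
    "keripik singkong pedas",
]
train_labels = ["clothing", "clothing", "electronics", "electronics", "food", "food"]

test_titles = ["kaos katun wanita", "kabel charger usb", "kopi arabika"]
test_labels = ["clothing", "electronics", "food"]

model = MultinomialNB().fit(train_titles, train_labels)
preds = model.predict(test_titles)
print(preds, micro_f1(test_labels, preds))
```

When every product receives exactly one label, as in this sketch, micro-averaged F1 coincides with plain accuracy, because each misclassification counts once as a false positive and once as a false negative.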

Copyrights © 2024