PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND OFFICIAL STATISTICS
Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St

Entity Matching of Shop Accounts in Online Commerce Portals

Dina Salsabila (Politeknik Statistika STIS)
Takdir Takdir (Politeknik Statistika STIS)



Article Info

Publish Date
04 Jan 2022

Abstract

Online marketplace data are a valuable source for analysis for various purposes. During data collection, duplicate shop accounts were found, which biases the analysis. This study develops a mechanism to identify duplicate entities, i.e. store accounts, across different online marketplaces, a task commonly known as entity matching. Word similarity algorithms are the core elements of our approach. Additionally, we build an entity matching model by comparing logistic regression, naive Bayes, and random forest to find the best model for classifying store account similarities. The object of our study is the top online marketplaces in Indonesia, limited to one developing municipality, i.e. Sleman, DI Yogyakarta. The best model achieves an accuracy of 0.961, a precision of 0.963, a recall of 0.958, and an F1-score of 0.962. These results are therefore acceptable for duplicate identification.
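To illustrate how word similarity scores can feed a classifier of store account pairs, the following is a minimal sketch and not the authors' implementation. The abstract does not name the similarity algorithms used, so the three measures here (normalized Levenshtein similarity, token-level Jaccard similarity, and the `difflib` sequence ratio) are assumptions, as are the helper names `normalize` and `similarity_features` and the sample shop names.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace (a hypothetical cleaning step)."""
    return " ".join(name.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Scale the edit distance to [0, 1], where 1 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard_tokens(a: str, b: str) -> float:
    """Share of distinct whitespace-separated tokens the two names have in common."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def similarity_features(name_a: str, name_b: str) -> dict:
    """Feature vector for one candidate pair of store accounts."""
    a, b = normalize(name_a), normalize(name_b)
    return {
        "levenshtein": levenshtein_similarity(a, b),
        "jaccard": jaccard_tokens(a, b),
        "ratio": SequenceMatcher(None, a, b).ratio(),
    }


if __name__ == "__main__":
    # Hypothetical shop names from two different marketplaces.
    print(similarity_features("Toko Batik Sleman", "toko batik sleman official"))
```

Each pair of accounts yields one such feature vector. Together with a match or non-match label, these vectors could serve as training input for classifiers such as logistic regression, naive Bayes, or random forest.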

Copyrights © 2021






Journal Info

Abbrev

icdsos

Subject

Computer Science & IT

Description

International Conference on Data Science and Official Statistics (ICDSOS) 2023 is organized by Politeknik Statistika STIS and Statistics Indonesia (BPS). This international conference in collaboration with Forum Pendidikan Tinggi ...