Jurnal Mandiri IT
Vol. 14 No. 1 (2025): July: Computer Science and Field.

Hybrid clustering and supervised learning model for digital MSME segmentation

Marcelina, Dona
Terttiaavini, Terttiaavini

Article Info

Publish Date
19 Jul 2025

Abstract

Digitalization has become a key factor in enhancing the competitiveness of Micro, Small, and Medium Enterprises (MSMEs). However, its implementation still faces several challenges, including low levels of technology adoption and inaccurate data segmentation. This study develops a hybrid approach that combines clustering and supervised learning to segment MSMEs by their level of digitalization and to predict the segment of a given enterprise. Four clustering algorithms were tested: K-Means, Agglomerative, Gaussian Mixture Model, and HDBSCAN. HDBSCAN outperformed the other algorithms on all three evaluation metrics, achieving the highest Silhouette Score (0.3501), the lowest Davies-Bouldin Index (0.9557), and the highest Calinski-Harabasz Index (132.38). The segmentation produced three distinct clusters: Cluster 0 (Traditional: low digitalization, small revenue), Cluster 1 (Semi-Digital: moderate technology adoption, medium revenue), and Cluster 2 (Fully Digital: high technology adoption, large revenue). The cluster assignments were then used as labels to train six classification algorithms. Among them, XGBoost and Neural Network performed best, each reaching a prediction accuracy of 98.63%. The main contribution of this study is an analytical framework for data-driven segmentation and prediction of MSMEs, intended to make support under national digitalization strategies more precise, targeted, and adaptive.
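
The two-stage workflow described in the abstract (cluster first, then train a classifier on the cluster assignments) can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' implementation, and several parts are assumptions made for illustration: the data are synthetic, the two features (a digital adoption score and a revenue figure) are hypothetical, k-means with a deterministic farthest-first seeding stands in for the four algorithms compared in the study, only the Silhouette Score is computed, and a 3-nearest-neighbour classifier stands in for the six classifiers.

```python
import math
import random


def farthest_first_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain k-means; returns one cluster label per point."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[j])  # keep old centroid if cluster is empty
        if updated == centroids:  # assignments are stable
            break
        centroids = updated
    return labels


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points, where a is the mean
    distance to the point's own cluster and b the smallest mean distance
    to any other cluster. Singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = [lab for _, lab in nearest]
    return max(set(votes), key=votes.count)


# Synthetic, hypothetical data: (digital adoption score, revenue) for three
# groups of 30 enterprises each. These values are NOT taken from the study.
rng = random.Random(42)
centers = [(2.0, 2.0), (10.0, 10.0), (18.0, 18.0)]
points = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for cx, cy in centers for _ in range(30)]

# Stage 1: unsupervised segmentation and its quality score.
labels = kmeans(points, k=3)
score = silhouette_score(points, labels)

# Stage 2: use the cluster assignments as class labels for a supervised model.
indices = list(range(len(points)))
rng.shuffle(indices)
split = int(0.7 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]
train_x = [points[i] for i in train_idx]
train_y = [labels[i] for i in train_idx]
correct = sum(knn_predict(train_x, train_y, points[i]) == labels[i] for i in test_idx)
accuracy = correct / len(test_idx)

print(f"silhouette score: {score:.4f}")
print(f"hold-out accuracy: {accuracy:.4f}")
```

Because the classifier is trained on labels produced by the clustering step, its accuracy measures how well it reproduces the segmentation, not how well it matches an independent ground truth. The cluster numbering in this sketch is arbitrary and does not correspond to the Cluster 0, 1, 2 naming in the abstract.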

Copyrights © 2025

Journal Info

Abbrev

Mandiri

Subject

Computer Science & IT; Library & Information Science; Mathematics

Description

Jurnal Mandiri IT is intended as a publication medium for articles reporting the results of Computer Science and related ...