This study identifies clustering patterns among sub-districts in Serang District based on village participation in Micro and Small Industry (MSI) activities, using Density-Based Spatial Clustering of Applications with Noise (DBSCAN), an unsupervised machine learning algorithm. The analysis uses secondary data from Statistics Indonesia (BPS): the 2024 Village Potential data for Serang District, covering 29 sub-districts and 15 MSI sector variables. Preprocessing consisted of Min-Max normalization followed by Principal Component Analysis (PCA) to address sparsity and multicollinearity. DBSCAN parameters were tuned by simulating epsilon values (0–1) and MinPts values (1–10), and each configuration was validated with the Silhouette Score and the Davies-Bouldin Index. The optimal configuration, epsilon=0.3 and MinPts=1, produced seven clusters with no noise points and a Davies-Bouldin Index of 0.620 (lower values indicate better separation), indicating well-separated clusters. Spatial analysis revealed a meaningful cluster distribution, with comprehensive industry clusters in the central region and specialized clusters in peripheral areas. These findings provide a basis for formulating MSI development policies in Serang District and highlight the importance of preprocessing when analyzing sparse data for evidence-based decision-making.
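The workflow summarized above (Min-Max normalization, PCA, then a DBSCAN sweep scored with the Silhouette Score and the Davies-Bouldin Index) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: scikit-learn is used as the toolkit, the input is a synthetic placeholder matrix rather than the BPS data, two principal components are retained, and `sweep_dbscan` is a hypothetical helper name. The sweep starts at epsilon=0.1 because scikit-learn requires `eps` to be strictly positive. In scikit-learn, `min_samples` counts the point itself, so with MinPts=1 every point is a core point and no noise can occur, which is consistent with the noise-free result reported above.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import MinMaxScaler


def sweep_dbscan(X, eps_values, min_pts_values, n_components=2):
    """Scale to [0, 1], reduce with PCA, then score DBSCAN over a grid.

    n_components=2 is an assumption; the abstract does not state how many
    components were retained.
    """
    X_scaled = MinMaxScaler().fit_transform(X)
    X_reduced = PCA(n_components=n_components).fit_transform(X_scaled)

    results = []
    for eps in eps_values:
        for min_pts in min_pts_values:
            labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(X_reduced)
            clustered = labels != -1  # DBSCAN marks noise points with -1
            n_clusters = len(set(labels[clustered]))
            # Both indices need at least 2 clusters and fewer clusters than points.
            if n_clusters < 2 or n_clusters >= clustered.sum():
                continue
            points, assigned = X_reduced[clustered], labels[clustered]
            results.append({
                "eps": float(eps),
                "min_pts": int(min_pts),
                "n_clusters": n_clusters,
                "n_noise": int((~clustered).sum()),
                "silhouette": float(silhouette_score(points, assigned)),
                "davies_bouldin": float(davies_bouldin_score(points, assigned)),
            })
    return results


# Synthetic placeholder data (NOT the BPS data): 29 rows x 15 count columns,
# built from three made-up groups that are each active in five sectors.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], [10, 10, 9])
profiles = np.zeros((3, 15))
for g in range(3):
    profiles[g, 5 * g:5 * (g + 1)] = 20
X = profiles[groups] + rng.poisson(1.0, size=(29, 15))

results = sweep_dbscan(X, np.linspace(0.1, 1.0, 10), range(1, 11))

# Lower Davies-Bouldin means better-separated clusters.
best = min(results, key=lambda r: r["davies_bouldin"])
print(best)
```

Ranking by the Davies-Bouldin Index alone is one possible selection rule; a fuller analysis would weigh it against the Silhouette Score and the number of noise points before settling on a configuration.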
Copyright © 2025