Journal : Engineering, Mathematics and Computer Science Journal (EMACS)

Integrating Geospatial Big Data and Machine Learning for Village Level Rural Urban Classification: Evidence From Toba Regency
Br. Saragih, Meilani Thereza; Hartojo, Nurlatifah
Engineering, Mathematics and Computer Science Journal (EMACS) Vol. 8 No. 1 (2026): EMACS (In Press)
Publisher : Bina Nusantara University

DOI: 10.21512/emacsjournal.v8i1.14159

Abstract

This study develops a data-driven framework for classifying villages in Toba Regency as rural or urban by integrating official statistical data, geospatial big data, and machine learning techniques. The current regional classification still relies on the 2020 baseline and may not adequately reflect recent socio-spatial transformations at finer administrative levels. To address this limitation, the study combines Village Potential Statistics (PODES) data with spatial indicators derived from big data sources, including population density from WorldPop and the Built-Up Index (BUI) extracted from satellite imagery. Together, these datasets give a more comprehensive representation of settlement patterns, spatial development intensity, and demographic distribution across villages. Three supervised machine learning algorithms were implemented in this study: Support Vector Machine (SVM), Naïve Bayes, and Random Forest. The models were evaluated using accuracy, precision, recall, and F1-score, and Random Forest performed best. Using this best model, 156 of the 244 villages analyzed were classified as rural and 88 as urban, a change in status for 47 villages compared to the previous classification. These findings indicate that integrating official statistical data with big data and machine learning methods can capture the dynamics of regional development more adaptively, and that the approach could complement existing methods for compiling regional classifications and formulating more targeted development policies.
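
The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `binary_metrics`, the choice of "urban" as the positive class, and the six example labels are all assumptions made for illustration, and the numbers are not results from the study.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "urban"
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for a two-class problem.

    `positive` marks which label counts as the positive class; treating
    "urban" as positive is an assumption, not something the abstract states.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for six hypothetical villages (not data from the study).
reference = ["urban", "urban", "rural", "rural", "rural", "urban"]
predicted = ["urban", "rural", "rural", "rural", "urban", "urban"]
scores = binary_metrics(reference, predicted)
```

Computing the same dictionary for each candidate model on a shared held-out set would give a like-for-like comparison of the kind the abstract describes for SVM, Naïve Bayes, and Random Forest.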