This study aims to improve land use classification accuracy by integrating zonal statistics with supervised machine learning on Sentinel-2 imagery. Two classification models were developed: Model A, trained on single-pixel values, and Model B, trained on zonal statistics aggregated over polygons from shapefile data. Two algorithms, Random Forest and Classification and Regression Trees (CART), were implemented and evaluated with 5-fold cross-validation. Model B consistently outperformed Model A; the best result came from Random Forest with Model B, which reached an overall accuracy of 73.74% and a kappa coefficient of 0.5999. Class-wise evaluation based on F1-score showed strong performance for dominant classes such as settlement, water bodies, and rice fields, while underrepresented classes such as cropland and shrubland were harder to classify because of class imbalance. These findings indicate that zonal statistics yield more representative training features and improve the stability and accuracy of land use classification models.
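The two ideas at the core of the abstract, aggregating pixel values per polygon and scoring a classifier with overall accuracy and kappa, can be illustrated with a minimal sketch. It is not the study's implementation: the function names `zonal_features` and `accuracy_and_kappa`, the choice of mean and standard deviation as the zonal statistics, and the input layout are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zonal_features(pixels_by_zone):
    """Aggregate per-pixel band values into one feature row per zone.

    Hypothetical input layout: {zone_id: [(band1, band2, ...), ...]}.
    Returns {zone_id: [mean_b1, std_b1, mean_b2, std_b2, ...]}.
    Mean and population standard deviation are assumed statistics;
    the study may use a different set.
    """
    features = {}
    for zone_id, pixels in pixels_by_zone.items():
        row = []
        for band in zip(*pixels):  # regroup the pixel tuples band by band
            row.extend([mean(band), pstdev(band)])
        features[zone_id] = row
    return features


def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

In this sketch a Model A style sample would be a single pixel tuple, while a Model B style sample is one row of `zonal_features`, so each polygon contributes one aggregated feature vector rather than many correlated pixels.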
Copyrights © 2026