PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND OFFICIAL STATISTICS
Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St

From Noisy Data to Insight: SOM Filtering Implementation For Improving the Machine Learning Model

Firmansyah, Achmad (Unknown)



Article Info

Publish Date
22 Dec 2025

Abstract

Filtering representative training data from big data is a critical step in developing machine learning models, particularly for official statistics. This study demonstrates the application of Self-Organizing Map (SOM) filtering to enhance training data quality in remote sensing-based classification of paddy phenological stages from satellite data. By clustering the data, SOM identifies and retains representative samples, thereby removing noisy and irrelevant ones. After filtering, models developed under several purity threshold schemes are compared with a model developed on the unfiltered dataset. Findings reveal that increasing the purity threshold, that is, filtering more strictly, consistently improves classification performance and accuracy. The results demonstrate SOM filtering as an effective strategy for improving the representativeness and reliability of training datasets in remote sensing applications, while emphasizing the trade-offs involved in optimizing machine learning model robustness and generalizability.
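
The abstract does not spell out the filtering procedure, so the following is only a minimal sketch of one common way purity-based SOM filtering can be set up, not the author's implementation. It assumes that a node's purity is the share of its samples carrying the node's majority label, and that only majority-label samples in nodes meeting the threshold are kept. The function names (`train_som`, `best_matching_unit`, `purity_filter`, `make_toy_data`), the grid size, the decay schedule, and the synthetic two-class data are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns weights keyed by (row, col)."""
    rng = random.Random(seed)
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise each node from a randomly chosen sample (copied, not aliased).
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighbourhood, floored
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def purity_filter(data, labels, weights, threshold):
    """Return sorted indices of samples kept under the assumed purity rule."""
    members = {}
    for i, x in enumerate(data):
        members.setdefault(best_matching_unit(weights, x), []).append(i)
    kept = []
    for idxs in members.values():
        majority, count = Counter(labels[i] for i in idxs).most_common(1)[0]
        # Assumed rule: keep a node's majority-label samples only if the
        # node's purity (majority share) reaches the threshold.
        if count / len(idxs) >= threshold:
            kept.extend(i for i in idxs if labels[i] == majority)
    return sorted(kept)


def make_toy_data(n_per_class=60, noise_rate=0.1, seed=0):
    """Two synthetic 2-D classes with some flipped labels (illustration only)."""
    rng = random.Random(seed)
    data, labels = [], []
    for label, centre in enumerate([(0.2, 0.2), (0.8, 0.8)]):
        for _ in range(n_per_class):
            data.append([rng.gauss(centre[0], 0.08), rng.gauss(centre[1], 0.08)])
            # Flip a fraction of labels to mimic noisy reference data.
            labels.append(1 - label if rng.random() < noise_rate else label)
    return data, labels


if __name__ == "__main__":
    data, labels = make_toy_data()
    som = train_som(data, rows=4, cols=4)
    for t in (0.0, 0.6, 0.8, 1.0):
        kept = purity_filter(data, labels, som, t)
        print(f"threshold={t:.1f}: kept {len(kept)} of {len(data)} samples")
```

Under this rule, the samples kept at a stricter threshold are always a subset of those kept at a looser one, so raising the threshold trades training-set size for label consistency. That is one way to read the trade-off the abstract mentions; the thresholds and data used in the study itself are not given here.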

Copyrights © 2025






Journal Info

Abbrev

icdsos

Subject

Computer Science & IT

Description

International Conference on Data Science and Official Statistics (ICDSOS) 2023 is organized by Politeknik Statistika STIS and Statistics Indonesia (BPS). This international conference in collaboration with Forum Pendidikan Tinggi ...