Firmansyah, Achmad
Unknown Affiliation

Journal : PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND OFFICIAL STATISTICS

From Noisy Data to Insight: SOM Filtering Implementation For Improving the Machine Learning Model
Firmansyah, Achmad
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2025i1.614

Abstract

Filtering representative training data from Big Data is a critical step in developing machine learning models, particularly for official statistics. This study demonstrates the application of Self-Organizing Map (SOM) filtering to enhance training data quality in remote sensing-based classification of paddy phenological stages using satellite data. By clustering the data, SOM identifies and retains representative samples, which in turn removes noisy and irrelevant ones. Following the filtering, models developed under several purity-threshold schemes are compared with a model developed on the unfiltered dataset. Findings reveal that increasing the purity threshold consistently improves classification performance, including accuracy, as filtering becomes stricter. The results demonstrate that SOM filtering is an effective strategy for improving the representativeness and reliability of training datasets in remote sensing applications, while emphasizing the trade-offs involved in optimizing machine learning model robustness and generalizability.
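
The filtering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: the function names (`train_som`, `best_matching_units`, `purity_filter`), the grid size, the learning schedule, the definition of node purity as the share of the majority class among the samples mapped to a node, and the choice to keep only majority-class samples from nodes that meet the threshold. It is not the author's implementation.

```python
import numpy as np

def train_som(X, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM online; returns weights of shape (rows*cols, n_features).

    Hypothetical helper: hyperparameters are illustrative defaults, not taken
    from the paper.
    """
    rng = np.random.default_rng(seed)
    # Grid coordinates of each node, used by the Gaussian neighbourhood.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise node weights from randomly chosen training samples.
    W = X[rng.choice(len(X), size=rows * cols, replace=True)].astype(float)
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-2    # shrinking neighbourhood radius
            bmu = np.argmin(((W - X[i]) ** 2).sum(axis=1))
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            W += lr * h[:, None] * (X[i] - W)       # pull nodes toward the sample
            step += 1
    return W

def best_matching_units(X, W):
    """Index of the closest SOM node for every sample."""
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def purity_filter(bmus, y, threshold):
    """Boolean mask of samples to keep.

    Assumed rule: a node's purity is the share of its majority class; samples
    are kept only if their node's purity >= threshold AND they carry that
    node's majority label.
    """
    keep = np.zeros(len(y), dtype=bool)
    for node in np.unique(bmus):
        members = np.flatnonzero(bmus == node)
        labels, counts = np.unique(y[members], return_counts=True)
        purity = counts.max() / counts.sum()
        if purity >= threshold:
            majority = labels[counts.argmax()]
            keep[members] = y[members] == majority
    return keep
```

Under these assumptions, a comparison like the one in the abstract would call `purity_filter` with several thresholds on the same `bmus`, train a classifier on each `X[mask], y[mask]` subset, and evaluate it against a model trained on the unfiltered data. A higher threshold keeps fewer samples, which is where the trade-off between cleaner training data and reduced coverage comes from.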