Advanced Sensor Data Analysis using Big Data-Enhanced Algorithms
Hadeed, Wael
Sistemasi: Jurnal Sistem Informasi, Vol 15, No 4 (2026)
Publisher: Program Studi Sistem Informasi Fakultas Teknik dan Ilmu Komputer

DOI: 10.32520/stmsi.v15i4.6254

Abstract

Traditional IoT anomaly detection systems struggle with growing dimensionality, the constraints of processing big data, and non-interpretable feature extraction. This article describes a complete pipeline that integrates Apache Spark data preparation, PCA for dimensionality reduction (from 744 to 12 components that retain 92.7% of the variance), and CatBoost gradient boosting for classification. A thorough benchmark of six algorithms on the Intel Berkeley Research Lab dataset (n = 30,221 instances) shows CatBoost to be the best method, with F1-score = 0.97, precision = 0.97, and accuracy = 98.7%, a 3-8% margin over XGBoost, LightGBM, Random Forest, and SVM. Temperature changes (PC1: factor 0.37) and humidity variations (PC2: 0.29) emerged as the major indicators of anomalies. Training finished in 45.2 seconds and prediction took under 35 seconds per batch on consumer Intel i7/16 GB hardware, which demonstrates computational feasibility and supports production-level use in environmental monitoring and industrial IoT applications.
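The dimensionality-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes standardized features, uses plain NumPy SVD rather than Spark, and runs on synthetic data instead of the Intel Berkeley readings. The function name `pca_reduce` and the `variance_target` parameter are hypothetical. The sketch keeps the smallest number of principal components whose cumulative explained variance reaches a target (0.927 here, mirroring the retained variance reported above); the CatBoost classification stage is left out.

```python
import numpy as np


def pca_reduce(X, variance_target=0.927):
    """Project X onto the fewest principal components whose cumulative
    explained variance reaches variance_target (hypothetical helper)."""
    X = np.asarray(X, dtype=float)

    # Standardize each feature; constant columns are left at zero.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - mean) / std

    # SVD of the standardized data: rows of Vt are the principal axes,
    # and squared singular values are proportional to explained variance.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    cumulative = np.cumsum(explained)

    # Smallest k with cumulative explained variance >= target.
    k = int(np.searchsorted(cumulative, variance_target)) + 1
    k = min(k, len(S))

    components = Vt[:k]
    scores = Z @ components.T
    return scores, components, explained[:k]


# Synthetic stand-in for sensor features (assumption: NOT the paper's data).
# Twenty correlated features are generated from three latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 20))

scores, components, explained = pca_reduce(X)
print("reduced shape:", scores.shape)
print("variance retained:", explained.sum())
```

The rows of `components` hold the per-feature loadings of each retained component, which is the kind of quantity the abstract refers to when it attributes PC1 to temperature and PC2 to humidity. The reduced `scores` matrix would then be the input to a classifier such as CatBoost.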