Journal : Jurnal Teknologi Sistem Informasi dan Aplikasi

Optimasi SVM dengan RFE dan ROS untuk Mengatasi High Dimension dan Imbalanced Data Banjir [SVM Optimization with RFE and ROS to Address High-Dimensional and Imbalanced Flood Data]
Pambudi, Faldy Alfareza; Siswa, Taghfirul Azhima Yoga; Pranoto, Wawan Joko
Jurnal Teknologi Sistem Informasi dan Aplikasi Vol. 7 No. 3 (2024): Jurnal Teknologi Sistem Informasi dan Aplikasi
Publisher : Program Studi Teknik Informatika Universitas Pamulang

DOI: 10.32493/jtsi.v7i3.41068

Abstract

Floods are a frequent natural disaster in Indonesia, and the city of Samarinda experienced a significant increase in flood cases in 2018-2021. Machine learning, in particular the Support Vector Machine (SVM) algorithm, can be used to predict future flood events accurately, but two main problems arise: class imbalance and high-dimensional data. This research combines SVM with Random Oversampling (ROS) and Recursive Feature Elimination (RFE) feature selection to address both problems, with the aim of increasing the classification accuracy of Samarinda City flood data. Validation uses 10-fold cross-validation, and model performance is evaluated with a confusion matrix from which accuracy is calculated. The data were obtained from BPBD and BMKG Samarinda City for the 2021-2023 period and consist of 11 attributes and 1095 records. The results show that RFE identified the five most important features: minimum temperature (Tn), maximum temperature (Tx), average temperature (Tavg), humidity (RH_avg), and maximum wind direction (ddd_x). With the combination of SVM, ROS, and RFE, flood data classification accuracy increased by 0.78 percentage points, from 97.14% to 97.92%.
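As an illustration of the Random Oversampling step and the confusion-matrix accuracy mentioned in the abstract, here is a minimal standard-library sketch. It is not the authors' implementation: the function names `random_oversample` and `accuracy`, the toy rows, and the seed are all assumptions made for this example. In practice, oversampling is usually applied only to the training portion of each cross-validation fold so that duplicated rows do not leak into the test fold; whether the paper does so is not stated in the abstract.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy(tp, tn, fp, fn):
    """Accuracy from the four cells of a binary confusion matrix."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up toy data: 6 "no flood" rows (label 0) and 2 "flood" rows (label 1)
rows = [[24.1, 31.0], [23.8, 30.2], [24.5, 32.1], [23.9, 31.5],
        [24.0, 30.9], [24.2, 31.8], [23.1, 28.4], [22.9, 27.9]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

After the call, both classes have the same number of rows, and every added row is an exact copy of an existing minority-class row, which is what distinguishes random oversampling from synthetic methods such as SMOTE.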

Model Optimasi SVM-GSBE dalam Menangani High Dimensional Data Stunting Kota Samarinda [SVM-GSBE Optimization Model for Handling High-Dimensional Stunting Data in Samarinda City]
Siti Muawwanah; Taghfirul Azhima Yoga Siswa; Wawan Joko Pranoto
Jurnal Teknologi Sistem Informasi dan Aplikasi Vol. 7 No. 3 (2024): Jurnal Teknologi Sistem Informasi dan Aplikasi
Publisher : Program Studi Teknik Informatika Universitas Pamulang

DOI: 10.32493/jtsi.v7i3.41545

Abstract

Stunting has become a widely discussed health issue in Indonesia, particularly in Samarinda City, which recorded a prevalence of 12.7% in 2023, the highest in East Kalimantan Province. Data mining techniques are crucial for overcoming the challenges of high-dimensional data, such as computational complexity, the risk of overfitting, and visualization difficulties. This study aims to improve the accuracy of a Support Vector Machine model optimized with Grid Search and Backward Elimination feature selection (SVM-GSBE) on high-dimensional stunting data from Samarinda City. The dataset comes from the Samarinda City Health Office for 2023 and covers 26 community health centers, with 21 attributes and a total of 150,466 records. The methodology comprises data collection, pre-processing, data partitioning using K-Fold Cross Validation, feature selection using Backward Elimination, and SVM model optimization with Grid Search. Features such as BB/U, ZS TB/U, ZS BB/U, ZS BB/TB, Height, and LiLA were found to increase accuracy in stunting data classification. Evaluation results show that Grid Search increased accuracy for the Linear kernel from 99.59% to 99.78%, Polynomial from 90.92% to 99.40%, RBF from 89.80% to 98.36%, and Sigmoid from 75.29% to 86.84%. This indicates that the SVM-GSBE model can be used effectively as a tool for early detection of stunting and to support health policies in Samarinda City.
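The Grid Search step described above can be sketched in a few lines of standard-library Python. This is only an illustration of the general technique (score every combination of hyperparameter values and keep the best), not the study's code. The kernel names match the four kernels listed in the abstract, but the `C` values, the helper `grid_search`, and the scoring function `toy_score` are hypothetical; `toy_score` is a stand-in for a mean cross-validated accuracy and does not reproduce any reported result.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate score_fn on every combination in param_grid and return
    the best-scoring parameter dict together with its score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid: kernels from the abstract, C values assumed
param_grid = {
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
    "C": [0.1, 1, 10],
}


def toy_score(params):
    """Made-up score; a real run would train and cross-validate an SVM."""
    kernel_bonus = {"linear": 0.3, "poly": 0.2, "rbf": 0.1, "sigmoid": 0.0}
    return kernel_bonus[params["kernel"]] - abs(params["C"] - 1) * 0.01


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

In a real pipeline, `score_fn` would fit the model on the training folds and return the mean validation accuracy, so the cost of the search grows with the product of the grid sizes times the number of folds.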

Model Optimasi KNN-PSORF dalam Menangani High Dimensional Data Banjir Kota Samarinda [KNN-PSORF Optimization Model for Handling High-Dimensional Flood Data in Samarinda City]
Restu, Anggiq Karisma Aji; Siswa, Taghfirul Azhima Yoga; Pranoto, Wawan Joko
Jurnal Teknologi Sistem Informasi dan Aplikasi Vol. 7 No. 3 (2024): Jurnal Teknologi Sistem Informasi dan Aplikasi
Publisher : Program Studi Teknik Informatika Universitas Pamulang

DOI: 10.32493/jtsi.v7i3.41587

Abstract

Floods are a natural phenomenon that frequently occurs in Indonesia, including in Samarinda City, which has faced flooding over the past three years, affecting thousands of homes and around 27,000 residents. Predicting flood disasters requires machine learning with data mining classification methods. However, classification often encounters two issues: high-dimensional data, which can lead to overfitting, and class imbalance, which biases the model toward dominant classes while neglecting minority classes. This research aims to improve classification accuracy on Samarinda City's flood data using the K-Nearest Neighbor (KNN) algorithm combined with Relief feature selection and Particle Swarm Optimization (PSO). The validation method is 10-fold cross-validation, with performance evaluated using a confusion matrix. The data, sourced from Samarinda City's Disaster Management Agency (BPBD) and the Meteorology, Climatology, and Geophysics Agency (BMKG), span 2021 to 2023 and comprise 19 features and a total of 1095 records. Relief feature selection identified four crucial features: maximum wind direction, wind speed, average wind speed, and maximum wind speed direction. Evaluations averaged over k values of 3, 5, 7, 11, 13, and 15 show that Relief feature selection and PSO improve the accuracy of KNN on flood data: KNN with PSO yields improvements of 2-5%, Relief feature selection alone improves accuracy by 1-2%, and combining Relief with PSO provides a 2-5% improvement. The combined KNN, Relief, and PSO model is expected to deliver optimal performance in classifying Samarinda City's flood data.
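For readers unfamiliar with the base classifier, the following is a minimal standard-library sketch of K-Nearest Neighbor prediction by majority vote over Euclidean distance. It illustrates only plain KNN; Relief feature selection and PSO are not shown, and the function name `knn_predict` and the toy points are assumptions made for this example, not the paper's data. With two classes, an odd k (as in the k values listed above) avoids tied votes.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest
    to query, using Euclidean distance (hypothetical helper)."""
    distances = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up two-feature toy data: label 0 clusters near (1, 1),
# label 1 clusters near (5, 5)
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
print(knn_predict(train_X, train_y, (5.1, 5.0), k=3))
```

Because every feature contributes to the distance, irrelevant or noisy features can dominate the vote, which is one reason feature selection tends to matter for KNN on high-dimensional data.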
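
Model Optimasi Random Forest dengan PSO-CHI-SM dalam Mengatasi High Dimensional dan Imbalanced Data Banjir Kota Samarinda [Random Forest Optimization Model with PSO-CHI-SM for Addressing High-Dimensional and Imbalanced Flood Data in Samarinda City]
Taufiq, Ilham; Siswa, Taghfirul Azhima Yoga; Pranoto, Wawan Joko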
Jurnal Teknologi Sistem Informasi dan Aplikasi Vol. 7 No. 3 (2024): Jurnal Teknologi Sistem Informasi dan Aplikasi
Publisher : Program Studi Teknik Informatika Universitas Pamulang

DOI: 10.32493/jtsi.v7i3.41632

Abstract

Flooding is a natural disaster that frequently affects Indonesia. Samarinda City, in particular, continues to experience frequent flooding, with 18 incidents in 2018, 33 incidents in 2020, and 32 incidents in 2021. Predicting flood disasters calls for machine learning to analyze and classify flood events. However, classification often encounters issues with high-dimensional data and class imbalance. This study aims to determine how much the accuracy of flood disaster classification improves when the Random Forest algorithm is combined with PSO for optimization, Chi-Square feature selection, and SMOTE oversampling to balance classes. The data comprise flood records from 2021-2023 obtained from BMKG and BPBD Samarinda City, with a total of 1095 records and 11 attributes. The validation technique is 5-fold cross-validation, and evaluation uses a confusion matrix. Chi-Square feature selection identified Rainfall, Maximum Wind Direction, Most Frequent Wind Direction, Humidity, Sunshine Duration, and Wind Speed as the most influential features based on Chi-Square scores and p-values. The average accuracy of the proposed classification model under 5-fold cross-validation reached 96.02%.
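To make the Chi-Square scoring behind this kind of feature selection concrete, here is a small standard-library sketch that computes the Pearson chi-square statistic for a contingency table of a discretized feature against the class label: the sum over cells of (observed - expected)^2 / expected. It is an assumption-laden illustration, not the study's procedure: the function name `chi_square` and the counts are made up, and the sketch assumes every row and column of the table has at least one observation.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a
    list of rows of observed counts (hypothetical helper)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if feature and class were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Made-up counts: rows = feature level (low, high),
# columns = class (no flood, flood)
associated = [[30, 10], [10, 30]]
independent = [[20, 20], [20, 20]]

print(chi_square(associated))
print(chi_square(independent))
```

A larger statistic means the feature's distribution differs more between classes, so features can be ranked by this score; a table whose cells match the independence expectation scores zero.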