Journal : Building of Informatics, Technology and Science

Pengelompokkan Pola Perubahan Cuaca Menggunakan Metode K-Medoids dan Gap Statistic (Clustering Weather Change Patterns Using the K-Medoids Method and Gap Statistic)
Julianthy, Denissya; Hadiana, Asep Id; Ramadhan, Edvin
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7824

Abstract

Clustering daily weather patterns helps in understanding complex weather variations. However, commonly used methods such as K-Means are sensitive to outliers and require the number of clusters to be chosen manually. This study combines the K-Medoids method with the Gap Statistic to produce more stable and accurate clusters. Daily weather data for Semarang from 2017 to 2023 were cleaned, standardized with StandardScaler, and reduced in dimensionality using PCA. The Gap Statistic indicates that the optimal number of clusters is three: rainy, sunny, and cloudy. The clustering evaluation yielded a Silhouette Score of 0.3793, a Calinski-Harabasz Index of 1490.5604, and a Davies-Bouldin Index of 0.9031. These results indicate a fairly good cluster structure, although there is still room for improvement, especially in the separation between clusters.
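
The combination described in this abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names (`standardize`, `k_medoids`, `gap_statistic`), the alternating medoid update, the uniform bounding-box reference data, and the use of the summed point-to-medoid distance as the dispersion measure are all assumptions made for illustration, and the PCA step is omitted.

```python
import math
import random


def standardize(points):
    """Z-score each column (population std), as a standard scaler would."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(p, means, stds)) for p in points]


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]


def k_medoids(points, k, rng, max_iter=50):
    """Alternating K-Medoids: assign points, then re-pick each cluster's medoid."""
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the rest.
            new_medoids.append(min(
                members, key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    labels = assign(points, medoids)
    cost = sum(dist(p, points[medoids[lab]]) for p, lab in zip(points, labels))
    return medoids, labels, cost


def gap_statistic(points, k_max, rng, n_refs=5, n_init=3):
    """Return (chosen k, list of Gap(k) for k = 1..k_max).

    Assumes the dispersion is strictly positive (fewer clusters than
    distinct points), since its logarithm is taken.
    """
    cols = list(zip(*points))
    lows, highs = [min(c) for c in cols], [max(c) for c in cols]

    def log_dispersion(data, k):
        # Best of a few random restarts, to soften poor initialisations.
        return math.log(min(k_medoids(data, k, rng)[2] for _ in range(n_init)))

    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref_logs = []
        for _ in range(n_refs):
            # Reference data: uniform samples over the bounding box of the input.
            ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                   for _ in points]
            ref_logs.append(log_dispersion(ref, k))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_dispersion(points, k))
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Choose the smallest k with Gap(k) >= Gap(k+1) - s(k+1).
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps
```

Calling `gap_statistic(standardize(rows), k_max, random.Random(seed))` on a list of numeric tuples returns the suggested number of clusters together with the gap values. Because medoids are actual data points rather than means, a single extreme observation pulls a cluster centre less than it would in K-Means, which is the robustness property the abstract refers to.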

Klasifikasi Churn Dengan Algoritma Xgboost Menggunakan Feature Selection Boruta-Shap (Churn Classification with the XGBoost Algorithm Using Boruta-SHAP Feature Selection)
Hadi Sakaro, Dwi Wahyu Kuncoro; Shabrina, Puspita Nurul; Ramadhan, Edvin
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.7965

Abstract

Customer churn is a critical issue for telecommunications companies, as it directly impacts revenue and business sustainability. This study proposes the development of a churn prediction model using the Extreme Gradient Boosting (XGBoost) algorithm combined with the Boruta feature selection method and SHAP (SHapley Additive exPlanations)-based feature interpretation. The dataset used is the Telco Customer Churn dataset from Kaggle, consisting of 7,043 customer records and 21 features. The research stages include data preprocessing, data transformation, an 80:20 train-test split, data balancing using SMOTE, feature selection with Boruta, feature interpretation with SHAP, and classification using XGBoost. The model’s performance was evaluated using accuracy, precision, recall, and F1-score metrics. Results show that the XGBoost model with Boruta-SHAP (Model B) achieved an accuracy of 0.7576, slightly higher than the model without feature selection (Model A), which achieved 0.7512. Model B also demonstrated improved performance for the majority class (non-churn), with recall increasing from 0.76 to 0.79 and F1-score from 0.82 to 0.83. However, for the minority class (churn), recall decreased from 0.72 to 0.66, although precision increased from 0.52 to 0.54. These findings indicate that integrating Boruta-SHAP can enhance model efficiency and interpretability, but additional strategies are required to maintain performance for the minority class.
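
The abstract reports precision and recall for the churn class but not its F1-score. Since F1 is the harmonic mean of precision and recall, an approximate value can be derived from the reported figures. The sketch below is illustrative only: the function name `f1_score` is an assumption, and because the inputs are rounded to two decimals, the results are estimates rather than values reported by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Churn-class precision and recall as reported in the abstract.
churn_f1_model_a = f1_score(0.52, 0.72)  # without feature selection
churn_f1_model_b = f1_score(0.54, 0.66)  # with Boruta-SHAP

print(f"Model A churn F1 (approx.): {churn_f1_model_a:.2f}")  # 0.60
print(f"Model B churn F1 (approx.): {churn_f1_model_b:.2f}")  # 0.59
```

Under these rounded inputs, the small gain in churn precision does not offset the drop in churn recall, which is consistent with the abstract's conclusion that the minority class needs additional handling.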