Claim Missing Document
Check
Articles

Found 2 Documents
Search
Journal : Journal of Computing Theories and Applications

Dataset Analysis and Feature Characteristics to Predict Rice Production based on eXtreme Gradient Boosting Wijayanti, Ella Budi; Setiadi, De Rosal Ignatius Moses; Setyoko, Bimo Haryo
Journal of Computing Theories and Applications Vol. 1 No. 3 (2024): JCTA 1(3) 2024
Publisher : Universitas Dian Nuswantoro

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.62411/jcta.10057

Abstract

Rice plays a vital role as the main food source for almost half of the global population, contributing more than 21% of the total calories humans need. Production predictions are important for determining import-export policies. This research proposes the XGBoost method to predict rice harvests globally using FAO and World Bank datasets. Feature analysis, removal of duplicate data, and parameter tuning were carried out to support the performance of the XGBoost method. The results showed excellent performance based on which reached 0.99. Evaluation of model performance using metrics such as MSE, and MAE measured by k-fold validation show that XGBoost has a high ability to predict crop yields accurately compared to other regression methods such as Random Forest (RF), Gradient Boost (GB), Bagging Regressor (BR) and K-Nearest Neighbor (KNN). Apart from that, an ablation study was also carried out by comparing the performance of each model with various features and state-of-the-art. The results prove the superiority of the proposed XGBoost method. Where results are consistent, and performance is better, this model can effectively support agricultural sustainability, especially rice production.
Integrating Hybrid Statistical and Unsupervised LSTM-Guided Feature Extraction for Breast Cancer Detection Setiadi, De Rosal Ignatius Moses; Ojugo, Arnold Adimabua; Pribadi, Octara; Kartikadarma , Etika; Setyoko, Bimo Haryo; Widiono, Suyud; Robet, Robet; Aghaunor, Tabitha Chukwudi; Ugbotu, Eferhire Valentine
Journal of Computing Theories and Applications Vol. 2 No. 4 (2025): JCTA 2(4) 2025
Publisher : Universitas Dian Nuswantoro

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.62411/jcta.12698

Abstract

Breast cancer is the most prevalent cancer among women worldwide, requiring early and accurate diagnosis to reduce mortality. This study proposes a hybrid classification pipeline that integrates Hybrid Statistical Feature Selection (HSFS) with unsupervised LSTM-guided feature extraction for breast cancer detection using the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. Initially, 20 features were selected using HSFS based on Mutual Information, Chi-square, and Pearson Correlation. To address class imbalance, the training set was balanced using the Synthetic Minority Over-sampling Technique (SMOTE). Subsequently, an LSTM encoder extracted non-linear latent features from the selected features. A fusion strategy was applied by concatenating the statistical and latent features, followed by re-selection of the top 30 features. The final classification was performed using a Support Vector Machine (SVM) with RBF kernel and evaluated using 5-fold cross-validation and a held-out test set. Experimental results showed that the proposed method achieved an average training accuracy of 98.13%, F1-score of 98.13%, and AUC-ROC of 99.55%. On the held-out test set, the model reached an accuracy of 99.30%, precision of 100%, and F1-score of 99.05%, with an AUC-ROC of 0.9973. The proposed pipeline demonstrates improved generalization and interpretability compared to existing methods such as LightGBM-PSO, DHH-GRU, and ensemble deep networks. These results highlight the effectiveness of combining statistical selection and LSTM-based latent feature encoding in a balanced classification framework.