Journal : Scientific Journal of Informatics

Performance Evaluation of Cheng & Church (CC) and Spectral Biclustering Algorithms under Collinearity and Overlap Conditions
Hafsah, Siti; Indahwati, Indahwati; Wijayanto, Hari
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.26413

Abstract

Purpose: This study addresses methodological challenges in evaluating biclustering algorithms under simultaneous collinearity and overlap, which often co-occur in real-world multivariate data but are rarely analyzed together. It highlights the importance of understanding how these structural challenges affect local pattern detection in data mining applications. Methods: A simulation study was conducted using synthetic matrices embedded with two constant biclusters under 15 combinations of collinearity levels (ρ = 0.3, 0.6, 0.9) and overlap degrees (none, small, large). Each scenario was replicated 100 times. Performance was assessed using the Liu and Wang Index (ILW), and a three-way ANOVA tested the effects of algorithm type, collinearity, and overlap. Result: Spectral Biclustering maintained stable ILW scores despite increasing collinearity, while CC performed better in low-overlap scenarios but was more sensitive to collinearity. Under high collinearity and large overlap, both algorithms degraded notably. The ANOVA confirmed that all main effects and interactions were significant (p < 0.001). Novelty: This study contributes empirical evidence on how interacting structural characteristics influence biclustering performance. The results offer practical guidance for selecting suitable algorithms and point to the potential advantages of hybrid approaches that combine the stability of spectral methods with the adaptability of residual-based techniques.
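The "residual-based" family mentioned in the abstract above can be illustrated with the mean squared residue that the Cheng & Church algorithm is built around. The sketch below is not taken from the paper: the function name and the toy blocks are illustrative assumptions, and it shows only the score, not the full node-deletion search.

```python
def mean_squared_residue(block):
    """Mean squared residue H(I, J) of a submatrix given as a list of rows.

    Each cell's residue is a_ij - rowmean_i - colmean_j + overallmean,
    so perfectly constant or additive blocks score 0.
    """
    n_rows = len(block)
    n_cols = len(block[0])
    row_means = [sum(row) / n_cols for row in block]
    col_means = [
        sum(block[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)
    ]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = block[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


# Hypothetical toy blocks (not data from the study).
constant_block = [[5.0] * 4 for _ in range(3)]           # constant bicluster
additive_block = [[float(r + c) for c in range(4)] for r in range(3)]
checker_block = [[1.0, 0.0], [0.0, 1.0]]                 # no coherent pattern

if __name__ == "__main__":
    for name, blk in [("constant", constant_block),
                      ("additive", additive_block),
                      ("checker", checker_block)]:
        print(name, mean_squared_residue(blk))
```

A constant bicluster, the type embedded in the simulation, has residue 0. Overlap and correlated columns can make neighbouring cells look similarly coherent, which is one plausible reason a residue-driven search becomes harder under the conditions studied.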
Performance Analysis of Machine Learning Models using RFE Feature Selection and Bayesian Optimization in Imbalanced Data Classification with Shap-Based Explanations
Aqmar, Nurzatil; Wijayanto, Hari; Mochamad Afendi, Farit
Scientific Journal of Informatics Vol. 12 No. 3: August 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i3.31459

Abstract

Purpose: This research evaluates the performance of Random Forest (RF) and Light Gradient Boosting Machine (LightGBM) models integrated with Recursive Feature Elimination (RFE) for feature selection, Bayesian Optimization (BO) for hyperparameter tuning, and three imbalanced data handling techniques: Random Undersampling (RUS), Random Oversampling (ROS), and SMOTENC. It also identifies key determinants of household food insecurity in Papua, using SHAP for transparent feature interpretation. Methods: The research used 2022 SUSENAS data from Papua Province. Preprocessing involved exploring data composition and variable characteristics and aggregating individual data into household data. Data were split using random sampling (80% training, 20% testing). Eighteen experimental scenarios were created by combining feature selection or no feature selection, three imbalance handling methods, and default or tuned hyperparameters. RF and LightGBM were evaluated over 50 iterations using accuracy, sensitivity, specificity, and G-Mean, with SHAP applied to the best-performing models for interpretability. Result: LightGBM achieved the highest accuracy and stability, particularly when combined with SMOTENC and RFE+BO. RF was better at maintaining G-Mean when paired with RUS, with the highest G-Mean (0.756) obtained by RF + BO + RUS. A three-way ANOVA showed that model type, imbalance handling, feature selection, and their interaction significantly affected the G-Mean value. SHAP analysis showed that health, financial, and educational limitations can increase the risk of food insecurity. Novelty: This research integrates feature selection, hyperparameter tuning, and imbalanced data handling within an interpretable machine learning framework, providing a robust solution for food vulnerability classification on imbalanced datasets.
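To make the G-Mean metric reported above concrete, here is a minimal sketch using its usual definition, the geometric mean of sensitivity and specificity. The function name and the confusion-matrix counts are hypothetical and are not results from the paper.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity from confusion counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the test data")
    sensitivity = tp / (tp + fn)   # recall on the positive (minority) class
    specificity = tn / (tn + fp)   # recall on the negative (majority) class
    return math.sqrt(sensitivity * specificity)


def accuracy(tp, fn, tn, fp):
    """Plain accuracy, shown for contrast with G-Mean."""
    return (tp + tn) / (tp + fn + tn + fp)


if __name__ == "__main__":
    # Hypothetical classifier that predicts the majority class for everyone:
    # 10 positives all missed, 90 negatives all correct.
    print(accuracy(tp=0, fn=10, tn=90, fp=0))  # high accuracy
    print(g_mean(tp=0, fn=10, tn=90, fp=0))    # G-Mean collapses to 0
```

The contrast shows why a study on imbalanced data would report G-Mean alongside accuracy: a model can score well on accuracy while never detecting the minority class.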
Comparison of Extremely Randomized Survival Trees and Random Survival Forests: A Simulation Study
Zaenal, Mohamad Solehudin; Fitrianto, Anwar; Wijayanto, Hari
Scientific Journal of Informatics Vol. 11 No. 3: August 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v11i3.8464

Abstract

Purpose: This simulation study investigates the Extremely Randomized Survival Trees (EST) model, a machine learning technique expected to handle survival analysis effectively, particularly in large survival datasets. The study compares the performance of the EST model with that of the Random Survival Forest (RSF) model, using the C-index to determine which model performs better. Methods: The analysis begins with the generation of 540 simulated datasets, created by combining three sample sizes, two censoring proportions, three types of hazard functions, and 30 repetitions for each scenario. The simulated data were split into 80% training and 20% testing data. The training data were used to build the EST and RSF models, while the test data were used to evaluate their performance. The model with the higher C-index was deemed the better performer, since a higher C-index indicates better discrimination. Result: The results indicate that sample size, type of hazard function, and the method used all influence model performance. The EST model significantly outperformed the RSF model when the sample size was large, though no significant difference was observed when the sample size was small or medium. Additionally, the EST model consistently demonstrated faster computation times across all simulation scenarios. Novelty: This study provides a pioneering exploration of decision tree algorithms, specifically EST and RSF, in survival analysis. While these methods have been extensively studied in regression and classification contexts, their application in survival analysis remains relatively unexplored.
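As an illustration of the C-index used for evaluation above, the following is a minimal sketch of Harrell's concordance index for right-censored data. The abstract does not say which variant or software the authors used, so the function name, the tie handling (tied risk scores count as half, tied times are skipped), and the toy data are all assumptions.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event
    strictly before subject j's time. It is concordant when the subject
    who failed earlier was given the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: event indicator 1 = event observed, 0 = censored.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 0, 1]

if __name__ == "__main__":
    print(harrell_c_index(toy_times, toy_events, [4, 3, 2, 1]))  # perfect ranking
    print(harrell_c_index(toy_times, toy_events, [1, 1, 1, 1]))  # uninformative
```

A value of 1 means every comparable pair is ordered correctly, and 0.5 corresponds to an uninformative ranking, which is why a higher C-index is read as better performance.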