Articles

Metabolite-Group Selection On Temu Ireng (Curcuma Aeruginosa) Contains Related To Toxicity Activity By Using Group Lasso Regression H S, Rahmat; Afendi, Farit Mochamad; Wijayanto, Hari
ARRUS Journal of Mathematics and Applied Science Vol. 4 No. 2 (2024)
Publisher : PT ARRUS Intelektual Indonesia

DOI: 10.35877/mathscience3626

Abstract

Metabolites are expressed as mass-to-charge ratios (m/z) in mass spectrometry experiments, and the same metabolite can be identified more than once. Several m/z values representing the same metabolite can be treated as a group of metabolites, so metabolite effects can be evaluated group by group. Group least absolute shrinkage and selection operator (group lasso) regression can be used to evaluate these groups. It shrinks some regression coefficients to exactly zero by adding an intermediate penalty to the ordinary least squares (OLS) objective function. The purposes of this study were to identify the groups of metabolite contents of Curcuma aeruginosa (temu ireng) that affect toxicity activity by using group lasso regression, and to compare the results with partial least squares regression (PLSR). The data were the toxicity activity and the metabolite contents, obtained from LC-MS, of temu ireng from three areas in Java. Group lasso regression, fitted in R with the gglasso package, identified the groups with m/z 238.150, 250.165, 262.128, 264.144, 312.275, and 456.183 as affecting toxicity activity. The group lasso and PLSR estimates of the influential metabolites were similar. Based on goodness of fit, group lasso regression was better than PLSR at estimating the influential groups.
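
The group-wise shrinkage described in this abstract can be illustrated with a short sketch. It is not the gglasso implementation used in the study: it assumes a squared-error loss with the penalty lam * sum_g sqrt(p_g) * ||b_g||_2, solved by plain proximal gradient descent on synthetic data. The function names, the grouping vector, and the value of `lam` are all hypothetical.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Shrink the whole vector v toward zero; return exactly zero if ||v|| <= t."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient for (1/2n)||y - Xb||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2."""
    n, p = X.shape
    groups = np.asarray(groups)
    beta = np.zeros(p)
    # Step size 1/L, where L is the Lipschitz constant of the loss gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        for g in np.unique(groups):
            idx = groups == g
            # Each group is kept or dropped as a unit.
            beta[idx] = group_soft_threshold(z[idx], step * lam * np.sqrt(idx.sum()))
    return beta

# Synthetic illustration (assumed data): only the first group carries signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
groups = [0, 0, 1, 1, 2, 2]
true_beta = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ true_beta + 0.1 * rng.standard_normal(100)
beta_hat = group_lasso(X, y, groups, lam=0.3)
```

In this toy setting the coefficients of the two noise groups come out exactly zero while the signal group is retained with some shrinkage, which is the selection behaviour the abstract relies on.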
Identifying Poverty Vulnerability Patterns in Indonesia using Cheng and Church’s Algorithm Afnan, Irsyifa Mayzela; Wijayanto, Hari; Wigena, Aji Hamim
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 8, No 4 (2024): October
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v8i4.25790

Abstract

Poverty remains a significant issue in developing countries, including Indonesia, where in 2022 the number of people living in poverty reached 26.36 million, a poverty rate of 9.57%. The Central Statistics Agency (BPS) measures poverty using a basic needs approach, defined as the inability to meet essential food and non-food needs through expenditure. Individuals are considered poor if their average monthly per capita expenditure is below the poverty line. Research on poverty has evolved toward a more multidimensional understanding, reflected in the Multidimensional Poverty Index (MPI), which identifies deprivation across three key dimensions: health, education, and living standards. This study aims to identify patterns of poverty vulnerability by applying the Cheng and Church (CC) biclustering algorithm to data from BPS. This quantitative method uses 13 multidimensional poverty indicators across 34 provinces. The CC algorithm begins by setting a threshold, followed by removing rows and columns with the largest residuals, adding qualifying rows and columns, and substituting elements to prevent overlap. The quality of each bicluster is then evaluated using the Mean Squared Residue (MSR) until optimal groups are formed. The results indicate that a threshold of δ = 0.01 generates seven biclusters with the lowest mean squared residue (0.0065), signifying optimal bicluster quality. Further validation using the Liu and Wang index reveals less than 50% similarity with the results from other thresholds, reinforcing the uniqueness of these findings. MSR serves as a measure of homogeneity within the bicluster, similar to how uniform the level of poverty is within a region. If families have similar expenditures and are below the poverty line, they face similar challenges, resulting in a low MSR value. In contrast, the Liu and Wang index compares regions with different poverty alleviation strategies. These findings provide valuable insights for policymakers. For example, bicluster 7 indicates that specific interventions are needed in Papua and West Kalimantan, which face local challenges such as reliance on agriculture, low education levels, and limited access to sanitation and clean water.
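
The Mean Squared Residue used to score biclusters can be computed directly from its standard Cheng and Church definition. The sketch below shows only the scoring step, not the full deletion and addition loop, and the small matrix is made up for illustration. The function name is hypothetical.

```python
import numpy as np

def mean_squared_residue(A, rows, cols):
    """Mean squared residue H(I, J) of the submatrix A[rows, cols]."""
    sub = A[np.ix_(rows, cols)]
    row_mean = sub.mean(axis=1, keepdims=True)   # a_iJ
    col_mean = sub.mean(axis=0, keepdims=True)   # a_Ij
    overall = sub.mean()                         # a_IJ
    residue = sub - row_mean - col_mean + overall
    return float((residue ** 2).mean())

# A perfectly additive pattern (row effect + column effect) has residue 0.
additive = np.add.outer([0.0, 1.0, 2.0], [0.0, 10.0, 20.0, 30.0])
msr_additive = mean_squared_residue(additive, [0, 1, 2], [0, 1, 2, 3])

# Perturbing one cell breaks the pattern and raises the score.
noisy = additive.copy()
noisy[0, 0] += 4.0
msr_noisy = mean_squared_residue(noisy, [0, 1, 2], [0, 1, 2, 3])
```

A bicluster is accepted when this score falls at or below the chosen threshold, so a smaller threshold demands more coherent row and column patterns.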
Performance Evaluation of Cheng & Church (CC) and Spectral Biclustering Algorithms under Collinearity and Overlap Conditions Hafsah, Siti; Indahwati, Indahwati; Wijayanto, Hari
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.26413

Abstract

Purpose: This study aims to address methodological challenges in evaluating biclustering algorithms under simultaneous collinearity and overlap, which often co-occur in real-world multivariate data but are rarely analyzed together. This research highlights the importance of understanding how these structural challenges affect local pattern detection in data mining applications. Methods: A simulation study was conducted using synthetic matrices embedded with two constant biclusters under 15 combinations of collinearity levels (ρ = 0.3, 0.6, 0.9) and overlap degrees (none, small, large). Each scenario was replicated 100 times. Performance was assessed using the Liu and Wang Index (ILW), while a three-way ANOVA tested the effects of algorithm type, collinearity, and overlap. Result: Spectral Biclustering maintained stable ILW scores despite increasing collinearity, while CC performed better in low-overlap scenarios but was more sensitive to collinearity. Under high collinearity and large overlap, both algorithms experienced notable degradation. The ANOVA confirmed that all main effects and interactions were significant (p < 0.001). Novelty: This study contributes empirical evidence regarding the influence of interacting structural characteristics on biclustering performance. The results deliver practical insights for selecting suitable algorithms and emphasize the potential advantages of hybrid approaches that integrate the stability of spectral methods with the adaptability of residual-based techniques.
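
A rough sketch of how a Liu and Wang style similarity score can be computed is given below. It assumes one common formulation: for each found bicluster, take the best ratio of shared rows plus shared columns to the union of rows plus the union of columns over all reference biclusters, then average over the found biclusters. Whether the study used exactly this variant is an assumption, and the function name is hypothetical.

```python
def liu_wang_index(found, reference):
    """Average best-match similarity of found biclusters against reference ones.

    Each bicluster is a (rows, cols) pair of index collections.
    """
    total = 0.0
    for rows_f, cols_f in found:
        rf, cf = set(rows_f), set(cols_f)
        best = 0.0
        for rows_r, cols_r in reference:
            rr, cr = set(rows_r), set(cols_r)
            shared = len(rf & rr) + len(cf & cr)
            union = len(rf | rr) + len(cf | cr)
            best = max(best, shared / union)
        total += best
    return total / len(found)

# Toy biclusters (assumed): the found bicluster is shifted by one row.
reference = [([1, 2, 3], [0, 1])]
identical = liu_wang_index(reference, reference)
shifted = liu_wang_index([([0, 1, 2], [0, 1])], reference)
disjoint = liu_wang_index([([7, 8], [5, 6])], reference)
```

Under this formulation the score is 1 for a perfect recovery and 0 when no rows or columns are shared, so higher values indicate better recovery of the embedded biclusters.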
MULTIVARIATE MULTILEVEL MODELLING TO ASSESS FACTORS AFFECTING THE QUALITY OF VOCATIONAL HIGH SCHOOLS IN SOUTH SULAWESI PROVINCE Pannu, Abdullah; Wijayanto, Hari; Susetyo, Budi
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 16 No 4 (2022): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol16iss4pp1515-1526

Abstract

This study analyzes the quality of Vocational High Schools (VHS), for which the data have a hierarchical structure and more than one response variable. The data come from the Basic Education Data (DAPODIK), in the form of raw variables that characterize the quality of VHS, together with other independent variables for South Sulawesi over four years (2018 to 2021) from the Ministry of Finance of the Republic of Indonesia (KEMENKEU) and Statistics Indonesia (BPS). The regency-level explanatory variables span four years (2018 to 2021), giving a multi-year, high-dimensional data structure, so Principal Component Analysis (PCA) is used to handle it. The modelling uses multivariate multilevel modelling (MVMM) with one-level and two-level structures. This study aims to model the average National Examination and Accreditation scores of Vocational High Schools in South Sulawesi using MVMM that accounts for the regency/city level, and to identify the factors that influence those scores. The results showed that the two-level multivariate model with a random intercept as the hierarchical component was better than the one-level model, based on its smaller Deviance Information Criterion (DIC) value. At the 5% significance level, the model identifies the variables that contribute significantly to the quality of Vocational High Schools in South Sulawesi Province. At the school level, these are the ratio of students to study groups, the percentage of certified teachers among all teachers, the ratio of students to toilets, the ratio of laboratory availability, and the ratio of the availability of supporting rooms. Meanwhile, at the regency level, the poverty percentage and Gross Regional Domestic Product (GRDP) had a significant effect on the quality of Vocational High Schools.
PRE-PROCESSING DATA ON MULTICLASS CLASSIFICATION OF ANEMIA AND IRON DEFICIENCY WITH THE XGBOOST METHOD Nurrahman, Fathu; Wijayanto, Hari; Wigena, Aji Hamim; Nurjanah, Nunung
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 2 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss2pp0767-0774

Abstract

Anemia and iron deficiency are health problems in Indonesia and globally. In multiclass classification, data problems such as missing values, too many variables, and imbalanced classes often occur. Data pre-processing is therefore carried out using MissForest imputation, Boruta feature selection, and SMOTE to help improve the performance of the classification model in predicting a particular class. After pre-processing, classification modeling is carried out using the XGBoost algorithm. Pre-processing the data was found to improve the performance of the model in multiclass classification of anemia and iron deficiency in women in Indonesia, with an accuracy of 0.815 and an AUC of 0.9693.
SELECTION OF THE BEST SEM MODEL TO IDENTIFY FACTORS AFFECTING MARKETING PERFORMANCE IN THE ICT INDUSTRY Hikmah, Zetil; Wijayanto, Hari; Aidi, Muhammad Nur
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 2 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss2pp1149-1162

Abstract

The digital revolution in society and advances in marketing practices create tremendous challenges for companies, and even more so for Information and Communication Technology (ICT) service providers. They face increasingly complex and rapidly changing market competition. To study these problems, structural equation modeling (SEM) can be used to form a research model and examine the relationships between latent variables and their indicators. The purpose of this study is to identify the best structural equation model for describing Marketing Performance in the ICT industry in Indonesia. The data are primary data obtained by distributing offline and online questionnaires to 300 management-level respondents working in the ICT industry. The methods compared are Covariance-Based Structural Equation Modeling (CB-SEM) and Partial Least Squares Structural Equation Modeling (PLS-SEM). The results showed that the best model for determining the factors that influence Marketing Performance in the ICT industry in Indonesia is PLS-SEM, with a goodness-of-fit R² of 0.436 for the latent variable Marketing Performance. This shows that the accuracy of Customer Experience Management (CEM), Digital Business Innovation (DBI), and Digital Operational Excellence (DOE) together in predicting Marketing Performance (MP) is relatively weak. Based on the PLS-SEM model, Digital Operational Excellence is a mediator that can increase the influence of Customer Experience Management on Marketing Performance. Meanwhile, Digital Business Innovation has no significant effect in increasing the influence of Customer Experience Management on Marketing Performance. The novelty of this research is the development of the best SEM models (CB-SEM and PLS-SEM) in the field of Information and Communication Technology in Indonesia.
Sentiment Classification on the 2024 Indonesian Presidential Candidate Dataset Using Deep Learning Approaches Suhaeni, Cici; Wijayanto, Hari; Kurnia, Anang
Indonesian Journal of Statistics and Applications Vol 8 No 2 (2024)
Publisher : Statistics and Data Science Program Study, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v8i2p83-94

Abstract

This study aims to compare the performance of three deep learning models (LSTM, BiLSTM, and GRU) in sentiment classification for the 2024 Indonesian Presidential Candidate dataset, focusing specifically on the case of Prabowo Subianto. The dataset comprises posts from the social media platform X, sourced from Kaggle, and the analysis investigates how effectively different recurrent neural network architectures identify public sentiment. The models were evaluated on accuracy and F1 score. The results demonstrate that BiLSTM outperformed both LSTM and GRU on all metrics, achieving a testing accuracy of 80.70% and an F1 score of 86.86%, compared to LSTM and GRU, which both achieved a testing accuracy of 72.56% and an F1 score of approximately 84%. The higher performance of BiLSTM is attributed to its ability to capture bidirectional context within the text, and thereby to understand complex sentiment patterns more effectively. LSTM and GRU displayed similar performance; BiLSTM is therefore the best model for this dataset. These results indicate that BiLSTM is especially well-suited for analyzing public sentiment towards political figures like Prabowo Subianto, offering significant insights into public discussions surrounding the 2024 Indonesian Presidential Election. This study recommends exploring transformer-based models like BERT or GPT variants to enhance sentiment classification accuracy in this domain.
Low Birth Weight Classification With Synthetic Minority Over Sampling Technique Random Forest Oktarina, Sachnaz Desta; Wijayanto, Hari; Yarah, Helena Ramadhini
Jurnal Kesehatan Ibu dan Anak Vol. 17 No. 1
Publisher : Poltekkes Kemenkes Yogyakarta

DOI: 10.29238/kia.v17i1.1802

Abstract

Low birth weight (LBW) is defined as a birth weight of less than 2500 grams. Infants born with LBW are more susceptible to disease and have a higher risk of dying at an early age. Because LBW data tend to be imbalanced, they can be classified using the Synthetic Minority Oversampling Technique (SMOTE) random forest method. The analysis used the 2017 Indonesian Demographic and Health Survey (IDHS) data to identify important variables in predicting the incidence of LBW. The results showed that the SMOTE random forest model gave an accuracy of 79.84%, a sensitivity of 30.99%, a specificity of 83.6%, and an AUC of 62%. Important variables in predicting the incidence of LBW were the number of antenatal care visits, wealth quantile, maternal age at delivery, iron supplementation, marital status, and twin birth.
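
The oversampling step named in this abstract can be sketched as follows. This is a minimal illustration of the core SMOTE idea (interpolating between a minority-class point and one of its nearest minority neighbours), not the implementation used in the study. It assumes purely numeric features, which is a simplification for survey data with categorical variables. The function name, parameters, and data are hypothetical.

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances among minority points only.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)        # seed point for each new sample
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                 # position along the segment
    return X_min[base] + gap * (X_min[chosen] - X_min[base])

# Assumed toy data: 20 minority observations with 3 numeric features.
rng = np.random.default_rng(42)
X_minority = rng.standard_normal((20, 3))
X_synthetic = smote(X_minority, n_new=50)
```

The synthetic rows would be appended to the training set only, before fitting the random forest, so that the test set keeps its original class balance.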
Performance Analysis of Machine Learning Models using RFE Feature Selection and Bayesian Optimization in Imbalanced Data Classification with Shap-Based Explanations Aqmar, Nurzatil; Wijayanto, Hari; Mochamad Afendi, Farit
Scientific Journal of Informatics Vol. 12 No. 3: August 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i3.31459

Abstract

Purpose: This research aims to evaluate the performance of Random Forest (RF) and Light Gradient Boosting Machine (LightGBM) models integrated with Recursive Feature Elimination (RFE) for feature selection, Bayesian Optimization (BO) for hyperparameter tuning, and three imbalanced data handling techniques: Random Undersampling (RUS), Random Oversampling (ROS), and SMOTENC. It also identifies key determinants of household food insecurity in Papua using SHAP for transparent feature interpretation. Methods: The research used 2022 SUSENAS data from Papua Province. Data preparation involved exploring data composition and variable characteristics and aggregating individual data into household data. Data were split using random sampling (80% training, 20% testing). Eighteen experimental scenarios were created by combining feature selection or no feature selection, three imbalance handling methods, and default or tuned hyperparameters. RF and LightGBM were evaluated over 50 iterations using accuracy, sensitivity, specificity, and G-Mean, with SHAP applied to the best-performing models for interpretability. Result: LightGBM achieved the highest accuracy and stability, particularly when combined with SMOTENC and RFE+BO. RF was better at maintaining G-Mean when paired with RUS, with the highest G-Mean (0.756) obtained by RF + BO + RUS. Three-way ANOVA showed that model type, imbalance handling, feature selection, and their interaction significantly affected the G-Mean value. SHAP analysis shows that health, financial, and educational limitations can increase the risk of food insecurity. Novelty: This research offers a new integration of feature selection, hyperparameter tuning, and imbalanced data handling within an interpretable machine learning framework, thereby providing a robust solution for food vulnerability classification on imbalanced datasets.
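
The four evaluation metrics listed in this abstract follow directly from a binary confusion matrix, with G-Mean taken as the geometric mean of sensitivity and specificity. The sketch below uses made-up counts, not results from the study, and the function name is hypothetical.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and G-Mean from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall on the positive (minority) class
    specificity = tn / (tn + fp)          # recall on the negative (majority) class
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }

# Hypothetical counts for an imbalanced test set (50 positives, 450 negatives).
balanced_model = classification_metrics(tp=30, fn=20, tn=400, fp=50)

# A model that always predicts the majority class looks accurate but has G-Mean 0.
majority_only = classification_metrics(tp=0, fn=50, tn=450, fp=0)
```

The second case illustrates why G-Mean is reported alongside accuracy on imbalanced data: accuracy rewards ignoring the minority class, while G-Mean collapses to zero when either class is never predicted correctly.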
Performance of Prediction Interval Estimators based on Random Forest Models with Correlated Predictors Ilma, Meisyatul; Sartono, Bagus; Wijayanto, Hari
Jurnal Matematika UNAND Vol. 14 No. 4 (2025)
Publisher : Departemen Matematika dan Sains Data FMIPA Universitas Andalas Padang

DOI: 10.25077/jmua.14.4.320-332.2025

Abstract

Uncertainty in prediction results is a crucial aspect of regression modeling, especially when explanatory variables are highly correlated. This study aims to evaluate the performance of three approaches to forming prediction intervals in Random Forest modeling: Out-of-Bag Prediction Interval (OOB-PI), Quantile Regression Forest (QRF), and Split Conformal Prediction (SC). The evaluation was conducted through a simulation study with a variety of data structures, including the level of correlation between variables, the shape of the mean function, and the type of error distribution. Further validation was conducted using data from the National Socio-Economic Survey (SUSENAS) of West Java Province in 2023. The results show that increasing the correlation between explanatory variables can improve the efficiency and accuracy of prediction interval estimation. Overall, OOB-PI showed the most balanced performance compared to the other two methods, with a prediction coverage rate close to 90% and a narrower interval width than QRF and SC. This finding indicates that OOB-PI is an adaptive and efficient approach for various data structures, including socioeconomic data with highly correlated predictors.
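
Of the three interval methods compared in this abstract, split conformal prediction is the simplest to sketch. The example below is an illustration under stated assumptions: ordinary least squares stands in for the Random Forest so that the block needs only NumPy, the data are simulated, and the miscoverage level `alpha = 0.1` is chosen to match the 90% target mentioned above. All names are hypothetical.

```python
import numpy as np

def split_conformal_interval(X_train, y_train, X_cal, y_cal, X_new, alpha=0.1):
    """Prediction intervals from absolute residuals on a held-out calibration set."""
    def design(X):
        return np.column_stack([np.ones(len(X)), X])

    # Base model: least squares here, standing in for a random forest.
    coef, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)

    def predict(X):
        return design(X) @ coef

    scores = np.sort(np.abs(y_cal - predict(X_cal)))
    n = len(scores)
    # Conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score.
    # Capped at n for simplicity; strictly the interval is unbounded in that case.
    rank = min(int(np.ceil((n + 1) * (1 - alpha))), n)
    q = scores[rank - 1]
    center = predict(X_new)
    return center - q, center + q

# Simulated data (assumed): linear signal with standard normal noise.
rng = np.random.default_rng(1)

def simulate(n):
    X = rng.standard_normal((n, 2))
    y = 1.0 + X @ np.array([2.0, -1.0]) + rng.standard_normal(n)
    return X, y

X_train, y_train = simulate(200)
X_cal, y_cal = simulate(200)
X_test, y_test = simulate(1000)
lower, upper = split_conformal_interval(X_train, y_train, X_cal, y_cal, X_test)
coverage = float(np.mean((y_test >= lower) & (y_test <= upper)))
```

The interval width here is constant across observations, which is one reason methods that adapt to local variability, such as the out-of-bag approach favoured in the abstract, can yield narrower intervals at the same coverage.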