Articles

A COMPARISON OF ARTIFICIAL NEURAL NETWORK AND NAIVE BAYES CLASSIFICATION USING UNBALANCED DATA HANDLING
Lestari, Nila; Indahwati, Indahwati; Erfiani, Erfiani; Julianti, Elisa D
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 3 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss3pp1585-1594

Abstract

Classification is a supervised learning method that predicts the class of objects whose labels are unknown. Machine learning classifiers generally perform well when the classes of the response variable are balanced, so class imbalance is a problem that must be taken seriously. This study handles unbalanced data using the Synthetic Minority Over-Sampling Technique (SMOTE). Two popular classification methods are considered: the Naïve Bayes Classifier (NB) and the Resilient Backpropagation Artificial Neural Network (Rprop-ANN). The data come from the Health Nutrition Research and Development Agency (Balitbangkes) and consist of 2499 observations. The study examines NB and ANN combined with SMOTE for classifying the incidence of anemia in young women in Indonesia. Models are fitted on 80% of the data (training set) and evaluated on the remaining 20% (test set). The analysis shows that handling the imbalance with SMOTE gives better performance than leaving it unhandled. The best method for predicting the incidence of anemia is Naïve Bayes, with a sensitivity of 82%.
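As a rough illustration of the oversampling idea named in the abstract above, the following sketch generates synthetic minority observations by interpolating between a minority point and one of its nearest minority neighbours. It is a simplified, assumption-laden version of SMOTE, not the authors' implementation: the function name `smote_sample`, the toy data, and the parameter values are all hypothetical.

```python
import random

def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest neighbours of `base` among the other minority points.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical toy minority class with two features per observation.
minority_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sample(minority_points)
```

Because every synthetic point lies on a segment between two existing minority points, the new observations stay inside the region already occupied by the minority class. In practice the oversampling would be applied to the training split only, so that the test split still reflects the original class proportions.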

Effectiveness of Machine Learning Models with Bayesian Optimization-Based Method to Identify Important Variables that Affect GPA
R, Arifuddin; Syafitri, Utami Dyah; Erfiani, Erfiani
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 8, No 3 (2024): July
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v8i3.21711

Abstract

To produce superior human resources, the SPs-IPB Master Program must consider the factors that influence GPA in the student selection process. Machine learning algorithms can be used to identify these factors. This paper applies the random forest and XGBoost algorithms to identify significant variables that affect GPA. In the evaluation, the default model is compared with models tuned by Bayesian optimization and by random search. Bayesian optimization is a hyperparameter tuning method that uses information from previous iterations to improve its estimates, which makes it highly efficient in computing time. Based on the average of the balanced accuracy and sensitivity metrics, Bayesian optimization produces a model that is superior to the default model and more time-efficient than random search. The sensitivity of XGBoost is 25% better than that of random forest. However, random forest is 19% better in accuracy and 30% better in specificity. Important variables are obtained from the information gain at the splits of the tree nodes. According to the best random forest and XGBoost models, the variables with the most influence on students' GPA are Undergraduate University Status (X8) and Undergraduate University (X6), while the variables with the least influence are Gender (X4) and Enrollment (X9).
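The evaluation metrics mentioned in this abstract can be computed from a binary confusion matrix, as in the minimal sketch below. The counts are hypothetical and are not taken from the paper; the function name `classification_metrics` is likewise only illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Balanced accuracy averages the two class-wise rates, so it is not
    # dominated by the majority class.
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fn=10, tn=30, fp=20)
```

A model can rank higher on sensitivity and lower on specificity than another, which is why the abstract reports the metrics separately rather than relying on accuracy alone.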

Robust Continuum Regression Study of LASSO Selection and WLAD LASSO on High-Dimensional Data Containing Outliers
Daulay, Nurmai Syaroh; Erfiani, Erfiani; Soleh, Agus M
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 8, No 3 (2024): July
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v8i3.23123

Abstract

In research, we often encounter multicollinearity and outliers, which can make coefficients unstable and reduce model performance. Robust Continuum Regression (RCR) addresses multicollinearity by compressing the independent variables into a much smaller set of mutually independent latent variables and then applying robust regression techniques, so that the complexity of the regression model is reduced without losing essential information and the parameter estimates are more stable. However, RCR is computationally hampered when the data have very high dimensions (p >> n), so an initial dimension-reduction stage based on variable selection is needed. The Least Absolute Shrinkage and Selection Operator (LASSO) can do this but is sensitive to outliers, which can lead to errors in selecting significant variables. A variable selection method that is robust to outliers is therefore needed, such as Weighted Least Absolute Deviations with LASSO penalty (WLAD LASSO), which selects variables based on the absolute deviations of the residuals. This study aims to overcome multicollinearity and model instability in high-dimensional data while remaining resistant to outliers. It combines the outlier-resistant RCR with the variable selection capabilities of LASSO and WLAD LASSO to provide a more reliable and efficient solution for complex data analysis. To measure the performance of RCR-LASSO and RCR-WLAD LASSO, simulations were carried out on low-dimensional and high-dimensional data under two scenarios, namely without outliers (δ = 0%) and with outliers (δ = 10%, 20%, 30%), at three levels of correlation (ρ = 0.1, 0.5, 0.9). The analysis uses RStudio version 4.1.3 with the "MASS" package to generate multivariate normal data, the "glmnet" package for LASSO variable selection, and the "MTE" package for WLAD LASSO variable selection. The simulation results show that RCR-LASSO tends to be superior to RCR-WLAD LASSO in terms of model goodness of fit. However, the performance of RCR-LASSO tends to decrease as the proportion of outliers and the correlation increase. RCR-LASSO tends to be looser in selecting relevant variables, resulting in a simpler model, but the variables chosen by LASSO are only marginally significant. RCR-WLAD LASSO is stricter in variable selection and selects only significant variables, but ignores several variables that have a small but significant impact on the model.
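For orientation, the two selection criteria contrasted in this abstract are commonly written as the penalized objectives below. This is the standard textbook formulation, stated here as an assumption rather than quoted from the paper; the symbols $y_i$, $x_i$, $\beta$, $\lambda$, and the observation weights $w_i$ are notation introduced for this sketch.

```latex
% LASSO: squared-error loss with an L1 penalty
\hat{\beta}_{\mathrm{LASSO}}
  = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2}
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert

% WLAD LASSO: weighted absolute-deviation loss with the same L1 penalty
\hat{\beta}_{\mathrm{WLAD}}
  = \arg\min_{\beta} \sum_{i=1}^{n} w_i \left\lvert y_i - x_i^{\top}\beta \right\rvert
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

The squared loss lets a single large residual dominate the fit, which is consistent with the abstract's remark that LASSO is sensitive to outliers. The absolute loss grows only linearly in the residual, and the weights $w_i$ are typically chosen to downweight observations that are outlying in the predictors.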

Penentuan Lama Waktu Optimal pada Pengukuran Glukosa Darah Noninvasif
Fitrianto, Anwar; Erfiani, Erfiani; Nisa, Rahmatun
JST (Jurnal Sains dan Teknologi) Vol. 11 No. 1 (2022)
Publisher : Universitas Pendidikan Ganesha

DOI: 10.23887/jstundiksha.v11i1.43185

Abstract

Measuring blood glucose levels with invasive methods, that is, by wounding a part of the body such as a finger, is disliked by most people. This research aims to develop a noninvasive blood glucose measuring device. The device works on the principle of infrared spectroscopy, so the duration of the measurement must be taken into account. An optimal measurement duration is needed so that the blood glucose examination is efficient and still records all of the information. The aim of this study is to determine the optimal measurement duration for the noninvasive blood glucose measuring device. The data are primary data from blood glucose measurements of three respondents, analyzed using exploratory methods and linear regression. Based on the fitted model, the optimal duration is a treatment time of 1700 ms, obtained using the gradient method on the curve. This duration can generally be regarded as a very short time for measuring blood glucose noninvasively.

Analisis Pola Konvergensi Transpor Kelembapan Udara di Indonesia Bagian Barat Menggunakan K-Means dengan Pembobotan Statistik dan Hierarchical Shape-Based Clustering
Pratiwi, Asri; Azis, Tukhfatur Rizmah; Fitrianto, Anwar; Erfiani, Erfiani; Jumansyah, L.M. Risman Dwi
KUBIK Vol 9 No 2 (2024): KUBIK: Jurnal Publikasi Ilmiah Matematika
Publisher : Department of Mathematics, Faculty of Science and Technology, UIN Sunan Gunung Djati Bandung

DOI: 10.15575/kubik.v9i2.39753

Abstract

This study analyzes the convergence patterns of Vertically Integrated Moisture Transport (VIMT) in the western region of Indonesia using the K-Means method with statistical weighting and Hierarchical Shape-Based Clustering based on Dynamic Time Warping (DTW). Daily data on specific humidity, zonal wind speed, and meridional wind speed from 2020–2023 were used to calculate VIMT. Clustering methods were utilized to identify grouping patterns in moisture transport data. The results showed that moisture convergence significantly increased during the rainy season (November–February). Using the K-Means method, five clusters with clearer separations were obtained compared to the four clusters produced by the Hierarchical Clustering method. Performance evaluation using Silhouette and Calinski-Harabasz scores indicated that the K-Means method was superior, with scores of 0.37 and 104.88 compared to 0.13 and 96.34 for the Hierarchical method. This provides an understanding of the moisture transport patterns, serving as a reference for predicting weather and climate patterns, thereby supporting efforts to mitigate the impacts of extreme weather in Western Indonesia.
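The shape-based clustering described above relies on Dynamic Time Warping (DTW) as its distance between time series. The sketch below shows the classic dynamic-programming form of DTW on two short hypothetical series. It is a generic illustration with an absolute-difference local cost, not the configuration used in the study.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences,
    using the absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier b element
                cost[i][j - 1],      # b[j-1] also matches an earlier a element
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]

# Hypothetical series: the same shape, with series_b delayed by one step.
series_a = [0.0, 1.0, 2.0, 1.0, 0.0]
series_b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
shifted_distance = dtw_distance(series_a, series_b)
```

Because DTW may stretch or compress the time axis, two series with the same shape but a small time shift receive a small distance, which is the property that makes it suitable for grouping series by shape rather than by exact timing.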

Analisis Ridge Robust Penduga Generalized M (GM) Pada Pemodelan Kalibrasi Untuk Kadar Gula Darah
Agung Tri Utomo; Erfiani, Erfiani; Fitrianto, Anwar
VARIANSI: Journal of Statistics and Its application on Teaching and Research Vol. 4 No. 2 (2022)
Publisher : Program Studi Statistika Fakultas MIPA UNM

DOI: 10.35580/variansiunm14

Abstract

Calibration modeling is a method for analyzing the relationship between different measurement methods, such as invasive and non-invasive blood sugar measurement. Problems that often arise in calibration modeling are multicollinearity and outliers. Multicollinearity can widen the confidence intervals of the regression coefficients, so that no coefficient is statistically significant. Outliers cause statistical tests to deviate. Both problems can be handled by robust ridge analysis, which combines ridge regression and robust regression: ridge regression addresses multicollinearity, and robust regression addresses outliers. The estimator used is Generalized M (GM). The method is applied to a calibration model that uses invasive and non-invasive blood sugar level data. The model with GM-estimator robust regression using modulation groups 50 to 90 in 2017 is better than the one using modulation groups 50 to 90 in 2019. The statistics obtained are an SSE of 0.910, an RMSEadj of 0.114, and an RMSEP of 0.030. Calibration models with outliers and multicollinearity can thus be handled by robust ridge regression. The feasibility value of the model obtained with the GM-estimator robust regression is smaller than that of the MM-estimator robust ridge regression in calibration modeling for non-invasive blood sugar level data. That is, the best model is the robust ridge regression with the GM estimator.

Determinants of Environmental Quality Index (EQI) in Indonesia in 2018-2022
Sihombing, Pardomuan Robinson; Erfiani, Erfiani; Notodiputro, Khairil Anwar; Kurnia, Anang
KEUNIS Vol. 13 No. 2 (2025): JULY 2025
Publisher : Finance and Banking Program, Accounting Department, Politeknik Negeri Semarang

DOI: 10.32497/keunis.v13i2.6559

Abstract

The environment is a critical issue in sustainable development in Indonesia, with significant variations in environmental quality between regions. This study seeks to examine the influence of the Regional Government Budget, COVID-19 (as a dummy variable), Gross Regional Domestic Product (GRDP), and the Human Development Index (HDI) on the Environmental Quality Index (EQI) in Indonesia. The data for this study were obtained from BPS–Statistics Indonesia and the Ministry of Environment and Forestry, covering the period from 2018 to 2022. The analysis employs multiple linear regression using panel data. Panel model testing indicates that the fixed effects model with cross-sectional lag provides the best fit. The results show that, collectively, all variables have a significant influence on Indonesia's Environmental Quality Index (EQI). Individually, the Regional Government Budget for environmental purposes, the COVID-19 dummy variable, and the Human Development Index (HDI) have a significant positive impact on EQI. In contrast, Gross Regional Domestic Product (GRDP) has a significant negative effect. These findings highlight the need for comprehensive macro-socioeconomic policies to sustain and enhance environmental quality in Indonesia.

The Influence of Women’s Empowerment on The Preference for Contraceptive Methods in Indonesia: A Multinomial Logistic Regression Modelling
Fulazzaky, Tahira; Indahwati, Indahwati; Fitrianto, Anwar; Erfiani, Erfiani; Khikmah, Khusnia Nurul
JURNAL INFO KESEHATAN Vol 22 No 3 (2024): JURNAL INFO KESEHATAN
Publisher : Research and Community Service Unit, Poltekkes Kemenkes Kupang

DOI: 10.31965/infokes.Vol22.Iss3.1213

Abstract

The concept of women's empowerment encompasses enabling women to take control of their own lives, make choices independently, and fulfill their full capabilities. Numerous studies have examined the correlation between women's empowerment and reproductive health. In Indonesia, female labor force participation is relatively low, which motivates research on the influence of women's empowerment on contraceptive method preference. This research aims to find a multinomial logistic regression model for the choice of contraceptive method among married women in Indonesia and to identify the women’s empowerment traits that most affect that choice. The study uses secondary data from the 2017 Indonesian Demographic and Health Survey (IDHS). The participants were married women between the ages of 15 and 49, and the total number of respondents sampled was 49,216. Variables that significantly affect contraceptive method use are the respondent's current employment, whether the respondent has an account at a bank or other financial institution, the total number of children ever born, and whether the respondent considers beating justified if the wife argues with her husband. The analysis uses multinomial logistic regression together with independence, multicollinearity, and parameter tests, and the model is selected by the smallest value of Akaike's information criterion or the highest accuracy. The findings highlight four significant variables. First, employed women are more likely to use contraceptives than unemployed women. Second, access to banking services correlates with a higher likelihood of contraceptive use. Third, women with more children tend to prefer long-acting reversible contraceptives. Fourth, endorsing the justifiability of spousal violence is linked to the selection of conventional contraceptives. These results emphasize the roles of employment, financial access, family size, and perceptions of gender-based violence in shaping contraceptive choices in Indonesia. Model 3 emerges as the most accurate predictor of preferences after six variables are eliminated on the basis of testing and multicollinearity considerations. These findings underscore the importance of addressing economic empowerment and gender-related issues in Indonesian reproductive health programs and policies. Such a comprehensive approach can enhance women's autonomy, enabling them to make crucial life choices and ultimately improving their overall well-being.
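Two ingredients of the modelling approach named in this abstract can be sketched briefly: the category probabilities of a multinomial logistic model and the Akaike information criterion used for model selection. The linear scores, log-likelihoods, and parameter counts below are hypothetical values chosen for illustration and are not results from the study.

```python
import math

def multinomial_probs(linear_scores):
    """Category probabilities of a multinomial logit model, given one linear
    predictor per category (the reference category has score 0)."""
    shift = max(linear_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in linear_scores]
    total = sum(exps)
    return [e / total for e in exps]

def aic(log_likelihood, n_params):
    """Akaike information criterion: smaller values indicate a better
    trade-off between fit and model size."""
    return 2 * n_params - 2 * log_likelihood

# Equal scores give equal probabilities across the three categories.
equal_probs = multinomial_probs([0.0, 0.0, 0.0])

# Hypothetical candidate models: name -> (log-likelihood, number of parameters).
candidates = {"model_a": (-100.0, 5), "model_b": (-98.5, 9)}
aic_values = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best_model = min(aic_values, key=aic_values.get)
```

In this toy comparison the larger model fits slightly better but pays a penalty for its extra parameters, so the smaller model has the lower AIC.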

Loopy Orthogonal Signal Correction Scatter Correction in Non-Invasive Blood Glucose: Koreksi Pencaran Loopy Orthogonal Signal Correction pada Glukosa Darah Non-Invasif
Misrika, Dahlia; Erfiani, Erfiani; Wigena, Aji
Indonesian Journal of Statistics and Applications Vol 7 No 2 (2023)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v7i2p105-113

Abstract

Spectroscopy is the study of matter based on the light, sound, or particles it emits, absorbs, or reflects, as well as the study of methods for generating and analyzing spectra. Spectra exhibit systematic variation caused by light scattering and by differences in the size of objects. Spectroscopic output is subject to scatter shifts, because the same object measured several times does not produce exactly the same spectrum. These problems can be addressed by pre-processing the data with a scatter correction method. Scatter correction reduces the physical effects in the spectrum so that the information obtained is relatively the same for each spectrum, yields good estimates, and can be interpreted well. One spectroscopic tool that uses infrared light is a non-invasive blood glucose level measuring device. Its output is a spectrum of intensity over the time domain. Each object's spectrum still contains noise, so scatter correction can be applied to these data. The purpose of this study was to apply the loopy Orthogonal Signal Correction (OSC) scatter correction method to time-domain intensity spectrum data from a non-invasive blood glucose level measuring device. The OSC method uses the concept of orthogonality to the mean by drawing the intensity value, weighting it, calculating the loading vector, and then correcting the initial intensity. Based on the analysis, the loopy OSC method is better than OSC because its convergence is more accurate, the mean difference is smaller, the variance is smaller, and it converges for all the values tested. Based on exploration and the average difference, the loopy OSC method is better able to produce the same pattern for each replication. This also shows that an object measured repeatedly can be identified as the same object.