Articles

Bibliometric Mapping and Trend Analysis of Beta Regression Modeling: A Decade of Development (2015–2024) Sihombing, Pardomuan Robinson; Erfiani, Erfiani; Notodiputro, Khairil Anwar; Kurnia, Anang
Sinkron : jurnal dan penelitian teknik informatika Vol. 9 No. 3 (2025): Article Research July 2025
Publisher : Politeknik Ganesha Medan

DOI: 10.33395/sinkron.v9i3.14949

Abstract

Beta regression is a statistical model for dependent variables that take values in the open interval (0, 1), such as rates, proportions, or percentages. This study traces the development of beta regression over the last 10 years using a bibliometric approach. The articles come from the Scopus database, and the analysis uses the bibliometrix package in R. The results show 293 articles on the topic indexed in Scopus, spread across a variety of research fields. The most prolific author is Cribari-Neto, F., with 12 documents. By authors' country of origin, Brazil contributes the most publications on beta regression, while Indonesia ranks 12th. Research on beta regression therefore still has considerable room for further development.
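As a rough illustration of the model this abstract surveys, the sketch below evaluates the log-likelihood of a beta regression with a single covariate. The mean-precision parameterization, the logit link, and the names `logistic`, `beta_loglik`, `beta0`, `beta1`, and `phi` are assumptions made for the example; none of them come from the article.

```python
import math

def logistic(eta):
    """Inverse logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_loglik(y, x, beta0, beta1, phi):
    """Log-likelihood of a one-covariate beta regression (illustrative only).

    Assumed form: y_i ~ Beta(mu_i * phi, (1 - mu_i) * phi),
    with logit(mu_i) = beta0 + beta1 * x_i and precision phi > 0.
    Every y_i must lie strictly inside (0, 1).
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = logistic(beta0 + beta1 * xi)
        a, b = mu * phi, (1.0 - mu) * phi  # the two beta shape parameters
        total += (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
                  + (a - 1.0) * math.log(yi)
                  + (b - 1.0) * math.log(1.0 - yi))
    return total
```

With `beta0 = beta1 = 0` and `phi = 2` both shape parameters equal 1, so the density is uniform and the log-likelihood is zero, which makes a convenient sanity check.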
OUTLIER DETECTION ON HIGH DIMENSIONAL DATA USING MINIMUM VECTOR VARIANCE (MVV) A., Andi Harismahyanti; Indahwati, Indahwati; Fitrianto, Anwar; Erfiani, Erfiani
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 16 No 3 (2022): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol16iss3pp797-804

Abstract

High-dimensional data arise in practice when the number of variables p exceeds the number of observations n. As the dimension grows, data points tend to behave more and more like outliers. Outliers are observations that do not follow the pattern of the data distribution and lie far from the data center. They need to be detected because they can distort the results of an analysis. One method for detecting outliers is the Mahalanobis distance; a robust version of it can be obtained with the Minimum Vector Variance (MVV) method. This study compares the MVV method with the classical Mahalanobis distance for detecting outliers in non-invasive blood glucose level data, both for p>n and for n>p. The results show that the MVV method performs better for n>p. MVV identifies the minimum data group and the outlying data points more effectively than the classical method.
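The classical Mahalanobis distance used as the baseline in this abstract can be sketched for two variables as follows. This is not the MVV estimator: the mean and covariance here are the ordinary sample estimates, and the function name `mahalanobis_sq_2d` is hypothetical.

```python
def mahalanobis_sq_2d(points):
    """Squared classical Mahalanobis distance of each 2-D point.

    Uses the ordinary sample mean and sample covariance (divisor n - 1).
    A robust variant such as MVV would replace these two estimates.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # must be non-zero for the inverse to exist
    distances = []
    for px, py in points:
        dx, dy = px - mx, py - my
        # Quadratic form d' S^{-1} d written out with the 2x2 inverse.
        distances.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
d2 = mahalanobis_sq_2d(data)
```

In this toy data set the point (5, 5) gets the largest distance, but it also inflates the sample covariance that the distance is measured against, which is the usual argument for a robust estimate of location and scatter.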
THE ORDINAL LOGISTIC REGRESSION MODEL WITH SAMPLING WEIGHTS ON DATA FROM THE NATIONAL SOCIO-ECONOMIC SURVEY Amelia, Reni; Indahwati, Indahwati; Erfiani, Erfiani
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 16 No 4 (2022): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol16iss4pp1355-1364

Abstract

Ordinal logistic regression describes the relationship between an ordered categorical response variable and one or more explanatory variables. Its parameters are usually estimated by maximum likelihood, which assumes that every sample unit has an equal chance of being selected, that is, a simple random sampling (SRS) design. This study uses data from the National Socio-Economic Survey (SUSENAS), which has a two-stage one-phase sampling design (not SRS), so parameter estimation should take the sampling weights into account. The study describes parameter estimation for ordinal logistic regression with sampling weights using the pseudo maximum likelihood method, specifically within the SUSENAS sampling design framework. Variances are estimated by Taylor linearization. The study also provides numerical examples of ordinal logistic regression with sampling weights. The data cover 121,961 elderly people spread over 514 districts/cities. Testing data (20%) are used to assess prediction accuracy. The response variable is the health status of the elderly, and there are nine explanatory variables. The results indicate that the ordinal logistic regression model with sampling weights is more representative of the population and better able to predict the minority categories of the response variable (poor and moderate health status) than the model without sampling weights.
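The idea of a weighted (pseudo) log-likelihood for an ordinal response can be sketched as below. The proportional-odds form, the sign convention for the cutpoints, and the names `category_probs` and `pseudo_loglik` are assumptions for illustration; the article's exact specification and the SUSENAS weights are not reproduced here.

```python
import math

def category_probs(eta, cutpoints):
    """Category probabilities under an assumed proportional-odds model.

    P(Y <= k) = logistic(c_k - eta) for increasing cutpoints c_0 < c_1 < ...
    Returns one probability per category (len(cutpoints) + 1 values).
    """
    cum = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

def pseudo_loglik(y, eta, weights, cutpoints):
    """Sampling-weighted log-likelihood: sum of w_i * log P(Y_i = y_i).

    y holds 0-based category indices, eta the linear predictors,
    weights the sampling weights (hypothetical values in any example).
    """
    return sum(w * math.log(category_probs(e, cutpoints)[yi])
               for yi, e, w in zip(y, eta, weights))
```

Setting every weight to 1 recovers the ordinary log-likelihood, so the weights only change how much each sampled unit contributes to the estimating equation.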
PERFORMANCE OF LASSO AND ELASTIC-NET METHODS ON NON-INVASIVE BLOOD GLUCOSE MEASUREMENT CALIBRATION MODELING Abqorunnisa, Farah; Erfiani, Erfiani; Djuraidah, Anik
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 1 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss1pp0037-0042

Abstract

Diabetes Mellitus (DM) is a disease caused by high blood glucose levels (hyperglycemia). Blood glucose levels can be measured with invasive methods (which wound the body) or non-invasive methods (using infrared light). Analytical methods are needed to model these results and obtain estimates of blood glucose levels. One approach for analyzing the relationship between invasive and non-invasive blood glucose levels is the calibration model. Common problems in calibration modelling are multicollinearity and outliers. These problems can be addressed by adding new data, applying principal component analysis, and using LASSO and Elastic-Net regression. The research data are invasive and non-invasive blood glucose measurements collected in 2019 from 74 respondents. The study concludes that summarizing the data by trapezoidal area provides a good estimate in calibration modelling. The Elastic-Net method gives better predictions than the other models, with an RMSE of 22.39 and the highest positive correlation, 0.97, which is close to 1, indicating that Elastic-Net can handle calibration modelling.
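The two quantities this abstract relies on, the elastic-net penalty and the RMSE, can be written out in a few lines. The penalty below follows the common convention in which `alpha = 1` gives the LASSO and `alpha = 0` gives ridge; whether the article used exactly this scaling is an assumption, and the function names are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)

def elastic_net_penalty(coefs, lam, alpha):
    """Elastic-net penalty lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    alpha = 1 reduces to the LASSO penalty, alpha = 0 to the ridge penalty.
    The intercept is assumed to be excluded from coefs.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_sq = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)
```

Intermediate values of `alpha` mix the two penalties, which is what lets the method both shrink correlated predictors together and set some coefficients exactly to zero.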
G-OPTIMAL DESIGN OF NON-LINEAR MODEL TO INCREASE PURITY LEVELS OF SILICON DIOXIDE Wulandari, Nindya; Erfiani, Erfiani; Irzaman, Irzaman; Syafitri, Utami Dyah
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 2 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss2pp0659-0666

Abstract

Silicon dioxide (SiO2) is one of the most abundant minerals on earth. SiO2 is widely used in various fields, so its availability as a finite natural resource is diminishing. A purification procedure can raise the purity of low-quality silica by altering the temperature and the rate of temperature rise. This study aims to obtain the best design for increasing SiO2 levels: the G-optimal design for a non-linear model, found using the Variable Neighborhood Search (VNS) algorithm. The VNS algorithm employs two types of neighborhoods, one obtained by replacing one design point with a point from the candidate set and the other by replacing two design points with two points from the candidate set. The model used to increase silicon dioxide's purity is a non-linear model that follows an exponential decay form. For temperatures from 800 °C to 900 °C, the best design points obtained from the G-optimal design, given as pairs of temperature (°C) and rate of temperature increase (°C/min), are 800 °C and 1.67 °C/min, 800 °C and 2.17 °C/min, 815 °C and 2.50 °C/min, 825 °C and 2.00 °C/min, 845 °C and 2.34 °C/min, 895 °C and 3.34 °C/min, and 900 °C and 3.50 °C/min, with a G-efficiency of 96.41%.
OVERDISPERSION HANDLING IN POISSON REGRESSION MODEL BY APPLYING NEGATIVE BINOMIAL REGRESSION Tiara, Yesan; Aidi, Muhammad Nur; Erfiani, Erfiani; Rachmawati, Rika
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 1 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss1pp0417-0426

Abstract

Poisson regression is a statistical analysis that can be used when the response variable is count data. It assumes equidispersion, meaning that the mean of the response variable equals its variance. When this assumption fails because the variance is greater than the mean (overdispersion), negative binomial regression can be used instead. This method is more flexible because it does not require that the variance be equal to the mean. The case study in this research is anemia in women of childbearing age (WCA) in 33 provinces of Indonesia. The study applies Poisson and negative binomial regression to the WCA anemia data to assess the goodness of the models and find the factors that influence anemia in WCA. The data come from the biomedical sample of Riset Kesehatan Dasar (Riskesdas) and from the website of the Badan Pusat Statistik (BPS) in 2013. Comparing the two methods, negative binomial regression is the best model for WCA anemia cases in Indonesia because it has the smallest AIC, 221.72, although this differs only slightly from the AIC of the Poisson regression model, 221.83. The presence of overdispersion further indicates that Poisson regression is unsuitable for the analysis. At a significance level of 10%, the number of WCA affected by malaria per 100 population influences cases of WCA anemia, while the other independent variables have no effect.
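A quick way to see what equidispersion and the AIC comparison mean is sketched below. The dispersion index here is the marginal variance-to-mean ratio of the raw counts, which is only a rough screen; in a regression setting overdispersion is usually judged from the fitted model's residuals. The function names and the example counts are hypothetical, not data from the article.

```python
def dispersion_index(counts):
    """Sample variance divided by sample mean of a list of counts.

    A value near 1 is consistent with a Poisson distribution;
    a value well above 1 suggests overdispersion.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood (smaller is better)."""
    return 2 * n_params - 2 * loglik

# Made-up counts with one very large value, giving variance far above the mean.
example_counts = [0, 0, 0, 12]
```

Because the negative binomial model carries one extra dispersion parameter, its AIC improves on the Poisson model only when the gain in log-likelihood exceeds that parameter's penalty of 2.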
A COMPARISON OF ARTIFICIAL NEURAL NETWORK AND NAIVE BAYES CLASSIFICATION USING UNBALANCED DATA HANDLING Lestari, Nila; Indahwati, Indahwati; Erfiani, Erfiani; Julianti, Elisa D
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 3 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss3pp1585-1594

Abstract

Classification is a supervised learning method that predicts the class of objects whose labels are unknown. Classifiers in machine learning tend to perform well when the classes of the response variable are balanced, so unbalanced classification is a problem that must be taken seriously. This study handles unbalanced data using the Synthetic Minority Over-Sampling Technique (SMOTE). Two popular classification methods are the Naïve Bayes Classifier (NB) and the Resilient Backpropagation Artificial Neural Network (Rprop-ANN). The data come from the Health Nutrition Research and Development Agency (Balitbangkes) and consist of 2499 observations. The study examines NB and ANN combined with SMOTE for classifying the incidence of anemia in young women in Indonesia. Models are fitted on 80% of the data (training) and predictions are made on the remaining 20% (test). The analysis shows that using SMOTE gives better performance than leaving the unbalanced data untreated. Based on the results, the best method for predicting the incidence of anemia is Naïve Bayes, with a sensitivity of 82%.
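The core step of SMOTE, interpolating between a minority-class point and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration, not the implementation used in the article: the function name `smote`, the default `k`, and the toy points are assumptions, and at least two distinct minority points are required.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points by neighbour interpolation.

    For each new point: pick a minority point, pick one of its k nearest
    minority neighbours (squared Euclidean distance), and place the new
    point at a random position on the segment joining the two.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(toy_minority, 5)
```

Oversampling of this kind is normally applied to the training split only, so that the test data keep the original class proportions.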
Effectiveness of Machine Learning Models with Bayesian Optimization-Based Method to Identify Important Variables that Affect GPA R, Arifuddin; Syafitri, Utami Dyah; Erfiani, Erfiani
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 8, No 3 (2024): July
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v8i3.21711

Abstract

To produce superior human resources, the SPs-IPB Master Program must consider the factors influencing GPA in the student selection process. One way to identify these factors is with machine learning algorithms. This paper applies the random forest and XGBoost algorithms to identify significant variables that affect GPA. In the evaluation process, the default model is compared with the models resulting from Bayesian and random search optimization. Bayesian optimization is a method for optimizing hyperparameters that combines information from previous iterations to improve estimates. It is highly efficient in terms of computing time. Based on the average of the balanced accuracy and sensitivity metrics, Bayesian optimization produces a model superior to the default model and is more time-efficient than random search optimization. XGBoost's sensitivity is 25% better than random forest's. However, random forest is 19% better in accuracy and 30% better in specificity. Important variables are obtained from the information gain value when splitting the tree nodes. According to the best random forest and XGBoost models, the variables with the most influence on students' GPA are Undergraduate University Status (X8) and Undergraduate University (X6). Meanwhile, the variables with the smallest influence are Gender (X4) and Enrollment (X9).
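The evaluation metrics named in this abstract can be computed from the four cells of a binary confusion matrix, as in the sketch below. The function names and the example counts are hypothetical and are not results from the article.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives predicted positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: share of actual negatives predicted negative."""
    return tn / (tn + fp)

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity, less affected by class imbalance."""
    return 0.5 * (sensitivity(tp, fn) + specificity(tn, fp))

# Made-up confusion counts: 10 positives and 90 negatives in the test set.
example = balanced_accuracy(tp=8, fn=2, tn=45, fp=45)
```

In this made-up case plain accuracy would be 53/100, while balanced accuracy weighs the small positive class and the large negative class equally.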
Robust Continuum Regression Study of LASSO Selection and WLAD LASSO on High-Dimensional Data Containing Outliers Daulay, Nurmai Syaroh; Erfiani, Erfiani; Soleh, Agus M
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 8, No 3 (2024): July
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v8i3.23123

Abstract

In research, we often encounter multicollinearity and outliers, which can make coefficients unstable and reduce model performance. Robust Continuum Regression (RCR) addresses multicollinearity by compressing the independent variables into a much smaller number of new, mutually independent variables (latent variables) and applying robust regression techniques, so that the complexity of the regression model is reduced without losing essential information from the data and the parameter estimates are more stable. However, RCR is computationally hampered when the data have very high dimensions (p>>n), so the dimension must first be reduced by variable selection. The Least Absolute Shrinkage and Selection Operator (LASSO) can do this but is sensitive to outliers, which can lead to errors in selecting significant variables. A method that is robust to outliers is therefore needed for selecting explanatory variables, such as Weighted Least Absolute Deviations with LASSO penalty (WLAD LASSO), which selects variables by considering the absolute deviations of the residuals. This study aims to overcome multicollinearity and model instability in high-dimensional data while remaining resistant to outliers. It leverages the outlier-resistant RCR and the variable selection capabilities of LASSO and WLAD LASSO to provide a more reliable and efficient solution for complex data analysis. To measure the performance of RCR-LASSO and RCR-WLAD LASSO, simulations were carried out on low-dimensional and high-dimensional data under two scenarios, without outliers (δ = 0%) and with outliers (δ = 10%, 20%, 30%), at three levels of correlation (ρ = 0.1, 0.5, 0.9). The analysis uses RStudio version 4.1.3 with the "MASS" package to generate multivariate normal data, the "glmnet" package for LASSO variable selection, and the "MTE" package for WLAD LASSO variable selection. The simulation results show that RCR-LASSO tends to be superior to RCR-WLAD LASSO in terms of model goodness of fit. However, its performance tends to decrease as outliers and correlations increase. RCR-LASSO tends to be looser in selecting relevant variables, resulting in a simpler model, but the variables chosen by LASSO are only marginally significant. RCR-WLAD LASSO is stricter in variable selection and selects only significant variables, but it ignores several variables that have a small but significant impact on the model.
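The difference between the two selection criteria can be seen by writing out their objective functions. The sketch below is in Python rather than R and does not reproduce the "glmnet" or "MTE" implementations; the scaling of each loss, the way the observation weights are chosen, and all names are assumptions for illustration.

```python
def residuals(y, X, beta):
    """Residuals y_i - x_i' beta for a list-of-rows design matrix X."""
    return [yi - sum(xij * bj for xij, bj in zip(row, beta))
            for yi, row in zip(y, X)]

def lasso_objective(y, X, beta, lam):
    """Squared-error loss (scaled by 1 / 2n) plus an L1 penalty."""
    r = residuals(y, X, beta)
    return 0.5 * sum(ri * ri for ri in r) / len(y) + lam * sum(abs(b) for b in beta)

def wlad_lasso_objective(y, X, beta, lam, weights):
    """Weighted absolute-deviation loss plus an L1 penalty.

    weights are assumed to down-weight suspected outlying observations.
    """
    r = residuals(y, X, beta)
    return (sum(w * abs(ri) for w, ri in zip(weights, r))
            + lam * sum(abs(b) for b in beta))

# Toy data: the third response is a gross outlier relative to y = x.
y_toy = [1.0, 2.0, 100.0]
X_toy = [[1.0], [2.0], [3.0]]
```

At `beta = [1.0]` the outlier contributes its squared residual to the LASSO objective but only a down-weighted absolute residual to the WLAD LASSO objective, which is why the latter is pulled less strongly toward the outlying point.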
Determining the Optimal Duration for Noninvasive Blood Glucose Measurement (Penentuan Lama Waktu Optimal pada Pengukuran Glukosa Darah Noninvasif) Fitrianto, Anwar; Erfiani, Erfiani; Nisa, Rahmatun
JST (Jurnal Sains dan Teknologi) Vol. 11 No. 1 (2022)
Publisher : Universitas Pendidikan Ganesha

DOI: 10.23887/jstundiksha.v11i1.43185

Abstract

Measuring blood glucose levels with invasive methods, which wound a part of the body such as a finger, is disliked by most people. This research aims to develop a technology in the form of a noninvasive blood glucose meter. The device works on the principle of infrared spectroscopy, so the duration of the measurement must be taken into account. An optimal measurement duration is needed so that the blood glucose examination is efficient and can still record all of the information. The aim of this study is to determine the optimal measurement duration for the noninvasive blood glucose meter. The data are primary data from blood glucose measurements of three respondents, analyzed using exploratory methods and linear regression. According to the fitted model, the optimal duration is at a treatment time of 1700 ms, obtained using the gradient method on the curve. This duration is generally regarded as a very short time for measuring blood glucose noninvasively.
Co-Authors . Aunuddin A. A., Muftih Abd. Rahman Abqorunnisa, Farah Agung Tri Utomo Agus Mohamad Soleh Ahmad Khairul Reza Ahmad Nur Rohman Ahmad Syauqi Aji Hamim Wigena Alamanda, Dinda Aprilia Alfa Nugraha Pradana Alfa Nugraha Pradana Alfa Nugraha Pradana Aliu, Mufthi Alwi ALIU, MUFTIH ALWI Amatullah, Fida Fariha Amelia, Reni Aminah Aminah Anadra, Rahmi Anang Kurnia Anik Djuraidah Anissa Tsalsabila Ardhani, Rizky Arini Annisa Adi Aristawidya, Rafika ASEP SAEFUDDIN Asri Pratiwi, Asri Assyifa Lala Pratiwi Hamid Aunuddin . Aunuddin Aunuddin Azis, Tukhfatur Rizmah Bagus Sartono Bartho Sihombing Bimawan Sudarmoko Budi Susetyo Daswati, Oktaviyani Daulay, Nurmai Syaroh Deti Anggraeni Ekawati Dian Kusumaningrum Dini Ramadhani Dwi Jumansyah, L.M. Risman Dwi Putri Kurniasari Fanny Amalia Farit M Afendi Farly Shabahul Khairi Fatimah Fatimah Fauziah, Monica Rahma Fitrianto, Anwar Freza Riana Fulazzaky, Tahira Hamim Wigena, Aji Hari Wijayanto Harismahyanti A., Andi Hasnataeni, Yunia Herlin Fransiska Hilda Zaikarina I Made Sumertajaya Ihsan, Muhammad Taufik Ilmani, Erdanisa Aghnia Indah, Yunna Mentari Indahwati Irzaman, Irzaman Ismah, Ismah Julianti, Elisa D Jumansyah, L. M. Risman Dwi Jumansyah, L.M. 
Risman Dwi Khikmah, Khusnia Nurul Khusnia Nurul Khikmah Lestari, Nila Made Agung Prebawa Parama Artha Mahfuz Hudori Marshelle, Sean Megawati Megawati Misrika, Dahlia Mohammad Masjkur Muggy David Cristian Ginzel Muhammad Nur Aidi mutiah, siti Nadira Nisa Alwani Nenden Rahayu Puspitasari Novitri Novitri Nugraha, Adhiyatma Nur Khamidah Nurul Fadhilah Pardomuan Robinson Sihombing Qalbi, Asyifah R, Arifuddin Rahmatun Nisa, Rahmatun Ramadhani, Dini Ratih Dwi Septiani Reka Agustia Astari Reni Amelia Retno Dwi Jayanti Rika Rachmawati Riska Asri Pertiwi Siregar, Indra Rivaldi Sofia Octaviana Tetinia Gulo Tiara, Yesan Umam Hidayaturrohman Uswatun Hasanah Utami Dyah Syafitri Vitona, Desi Waode, Yully Sofyah Wati, Wahyuni Kencana Weisha, Ghea Wigena, Aji Wijaya, Ferdian Bangkit Winda Chairani Mastuti Windi D.Y Putri Yulia Christina Yuniar Istiqomah Zaima Nurrusydah