Pendugaan Koefisien Regresi Logistik Biner Menggunakan Algoritma Least Angle Regression [Estimating Binary Logistic Regression Coefficients Using the Least Angle Regression Algorithm]
Utami, Mamik; Islamiyati, Anna; Thamrin, Sri Astuti
ESTIMASI: Journal of Statistics and Its Application Vol. 5, No. 1, Januari, 2024 : Estimasi
Publisher : Hasanuddin University

DOI: 10.20956/ejsa.v5i1.12489

Abstract

Binary logistic regression is a statistical method for determining the relationship between a response variable with two categories and predictor variables measured on a categorical or continuous scale. Its parameters are estimated by Maximum Likelihood Estimation (MLE). During estimation, the Least Angle Regression (LAR) algorithm is used to select the significant variables so that the best model is obtained from the estimated binary logistic regression coefficients. The LAR algorithm is applied to data on the risk of stunting in two-year-old babies in the working area of the Buntu Batu Health Center, Enrekang Regency, South Sulawesi, in 2019. The binary logistic regression model estimated with the LAR algorithm has a standard error of 0.018, which is smaller than the standard error of 0.025 for ordinary binary logistic regression. This indicates that, on the stunting risk data, the binary logistic regression model using the LAR algorithm performs better than the usual binary logistic regression model. Based on these results, the variables that significantly affect the risk of stunting in two-year-old babies in 2019 are father's height, birth length, exclusive breastfeeding, history of infectious diseases, and history of immunization.
Determining Factors that Influence Unmet Need For Family Planning Using Geographically Weighted Logistic Regression With LASSO
Dian Ayu Permata Sari Rusdy; Sri Astuti Thamrin; Anna Islamiyati
Jurnal Matematika, Statistika dan Komputasi Vol. 21 No. 2 (2025): JANUARY 2025
Publisher : Department of Mathematics, Hasanuddin University

DOI: 10.20956/j.v21i2.35081

Abstract

Binary logistic regression is used for categorical response variables with two possible outcomes: success or failure. It is a global model, which makes it inappropriate for spatial data. Binary logistic regression was therefore extended to geographically weighted logistic regression (GWLR), which brings location into the model through a weight function. GWLR, however, cannot handle multicollinearity. Multicollinearity can cause the estimated parameters to be insignificant, so it needs to be addressed. One method for dealing with multicollinearity is the least absolute shrinkage and selection operator (LASSO). LASSO is applicable in various areas, including health, for example in the case of unmet need for family planning (FP). Unmet need for FP refers to women of productive age who do not wish to have more children, or who wish to postpone having children, but are not using any contraceptive method. This study aims to obtain a GWLR model with LASSO, identify the influential factors, and assess the performance of the GWLR model with LASSO on unmet need for FP in South Sulawesi. The AIC value of the GWLR model with LASSO, 31,918, is lower than the AIC value of the GWLR model without LASSO, 38,879. This implies that GWLR with LASSO models unmet need for FP better than GWLR alone. In addition, in 22 districts/cities the status of unmet need for FP was affected by the percentage of women with junior high school education or equivalent or lower, the number of high-fertility women, the percentage of husbands/families who refuse family planning, and the number of FP staff, while in 2 districts/cities it was determined by the number of high-fertility women, the percentage of husbands/families who refuse family planning, and the number of FP staff.
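
The AIC comparison described in this abstract can be illustrated with a minimal sketch. The log-likelihoods and parameter counts below are hypothetical toy values chosen only to show the mechanics; they are not taken from the study, and the helper names are assumptions.

```python
from typing import Dict


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def prefer_by_aic(scores: Dict[str, float]) -> str:
    """Return the name of the model with the lowest AIC."""
    return min(scores, key=scores.get)


# Hypothetical fits: a penalized model often keeps fewer parameters.
candidates = {
    "GWLR": aic(log_likelihood=-15.0, n_params=5),
    "GWLR with LASSO": aic(log_likelihood=-13.0, n_params=3),
}
best = prefer_by_aic(candidates)
print(candidates, best)
```

A lower AIC indicates a better trade-off between fit and model size, which is the sense in which the abstract prefers the model with LASSO.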
MEASUREMENT OF CLASSIFICATION PERFORMANCE WITH THE LEARNING VECTOR QUANTIZATION METHOD ON COVID-19 VACCINATION DATA AT THE PARUMPANAI HEALTH CENTER
PRANANDA, ADHIYAKSA; Siswanto, Siswanto; Thamrin, Sri Astuti; Siddik, A. Muh. Amil
Jurnal Matematika UNAND Vol. 13 No. 2 (2024)
Publisher : Departemen Matematika dan Sains Data FMIPA Universitas Andalas Padang

DOI: 10.25077/jmua.13.2.131-141.2024

Abstract

In the midst of the COVID-19 pandemic, countries have been working to restore global stability. One effective measure has been the development of several vaccines to prevent transmission of the virus. Indonesia is one of the countries that has aggressively implemented COVID-19 vaccination. The vaccination program, carried out from February 2021 until the end of 2021, covered approximately 160 million people, or 76.83% of the target set by the government. Vaccine recipients must meet certain criteria in order to avoid side effects or complications, so it is necessary to classify people into those who can receive the vaccine and those whose vaccination should be delayed. This research aims to determine the performance of the learning vector quantization classification method. Classification with learning vector quantization produces 95% accuracy, 97% precision, and 96% sensitivity. From these performance measurements, it can be concluded that the learning vector quantization method performs very well and can be used to classify COVID-19 vaccination recipients at the Parumpanai Public Health Center, East Luwu Regency.
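
For readers unfamiliar with the three performance measures quoted in this abstract, the following sketch shows how they are derived from a binary confusion matrix. The counts are hypothetical and are not the study's data; the function name is an assumption.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, and sensitivity (recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "precision": tp / (tp + fp),     # share of predicted positives that are correct
        "sensitivity": tp / (tp + fn),   # share of actual positives that are found
    }


# Hypothetical counts for a "can be vaccinated" vs "delay vaccination" classifier.
metrics = classification_metrics(tp=48, fp=2, fn=3, tn=47)
print(metrics)
```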
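KLASIFIKASI FAKTOR-FAKTOR PENYEBAB PENYAKIT DIABETES MELITUS DI RUMAH SAKIT UNHAS MENGGUNAKAN ALGORITMA C4.5 [Classification of the Factors Causing Diabetes Mellitus at Unhas Hospital Using the C4.5 Algorithm]
Dewi Rahma Ente; Sri Astuti Thamrin; Samsul Arifin; Hedi Kuswanto; Andreza Andreza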
Indonesian Journal of Statistics and Applications Vol 4 No 1 (2020)
Publisher : Departemen Statistika, IPB University dengan Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v4i1.330

Abstract

Diabetes mellitus (DM) is one of the chronic and deadly diseases widely observed in many countries today, and its prevalence continues to rise to a very alarming level. This study aims to identify the factors that influence DM and to examine the relationships between them. The method used is the C4.5 algorithm, one of the algorithms used for predictive classification. Classification is a data mining process that aims to find patterns in relatively large data sets, here represented in the form of decision trees. The method is applied to medical records of patients with DM from 2014 to 2018, taken from the Hasanuddin University Teaching Hospital. The results indicate that four factors influence the prediction of a patient's DM status: fasting blood glucose (abbreviated GDP, from the Indonesian "gula darah puasa"), LDL cholesterol, triglycerides, and body weight.
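
C4.5 chooses the attribute to split on by its gain ratio (information gain divided by split information). The sketch below illustrates that criterion for categorical attributes only, on a hypothetical four-patient toy data set; it is not the study's data or code, and handling of continuous attributes by thresholds is omitted.

```python
from collections import Counter
from math import log2


def entropy(values) -> float:
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels) -> float:
    """C4.5 split criterion: information gain divided by split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class labels after splitting on the feature.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records.
status = ["DM", "DM", "no DM", "no DM"]
glucose = ["high", "high", "normal", "normal"]   # separates the classes perfectly
ward = ["A", "B", "A", "B"]                      # carries no information
print(gain_ratio(glucose, status), gain_ratio(ward, status))
```

The attribute with the larger gain ratio would be placed nearer the root of the decision tree.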
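PENENTUAN FAKTOR-FAKTOR POTENSIAL YANG MEMPENGARUHI KEJADIAN MALARIA DI PROVINSI PAPUA DENGAN EPIDEMIOLOGI SPASIAL [Determining the Potential Factors Affecting Malaria Incidence in Papua Province Using Spatial Epidemiology]
Siswanto Siswanto; Sri Astuti Thamrin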
Indonesian Journal of Statistics and Applications Vol 4 No 3 (2020)
Publisher : Departemen Statistika, IPB University dengan Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v4i3.681

Abstract

In Indonesia, malaria is widespread across all islands, with varying degrees and severity of infection. Based on the Annual Parasite Incidence (API), malaria has a high incidence rate in Eastern Indonesia. The three provinces with the highest APIs are Papua (42.64%), West Papua (38.44%), and East Nusa Tenggara (16.37%). Spatial aspects are important to study because the spread of mosquito-borne disease is strongly influenced by a fluctuating climate. The purpose of this study is to determine the potential factors that influence the incidence of malaria in the province of Papua in 2013 by examining the aspects that are the focus of spatial epidemiology. The methods used to analyze the area are Simultaneous Autoregressive (SAR) and Conditional Autoregressive (CAR) models with a spatial weighting matrix up to second order. Based on the lowest AIC value, obtained for the second-order CAR model, the results show that average monthly wind velocity, average monthly rainfall, and malaria treatment with ACT drugs from the government program are substantial factors in determining the number of malaria cases in Papua. The SAR model, in this case, shows no spatial influence. Knowing the potential factors that influence the incidence of malaria, Papua Province, through its Health Office, can take more effective preventive measures to reduce the number of malaria cases.
Exploration of Obesity Status of Indonesia Basic Health Research 2013 With Synthetic Minority Over-Sampling Techniques: Eksplorasi Status Obesitas Riset Kesehatan Dasar 2013 Indonesia dengan Teknik Synthetic Minority Over-Sampling
Sri Astuti Thamrin; Dian Sidik; Hedi Kuswanto; Armin Lawi; Ansariadi Ansariadi
Indonesian Journal of Statistics and Applications Vol 5 No 1 (2021)
Publisher : Departemen Statistika, IPB University dengan Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v5i1p75-91

Abstract

The accuracy of class labels is very important in classification with a machine learning approach. The more accurate the data sets and classes, the better the output generated by machine learning. In practice, classification can suffer from class imbalance, in which the classes do not make up equal portions of the data set. Such imbalance affects classification accuracy. One of the easiest ways to correct imbalanced classes is to balance them. This study aims to explore the problem of class imbalance in a moderately imbalanced data set and to address that imbalance. The Synthetic Minority Over-Sampling Technique (SMOTE) is used to overcome class imbalance in obesity status in the Indonesia 2013 Basic Health Research (RISKESDAS) data. The results show that the obese class makes up 13.9% of the data and the non-obese class 84.6%, which means the class imbalance is moderate. Moreover, SMOTE with 600% over-sampling raises the size of the minority class (obesity), so that the obesity status classes become balanced. SMOTE therefore performed better than the analysis without SMOTE in exploring obesity status in the Indonesia RISKESDAS 2013 data.
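
The idea behind SMOTE can be sketched in a few lines: each minority sample is paired with one of its nearest minority neighbours, and a synthetic point is drawn on the line segment between them. The sketch below is a simplified, assumption-laden illustration (numeric features only, a single interpolation factor per synthetic point, hypothetical toy data), not the procedure or data used in the study.

```python
import random


def smote(minority, percent=600, k=2, seed=0):
    """Generate synthetic minority samples by interpolating towards near neighbours.

    percent=600 creates 6 synthetic points per original minority sample.
    """
    rng = random.Random(seed)
    per_sample = percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority neighbours of x by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbours = others[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical minority-class points with two numeric features.
minority_points = [(1.0, 1.0), (2.0, 1.5), (1.5, 3.0)]
new_points = smote(minority_points, percent=600)
print(len(new_points))
```

With 600% over-sampling, a minority class of roughly 14% grows to about seven times its original size, which is why the classes end up close to balanced.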
SENTIMENT ANALYSIS OF MERDEKA BELAJAR KAMPUS MERDEKA POLICY USING SUPPORT VECTOR MACHINE WITH WORD2VEC
Rezki, Nurul; Thamrin, Sri Astuti; Siswanto, Siswanto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 1 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss1pp0481-0486

Abstract

Sentiment analysis is a form of text analysis that classifies data into positive and negative sentiments. This study aims to classify sentiment about the Merdeka Belajar Kampus Merdeka policy on Twitter using the support vector machine algorithm with Word2Vec feature extraction. The support vector machine is a classification algorithm that separates data classes using the optimal hyperplane. Text data used in sentiment analysis must first be converted to numerical form through feature extraction. In this study, the feature extraction used is Word2Vec, which represents words as vectors. The data are 10,000 tweets containing the keyword "Kampus Merdeka" posted on Twitter. After text preprocessing, 1,579 tweets were used for sentiment analysis. The classification model achieved an accuracy of 89.87%, precision of 91.20%, recall of 84.44%, and F-measure of 87.68%. Sentiment classification using the support vector machine with Word2Vec feature extraction thus produces a good model in this study.
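
The F-measure reported in this abstract is the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes it from the two reported percentages; the function name is an assumption.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f_measure(91.20, 84.44)
print(round(f1, 2))
```

The recomputed value agrees with the reported 87.68% to within rounding of the inputs.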
SPATIAL MODELING IN DATA PANELS WITH LEAST SQUARE DUMMY VARIABLE TO IDENTIFY FACTORS AFFECTING UNEMPLOYMENT IN INDONESIA
Amil, Sri Indriani; Thamrin, Sri Astuti; Siswanto, Siswanto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 3 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss3pp1381-1392

Abstract

Unemployment is a serious issue that must be addressed. It has a negative impact on the national economy and makes economic growth unpredictable. In 2015, Indonesia had the third-highest unemployment rate in ASEAN. The unemployment rate in each province of Indonesia is thought to be influenced by the surrounding provinces. Therefore, spatial modelling on panel data with Least Square Dummy Variable (LSDV) is needed to identify the factors that influence unemployment in Indonesia. The data used are the Open Unemployment Rate (OUR) and its candidate influencing factors, namely population, average length of schooling, Gross Regional Domestic Product (GRDP) rate, and Human Development Index (HDI), in 34 provinces of Indonesia from 2015 to 2020. The appropriate spatial panel data models with LSDV for the OUR data are the Spatial Autoregressive (SAR) model and the Spatial Error Model (SEM). The SAR model with fixed effects has an R² value of 90.289%, which is greater than that of the SEM model with fixed effects (82.708%) and the LSDV model (87.864%). The root mean square error of the SAR model with fixed effects is 0.58951, lower than that of the SEM model with fixed effects (0.78669) and the LSDV model (0.65903). The best model is therefore the SAR model with fixed effects. Based on this model, the factors that influence OUR in Indonesia from 2015 to 2020 are the GRDP rate and HDI.
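
The model comparison in this abstract rests on two goodness-of-fit figures: the root mean square error and a percentage that is presumably a coefficient of determination. The sketch below shows how both are computed from observed and fitted values; the numbers are hypothetical and unrelated to the study's data.

```python
from math import sqrt


def rmse(observed, predicted) -> float:
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical unemployment rates (percent) and fitted values.
observed_rate = [5.0, 6.0, 7.0, 8.0]
fitted_rate = [5.5, 5.5, 7.5, 7.5]
print(rmse(observed_rate, fitted_rate), r_squared(observed_rate, fitted_rate))
```

A model is preferred when its error is lower and its explained share of variance is higher, which is the pattern the abstract reports for the SAR model with fixed effects.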
COMPARISON OF SUPPORT VECTOR MACHINE BASED ON FASTTEXT WITHOUT AND WITH FIREFLY OPTIMIZATION PARAMETERS FOR DISASTER SENTIMENT ANALYSIS IN INDONESIA
Adhel, Fadilah Amirul; Thamrin, Sri Astuti; Siswanto, Siswanto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 18 No 3 (2024): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol18iss3pp1791-1802

Abstract

Sentiment analysis is a process for analyzing the opinions, sentiments, assessments, and emotions in someone's statements about a domain; it is also a process for entering and processing data in the form of text. The support vector machine (SVM) is a supervised machine learning technique that separates two classes of data. The numerical vectors that the SVM works on are obtained using fastText. SVM does not choose appropriate parameters by itself, so the parameters used may not be optimal. To obtain optimal parameters and better classification results, firefly optimization was carried out. This research compares the fastText-based SVM method without and with firefly parameter optimization, using tweets with the keyword "Indonesian disaster" crawled from Twitter. The fastText representation uses 128 dimensions to form the weights of each word; that is, each word is represented in a 128-dimensional vector space. The evaluation of the SVM classification model with and without firefly optimization gives accuracies of 89.1% and 61.3%, respectively. This shows that the SVM classification method with firefly optimization provides considerably better classification performance than the SVM model without optimization.
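
The firefly optimization mentioned in this abstract is a population-based search in which dimmer fireflies move towards brighter (better) ones. The following is a minimal sketch of that idea under stated assumptions: the objective is a hypothetical stand-in for an SVM validation error over two log-scaled parameters, and all names, bounds, and settings are illustrative rather than the study's implementation.

```python
import math
import random


def firefly_minimize(objective, bounds, n_fireflies=15, n_iter=50,
                     alpha=0.2, beta0=1.0, absorption=1.0, seed=0):
    """Minimize `objective` over box `bounds`; returns (best_x, best_cost, initial_best)."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]
    start = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_cost = list(swarm[start]), cost[start]
    initial_best = best_cost
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves towards it
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-absorption * r2)  # attractiveness decays with distance
                    for d, (lo, hi) in enumerate(bounds):
                        step = beta * (swarm[j][d] - swarm[i][d]) + alpha * (rng.random() - 0.5)
                        swarm[i][d] = min(hi, max(lo, swarm[i][d] + step))
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:  # remember the best position ever visited
                        best_x, best_cost = list(swarm[i]), cost[i]
    return best_x, best_cost, initial_best


def toy_validation_error(params):
    """Hypothetical stand-in for SVM validation error over (log10 C, log10 kernel width)."""
    log_c, log_width = params
    return (log_c - 1.0) ** 2 + (log_width + 2.0) ** 2


search_bounds = [(-3.0, 3.0), (-4.0, 1.0)]
best_params, best_error, start_error = firefly_minimize(toy_validation_error, search_bounds)
print(best_params, best_error)
```

In a real tuning run the objective would train and validate an SVM for each candidate parameter pair, which is far more expensive than this toy function.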
Application of Adaptive Synthetic Nominal and Extreme Gradient Boosting Methods in Determining Factors Affecting Obesity: A Case Study of Indonesian Basic Health Research Survey 2013: Aplikasi Metode Adaptive Synthetic Nominal dan Extreme Gradient Boosting dalam Menentukan Faktor yang Memengaruhi Obesitas: Studi Kasus Riset Kesehatan Dasar Indonesia 2013
Rombe, Yoris; Thamrin, Sri Astuti; Lawi, Armin
Indonesian Journal of Statistics and Applications Vol 6 No 2 (2022)
Publisher : Statistics and Data Science Program Study, IPB University, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v6i2p309-317

Abstract

Obesity is the accumulation of excessive body fat and can be harmful to health. According to recent studies, factors that contribute to the increasing prevalence of obesity in Indonesia include poor diet, low consumption of vegetables and fruits, high consumption of fast food, area of residence, and lack of physical activity. In addition, psychological factors, high consumption of alcohol and cigarettes, cultural differences, and stress also trigger obesity. The rapid development of the medical field is closely tied to the availability of increasingly accessible data and to growing medical knowledge. This makes machine learning increasingly necessary for recognizing patterns in very large medical data sets, including obesity data. This study determines the factors that influence obesity status in Indonesia. To achieve this, Extreme Gradient Boosting (XGBoost) was used. This classification method has better scalability and is more efficient than earlier methods. In addition, to overcome imbalanced data, the Adaptive Synthetic Nominal algorithm (ADASYN-N) is used to balance the data and improve prediction accuracy. Both ADASYN-N and XGBoost are applied to obesity data from the Indonesian Basic Health Research Survey in 2013. The study shows that, based on the highest gain value (37%), being female is the factor most associated with risk in determining obesity status in Indonesia. In addition, age 35-54 years, strenuous activity, and eating vegetables on 6 days are also risk factors for obesity.