Pendugaan Koefisien Regresi Logistik Biner Menggunakan Algoritma Least Angle Regression [Estimating Binary Logistic Regression Coefficients Using the Least Angle Regression Algorithm]
Utami, Mamik; Islamiyati, Anna; Thamrin, Sri Astuti
ESTIMASI: Journal of Statistics and Its Application Vol. 5, No. 1, January 2024
Publisher : Hasanuddin University

DOI: 10.20956/ejsa.v5i1.12489

Abstract

Binary logistic regression is a statistical method for determining the relationship between a response variable with two categories and predictor variables that are categorical or continuous. The logistic regression parameters are estimated by Maximum Likelihood Estimation (MLE). During estimation, the Least Angle Regression (LAR) algorithm is used to select the significant variables, so that the best model is obtained from the estimated binary logistic regression coefficients. The LAR algorithm is applied to data on the risk of stunting in two-year-old babies in the working area of the Buntu Batu Health Center, Enrekang Regency, South Sulawesi, in 2019. The binary logistic regression prediction model estimated with the LAR algorithm has a standard error of 0.018, smaller than the standard error of ordinary binary logistic regression, which is 0.025. This shows that the binary logistic regression model using the LAR algorithm performs better than the usual binary logistic regression model on the stunting risk data. Based on the results, the variables that significantly affect the risk of stunting in two-year-old babies in 2019 are father's height, birth length, exclusive breastfeeding, history of infectious diseases, and history of immunization.
Determining Factors that Influence Unmet Need For Family Planning Using Geographically Weighted Logistic Regression With LASSO
Dian Ayu Permata Sari Rusdy; Sri Astuti Thamrin; Anna Islamiyati
Jurnal Matematika, Statistika dan Komputasi Vol. 21 No. 2 (2025): JANUARY 2025
Publisher : Department of Mathematics, Hasanuddin University

DOI: 10.20956/j.v21i2.35081

Abstract

Binary logistic regression is used for categorical response variables with two possible outcomes: success or failure. It is a global model, which makes it inappropriate for spatial data. Binary logistic regression was therefore developed into geographically weighted logistic regression (GWLR), which incorporates location into the model through a weight function. Nevertheless, GWLR cannot overcome multicollinearity. Multicollinearity can cause the estimated parameters to be insignificant, so it needs to be addressed. One method for dealing with multicollinearity is the least absolute shrinkage and selection operator (LASSO). LASSO is applicable in various areas, including health, as in the case of unmet need for family planning (FP). Unmet need for FP refers to women of reproductive age who do not wish to have more children, or who wish to postpone having children, but are not using a contraceptive method. This study aims to obtain a GWLR model with LASSO, identify the influential factors, and assess the performance of the GWLR model with LASSO on unmet need for FP in South Sulawesi. The AIC value of the GWLR model with LASSO, 31,918, is lower than that of the GWLR model without LASSO, 38,879. This implies that GWLR with LASSO models unmet need for FP better than GWLR alone. In addition, the status of unmet need for FP in 22 districts/cities was affected by the percentage of women with junior high school education or equivalent or lower, the number of high-fertility women, the percentage of husbands/families who refuse family planning, and the number of FP staff, while in 2 districts/cities it was determined by the number of high-fertility women, the percentage of husbands/families who refuse family planning, and the number of FP staff.
MEASUREMENT OF CLASSIFICATION PERFORMANCE WITH THE LEARNING VECTOR QUANTIZATION METHOD ON COVID-19 VACCINATION DATA AT THE PARUMPANAI HEALTH CENTER
Prananda, Adhiyaksa; Siswanto, Siswanto; Thamrin, Sri Astuti; Siddik, A. Muh. Amil
Jurnal Matematika UNAND Vol. 13 No. 2 (2024)
Publisher : Departemen Matematika dan Sains Data FMIPA Universitas Andalas Padang

DOI: 10.25077/jmua.13.2.131-141.2024

Abstract

In the midst of the COVID-19 pandemic, countries have been working to restore global stability. One effective measure has been the development of several vaccines to prevent transmission of the virus. Indonesia is one of the countries that has aggressively implemented COVID-19 vaccination. The vaccination campaign, carried out from February 2021 until the end of 2021, covered approximately 160 million people, or 76.83% of the target set by the government. Vaccine recipients must meet certain criteria in order to avoid side effects or complications. It is therefore necessary to classify people into those who can receive the vaccine and those whose vaccination should be delayed. This research aims to determine the performance of the learning vector quantization classification method. Classification with learning vector quantization produces 95% accuracy, 97% precision, and 96% sensitivity. From these performance measurements, it can be concluded that the learning vector quantization method performs very well and can be used to classify COVID-19 vaccination recipients at the Parumpanai Public Health Center, East Luwu Regency.
SENTIMENT ANALYSIS OF MERDEKA BELAJAR KAMPUS MERDEKA POLICY USING SUPPORT VECTOR MACHINE WITH WORD2VEC
Rezki, Nurul; Thamrin, Sri Astuti; Siswanto, Siswanto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 1 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss1pp0481-0486

Abstract

Sentiment analysis is a form of text analysis that classifies data into positive and negative sentiments. This study aims to classify sentiment about the Merdeka Belajar Kampus Merdeka policy on Twitter using the support vector machine algorithm with Word2Vec feature extraction. The support vector machine is a classification algorithm that separates data classes using the optimum hyperplane. Text data used in sentiment analysis must be converted into numerical form through feature extraction. In this study, the feature extraction method is Word2Vec, which represents words as vectors. The data consist of 10,000 tweets containing the keyword "Kampus Merdeka" posted on Twitter. After preprocessing, 1579 tweets were used for sentiment analysis. The classification model achieved an accuracy of 89.87%, precision of 91.20%, recall of 84.44%, and F-Measure of 87.68%. Sentiment classification using the support vector machine with Word2Vec feature extraction therefore produces a good model in this study.
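The F-Measure quoted in this abstract can be related to the quoted precision and recall. The sketch below assumes that "F-Measure" means the standard F1 score, the harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f_measure` is hypothetical. Because the inputs are the rounded percentages from the abstract, the result agrees with the reported figure only to within rounding.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed here to be the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Rounded figures quoted in the abstract, expressed as fractions.
reported_precision = 0.9120
reported_recall = 0.8444

f1 = f_measure(reported_precision, reported_recall)
print(f"F-Measure from the rounded inputs: {f1:.2%}")
```

Recomputing from rounded inputs gives a value within about 0.01 percentage points of the reported 87.68%, which is consistent with the reported metrics having been derived from unrounded counts.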
SPATIAL MODELING IN DATA PANELS WITH LEAST SQUARE DUMMY VARIABLE TO IDENTIFY FACTORS AFFECTING UNEMPLOYMENT IN INDONESIA
Amil, Sri Indriani; Thamrin, Sri Astuti; Siswanto, Siswanto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 3 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss3pp1381-1392

Abstract

Unemployment is a serious issue that must be addressed. It has a negative impact on the national economy and makes economic growth unpredictable. In 2015, Indonesia had the third-highest unemployment rate in ASEAN. The unemployment rate in each province of Indonesia is thought to be influenced by the surrounding provinces. Therefore, spatial modelling on panel data with Least Square Dummy Variable (LSDV) is needed to identify the factors that influence unemployment in Indonesia. The data used are the Open Unemployment Rate (OUR) and its candidate influencing factors, namely population, average length of schooling, Gross Regional Domestic Product (GRDP) rate, and Human Development Index (HDI), in 34 provinces of Indonesia from 2015 to 2020. The spatial panel models with LSDV that are appropriate for the OUR data are the Spatial Autoregressive Model (SAR) and the Spatial Error Model (SEM). The SAR model with fixed effects has an R² value of 90.289%, which is greater than that of the SEM model with fixed effects (82.708%) and the LSDV model (87.864%). The root mean square error of the SAR model with fixed effects is 0.58951, lower than that of the SEM model with fixed effects (0.78669) and the LSDV model (0.65903). The best model is therefore the SAR model with fixed effects. Based on this model, the factors that influenced OUR in Indonesia from 2015 to 2020 are the GRDP rate and HDI.
COMPARISON OF SUPPORT VECTOR MACHINE BASED ON FASTTEXT WITHOUT AND WITH FIREFLY OPTIMIZATION PARAMETERS FOR DISASTER SENTIMENT ANALYSIS IN INDONESIA
Adhel, Fadilah Amirul; Thamrin, Sri Astuti; Siswanto, Siswanto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 18 No 3 (2024): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol18iss3pp1791-1802

Abstract

Sentiment analysis is a process for analyzing the opinions, sentiments, assessments, and emotions expressed in someone's statements about a domain; it is also a process for ingesting and processing data in the form of text. The support vector machine (SVM) is a supervised machine learning technique that separates two classes of data. Here the SVM operates on numerical vectors obtained using fastText. SVM does not select its own parameters, so the parameters used may not be optimal. To obtain optimal parameters and better classification results, firefly optimization was carried out. This research compares the fastText-based SVM method without and with firefly-optimized parameters, using tweets with the keyword "Indonesian disaster" crawled using the Twitter application. The word weights obtained have 128 dimensions, meaning that each word is represented in a 128-dimensional vector space. The SVM classification model achieves an accuracy of 89.1% with firefly optimization and 61.3% without it. This shows that the SVM classification method with firefly optimization provides considerably better classification performance than the SVM model without optimization.
Application of Adaptive Synthetic Nominal and Extreme Gradient Boosting Methods in Determining Factors Affecting Obesity: A Case Study of Indonesian Basic Health Research Survey 2013
Rombe, Yoris; Thamrin, Sri Astuti; Lawi, Armin
Indonesian Journal of Statistics and Applications Vol 6 No 2 (2022)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v6i2p309-317

Abstract

Obesity is the accumulation of excessive body fat and can be harmful to health. According to recent studies, factors that contribute to the increasing prevalence of obesity in Indonesia include poor diet, low consumption of vegetables and fruits, high consumption of fast food, area of residence, and lack of physical activity. In addition, psychological factors, high consumption of alcohol and cigarettes, cultural differences, and stress also trigger obesity. The rapid development of the medical field cannot be separated from the growing availability of easily accessible data and from increasing medical knowledge. This makes machine learning increasingly necessary for recognizing patterns in very large medical data sets, including obesity data. This study determines the factors that influence obesity status in Indonesia. To achieve this, Extreme Gradient Boosting (XGBoost) was used. This classification method offers better scalability and efficiency than its predecessors. In addition, to handle the imbalanced data, the Adaptive Synthetic Nominal algorithm (ADASYN-N) is used to balance the data and improve prediction accuracy. Both ADASYN-N and XGBoost are applied to obesity data from the Indonesian Basic Health Research Survey in 2013. The study shows that being female is the most important risk factor in determining obesity status in Indonesia, based on the highest gain value (37%). In addition, age 35-54 years, strenuous activity, and eating vegetables for 6 days are also risk factors for obesity.
Naive Bayes Algorithm with Feature Selection Using Particle Swarm Optimization
Siswanto, Siswanto; Kurniawan, Iwan; Thamrin, Sri Astuti
Jurnal Varian Vol. 7 No. 2 (2024)
Publisher : Universitas Bumigora

DOI: 10.30812/varian.v7i2.2409

Abstract

The COVID-19 vaccine in Indonesia has given rise to public opinion expressed on social media such as Twitter. One analysis that can extract information from public opinion is sentiment analysis, which determines whether an opinion tends to be positive or negative. This study aims to classify public opinion on the COVID-19 vaccine in Indonesia using sentiment analysis and to visualize the locations associated with the sentiment of COVID-19 vaccine tweets in Indonesia. To achieve this aim, the Naïve Bayes algorithm with Particle Swarm Optimization (PSO) feature selection was used. The study classifies the opinions in 2,547 tweets related to the COVID-19 vaccine in Indonesia, posted from January to June 2021, into positive and negative sentiment classes. The results show 2,328 tweets in the positive class and 219 in the negative class. In addition, positive sentiment toward the COVID-19 vaccine was dominated by people on the island of Java, based on a random number matrix initialized by the PSO method. Classifying public opinion on Twitter with a combination of the Naïve Bayes algorithm and PSO feature selection gives accurate and optimal performance, with accuracy and F1 score values of 91.28% and 95.38%, respectively. The geo-spatial mapping showed that positive sentiment related to the COVID-19 vaccine exists in almost all regions of Indonesia but is dominated by the Jabodetabek area.
Improved Chi Square Automatic Interaction Detection on Students Discontinuation to Secondary School
Al Anshory, Fadhil; Siswanto, Siswanto; Thamrin, Sri Astuti; Inayah, Ika
Jurnal Varian Vol. 7 No. 1 (2023)
Publisher : Universitas Bumigora

DOI: 10.30812/varian.v7i1.2627

Abstract

Improved Chi Square Automatic Interaction Detection (CHAID) with bias correction is a development of the CHAID method that relies on Tschuprow's T test with bias correction when forming the classification tree. This study aims to classify the factors that influence students not to continue their education from junior high school or equivalent to high school or equivalent. The resulting classification tree produces nine classifications. Based on the tree, the students who do not continue their education to high school or equivalent are: students with disabilities who do not have access to Information and Communication Technology (ICTs) (0.89); students who work, have no disability, and do not have access to ICTs (0.73); and students who do not work, have no disability, and do not have access to ICTs (0.60). Based on this classification, the factors that influence students not to continue their education to high school or equivalent are access to ICTs, employment status, and disability status. With 80% of the data used for training and 20% for testing, the Improved-CHAID method with bias correction achieves a classification accuracy of 72.3033% on the training data and a slightly higher 73.3300% on the testing data.
KLASIFIKASI FAKTOR-FAKTOR PENYEBAB PENYAKIT DIABETES MELITUS DI RUMAH SAKIT UNHAS MENGGUNAKAN ALGORITMA C4.5 [Classification of Factors Causing Diabetes Mellitus at Unhas Hospital Using the C4.5 Algorithm]
Ente, Dewi Rahma; Thamrin, Sri Astuti; Arifin, Samsul; Kuswanto, Hedi; Andreza, Andreza
Indonesian Journal of Statistics and Applications Vol 4 No 1 (2020)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v4i1.330

Abstract

Diabetes mellitus (DM) is one of the chronic and deadly diseases widely observed in many countries today, and its prevalence continues to rise to an alarming level. This study aims to identify the factors that influence DM and examine the relationships between them. The method used is the C4.5 algorithm, one of the algorithms used for predictive classification. Classification is a data mining process that aims to find patterns in relatively large data sets; here the patterns are represented as decision trees. The method is applied to medical records of patients with DM from 2014 to 2018, taken from the Hasanuddin University Teaching Hospital. The results indicate that four factors influence the prediction of a patient's DM status: fasting blood glucose (GDP, from the Indonesian "gula darah puasa"), LDL cholesterol, triglycerides, and body weight.