Contact Name
Dr. Muhammad Ahsan
Contact Email
muh.ahsan@its.ac.id
Phone
+6281331551312
Journal Mail Official
inferensi.statistika@its.ac.id
Editorial Address
Department of Statistics Faculty of Science and Data Analytics Institut Teknologi Sepuluh Nopember (ITS) Kampus ITS Keputih Sukolilo Surabaya Indonesia 60111
Location
Kota Surabaya,
Jawa Timur
Indonesia
Inferensi
ISSN: 0216-308X     EISSN: 2721-3862     DOI: http://dx.doi.org/10.12962/j27213862
The aim of Inferensi is to publish original articles on statistical theory and novel applications in diverse research fields related to statistics and data science. Papers should contribute to the understanding of statistical methodology and/or develop and improve statistical methods; any mathematical theory, and any approach in data science, should be directed towards these aims. The kinds of contribution considered include descriptions of new methods of collecting or analysing data, with the underlying theory, an indication of the scope of application and preferably a real example. Also considered are comparisons, critical evaluations and new applications of existing methods, contributions to probability theory which have a clear practical bearing (including the formulation and analysis of stochastic models), statistical computation or simulation where original methodology is involved, and original contributions to the foundations of statistical science. The journal also occasionally publishes review and expository articles on specific topics, which are expected to provide valuable information for researchers interested in the selected fields. The journal contributes to broadening the coverage of statistics and data analysis by publishing articles based on innovative ideas. It is also unique in combining traditional statistical science with the relatively new field of data science. All articles are refereed by experts.
Articles 147 Documents
Comparison of Logistic Regression and Support Vector Machine in Predicting Stroke Risk Safitri, Lensa Rosdiana; Chamidah, Nur; Saifudin, Toha; Firmansyah, Mochammad; Alpandi, Gaos Tipki
Inferensi Vol 7, No 2 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i2.20420

Abstract

Health is the third goal of Indonesia's Sustainable Development Goals (SDGs), which is stated as ensuring healthy lives and promoting well-being for all people at all ages. One of the concerns of the SDGs is deaths caused by non-communicable diseases (NCDs), including stroke. One preventive measure is to predict stroke risk for early detection. Both statistical methods and machine learning methods are available for this purpose. In this research, we compare a statistical method, Logistic Regression (LR), with a machine learning method, Support Vector Machine (SVM), for stroke risk prediction. The data used are primary data from Universitas Airlangga Hospital (RSUA), collected from June to August 2023. We use Python for all analyses. The results show that SVM with a radial basis kernel is better than LR at predicting stroke risk according to three goodness criteria, namely sensitivity, F1 score, and accuracy: the values of all three criteria are greater for SVM than for LR.
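The three goodness criteria named in this abstract can be computed from confusion-matrix counts. The following is a minimal sketch, not the authors' code; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (sensitivity, f1, accuracy) from confusion-matrix counts."""
    # Sensitivity (recall): share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of predicted positives that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 score: harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    # Accuracy: share of all cases classified correctly.
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, f1, accuracy


# Hypothetical counts for illustration only.
sens, f1, acc = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(sens, f1, acc)
```

Comparing two classifiers on these criteria then amounts to computing the triple for each model on the same test set.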
Determinants of PM2.5 Concentration in DKI Jakarta Province: A VAR Model Approach Jayadri, Bertolomeus Laksana; Pangastuti, Mafitroh; Farhan, Muh; Kartiasih, Fitri
Inferensi Vol 7, No 1 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i1.19843

Abstract

Air pollution in DKI Jakarta Province is a serious issue because of its links to public health and environmental concerns. This research therefore aims to analyze the causality between PM2.5 concentration and meteorological factors such as air temperature, humidity, rainfall, and wind speed. The data come from the MERRA-2 satellite, which provides information at a spatial resolution of 0.5° × 0.625°. The data cover the period from January 1, 1980, to November 1, 2023, at hourly intervals. PM2.5 concentration is the response variable, and air temperature, humidity, rainfall, and wind speed are the predictor variables. The analytical method employed is the Vector Autoregressive (VAR) approach, as all variables are stationary in levels. The constructed VAR model suggests that meteorological variables have low explanatory power for PM2.5 concentration, while changes in PM2.5 concentration itself have sustained impacts in both the short and long term. This suggests that fluctuations in PM2.5 concentration in DKI Jakarta Province are not significantly influenced by meteorological factors.
Risk Analysis Forecasting Models of Poisson Regression, Negative Binomial Regression, Poisson GSARIMA, and Negative Binomial GSARIMA (Case Study: Number of Bicycle Sales) Pramujati, Windya Harieska
Inferensi Vol 7, No 2 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i2.20255

Abstract

The Poisson model is a model that can be applied to count data, where in this research the case study used is the number of bicycle sales. However, there is an equidispersion assumption in the Poisson model, that the response variable has the same mean and variance. A more flexible model is needed if the equidispersion assumption is not met, namely the Negative Binomial model. In this research, two models were applied, namely the regression model and the GSARIMA model, with two different distributions, namely the Poisson distribution and the Negative Binomial distribution. Therefore the models that will be compared are the Poisson Regression, Negative Binomial Regression, Poisson GSARIMA, and Negative Binomial GSARIMA models. The differences in results for each model are due to errors that occur in each model used. Hence, a model with a smaller error can be said to be a model that has a smaller risk than other models. The results of this study show that the error rate in the Negative Binomial GSARIMA ZQ1 model is relatively smaller than other models with a value of AIC = 1058.7. This model is the best model that can be used as a forecasting model in the case of bicycle sales and can minimize the risk of error in a forecasting result.
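The equidispersion assumption mentioned in this abstract can be screened informally by comparing the sample variance with the sample mean. The sketch below is an illustration only, not the author's procedure; the function name and the counts are hypothetical and are not the bicycle sales data.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Ratio of sample variance to sample mean for count data.

    A value near 1 is consistent with equidispersion (Poisson);
    a value well above 1 points to overdispersion, where a
    Negative Binomial model is the usual alternative.
    """
    return variance(counts) / mean(counts)


# Hypothetical count data for illustration only.
counts = [2, 3, 0, 7, 12, 1, 4, 15, 0, 6]
print(dispersion_index(counts))
```

This ratio is only a descriptive check; a formal decision between the Poisson and Negative Binomial models would rest on fitted-model diagnostics such as the AIC reported in the abstract.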
Ensemble Cluster Method For Clustering Cabbage Production In East Java Maghfiro, Maulidya; Wardhani, Ni Wayan Surya; Iriany, Atiek
Inferensi Vol 7, No 2 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i2.20378

Abstract

Cluster analysis is a multivariate analysis method classified under interdependence methods, where explanatory variables are not differentiated from response variables. The methods used include hierarchical cluster analysis, such as agglomerative and divisive methods, and non-hierarchical methods such as Self-Organizing Maps (SOM) based on Artificial Neural Networks (ANN). Different cluster analysis methods often yield different solutions, making it challenging to determine the optimal one. The ensemble cluster method is therefore employed to combine various clustering solutions, without considering the initial data characteristics, in order to provide better results. One case study of clustering is the grouping of cabbage production. East Java Province is the third-highest cabbage-producing province in Indonesia, with a production of 210,454 tons. Cabbage-producing regencies/cities were clustered to optimize production and identify areas that have not yet reached their maximum potential. This study compares five clustering methods: hierarchical analysis (complete linkage, single linkage, average linkage), Self-Organizing Map (SOM), and Ensemble Cluster. The quality of clustering was evaluated using the Silhouette Coefficient (SC), Dunn Index (DI), and Connectivity Index (CI). The results indicate that the Ensemble Cluster method showed the best performance, with an SC value of 0.9124, a DI value of 1.3734, and a CI value of 2.9290, indicating excellent cluster separation. Therefore, the ensemble cluster method is recommended as the best clustering method in this study.
Quality Control of PCC Cement at PT Semen Bosowa Banyuwangi Using the Maximum Half-Normal Multivariate Control Chart (Max-Half-Mchart) Loka, I Melda Puspita; Khusna, Hidayatul; Aksioma, Diaz Fitra
Inferensi Vol 7, No 1 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i1.16247

Abstract

Cement is a building adhesive material produced from clinker, with calcium silicate as its main ingredient and gypsum as an additive. The Indonesian Cement Association (ASI) states that there will be an increase in domestic cement consumption of 5.5% in 2021. Competition in the industrial sector is quite tight, causing PT Semen Bosowa Banyuwangi to maintain and improve the quality of its products continuously. One of the steps taken is to check Blaine, residue, and free lime in the laboratory before the cement is distributed. Since there is more than one quality characteristic of Portland Composite Cement (PCC) and each quality characteristic is monitored every shift, the control chart used is a multivariate control chart for individual observations in the form of a Max-Half-Mchart. The Max-Half-Mchart for individual observations can effectively monitor the process mean and variability simultaneously. PCC cement quality control using the Max-Half-Mchart in phase I showed that the process was in statistical control. In phase II, there were out-of-control observations identified as a shift in the process mean. The multivariate process capability measurement gave an MC_pk value of 1.053, which means that the overall production of PCC cement complies with company regulations.
Modeling the Percentage of Tuberculosis Cure in Indonesia Using a Multivariate Adaptive Regression Spline Approach Novianti, Dita Aris; Marwanda, Nadia Dwi; Saifudin, Toha; Suliyanto, Suliyanto
Inferensi Vol 7, No 2 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i2.20344

Abstract

Tuberculosis (TB) is an infectious disease caused by the bacterium Mycobacterium tuberculosis. After India, Indonesia is the country with the second highest number of TB sufferers in the world. TB prevention efforts in Indonesia have been carried out since 1995. However, from 2006 to 2022 the TB cure rate in Indonesia generally showed a downward trend. It is therefore important to know which variables have a significant effect on the percentage of TB cures and what form the relationship takes. This information is urgently needed to optimize TB handling efforts and achieve Sustainable Development Goals (SDGs) point 3, which focuses on good health and well-being. For that purpose, this study used the Multivariate Adaptive Regression Spline (MARS) approach. MARS is considered more flexible in handling predictor variables that do not follow a particular pattern with respect to the response variable, and it can accommodate possible interactions between predictor variables. The best model was obtained at BF = 18, MI = 2, and MO = 0, with a minimum GCV value of 37.053 and an R^2 of 91.6%. The significant predictor variables are food management sites that meet the requirements according to standards, complete treatment, the smoking population over 15 years, families with healthy latrines, and districts/municipalities that implement the healthy living (Germas) policy. Given the significance of the nine predictors, priority should go to enhancing the quality of health services, for example by ensuring a fair distribution of complete treatment for TB patients.
Intrusion Detection Systems (IDSs) using Multivariate Control Chart Hotelling’s T2 with Dimensional Reduction of Factorial Analysis of Mixed Data (FAMD) and Autoencoder Rifki, Kevin Agung Fernanda; Rosyadi, Niam; Zenklinov, Amanatullah Pandu; Suhermi, Novri
Inferensi Vol 7, No 1 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i1.18751

Abstract

Traditional multivariate control charts for network intrusion detection encounter significant challenges, including false alarms due to non-conforming network data traffic distributions, limitations in identifying outlier intrusions caused by masking effects, and the handling of diverse data types. This paper introduces a T2-based multivariate control chart that leverages dimensional reduction techniques using Factor Analysis of Mixed Data (FAMD) and Autoencoder to address these issues. FAMD reduces data with both quantitative and qualitative variables, while Autoencoder focuses on dimensionality reduction for quantitative variables, enhancing multivariate control chart performance. The proposed chart, a modified T2 with dimensionality reduction through FAMD and Autoencoder, is compared to the conventional T2. Results from simulating data using UNSW-NB 15 demonstrate T2's superior performance with dimensionality reduction compared to conventional T2. Under various conditions, the conventional T2 control chart achieves 64% accuracy, T2 with FAMD achieves 74%, and T2 with Autoencoder reaches 76%. Notably, T2 with FAMD excels in detecting normal activity as intrusion compared to Autoencoder. This approach holds promise for improving network intrusion detection accuracy, especially in mixed-data environments.
Generalized Linear Mixed Models for Predicting Non-Life Insurance Claims Saputra, Kie Van Ivanky; Margaretha, Helena; Ferdinand, Ferry Vincenttius; Budhyanto, Johana Daniella
Inferensi Vol 7, No 2 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i2.20447

Abstract

Generalized linear mixed models (GLMMs) are an extension of linear mixed models that allows response variables from different distributions. Alternatively, GLMMs are an extension of generalized linear models (GLMs) that includes both fixed and random effects (hence mixed models), providing a modeling approach for nonlinear behaviors and non-Gaussian distributions of residuals. These models are very useful for general insurance claim predictions, where the frequency and severity distributions of claims are usually non-Gaussian. In our research, we compare the performance of GLMs and GLMMs in estimating the aggregate claims of auto insurance. The data used are a secondary dataset, the Australian motor vehicle dataset named ausprivauto0405. The results of our research suggest that the GLMM approach does not always give the best estimates, and in some cases GLMs outperform GLMMs. The accuracy of the models was compared, using R software, to choose the best model for determining pure insurance premiums. More investigation using different models is needed to establish which model is more appropriate for estimating aggregate insurance claims.
Modeling Life Expectancy Index in West Nusa Tenggara Province with Panel Regression Method Astuti, Alfira Mulya; Ashri, Erina Salsabila; Sabri, Sabri
Inferensi Vol 7, No 1 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i1.20148

Abstract

Health is a condition of total physical, mental, and social well-being, rather than simply the absence of disease or infirmity. One way to assess the development of the health sector in a region is through the life expectancy index (LEI). This study seeks to analyze the impact of average years of schooling, adjusted per capita expenditure, and the number of poor people on life expectancy in NTB Province from 2011 to 2020. The observation units are the 10 regencies/cities in NTB Province. The data were obtained from BPS NTB in a panel data format and processed using the panel regression method. The panel model selection indicates that the Random Effect Model is the most suitable for predicting life expectancy in NTB Province. Average years of schooling and adjusted per capita expenditure have a notable, positive impact on life expectancy in NTB Province. The number of poor people has a limited impact on life expectancy. Simultaneously, average years of schooling, adjusted per capita expenditure, and the number of poor people in NTB Province have a substantial impact on life expectancy.
Risk Factors for Lymphatic Filariasis in Endemic Areas of Papua Using Binary Logistic Regression Based on Synthetic Minority Over-sampling Technique Simangunsong, Sri Rohmanisa; Oktora, Siskarossa Ika
Inferensi Vol 7, No 2 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i2.20283

Abstract

Neglected tropical diseases (NTDs), such as lymphatic filariasis (LF), are a significant issue in Indonesia. The high percentage of LF in Papua highlights the urgency of addressing LF in the area due to its devastating impact on the health and economy of the poor. Moreover, imbalanced outcome variable categories are a common issue in logistic regression analysis using medical data. One of the solutions to this problem is using Synthetic Minority Over-sampling Technique (SMOTE). Therefore, this study aims to provide an overview of LF cases in endemic areas of Papua and identify the factors that influence its occurrence using binary logistic regression analysis and the SMOTE method. The data utilized was the LF diagnosis status of individuals in endemic areas of Papua Province, Indonesia as contained in the Riset Kesehatan Dasar (Riskesdas) 2018. It was found that the SMOTE approach in binary logistic regression analysis can be used to address data imbalance. The following factors are significant: sex, age, occupation, education level, use of mosquito bite preventive measures, use of latrines for defecation, and participation in Mass Drug Administration (MDA).
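The core idea behind SMOTE, as named in this abstract, is to create synthetic minority-class observations by interpolating between a minority observation and one of its nearest minority neighbours. The sketch below illustrates that idea for numeric features only; it is not the authors' implementation, the function name and data points are hypothetical, and categorical predictors such as those in the study would need a variant designed for them.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Generate one synthetic point from a list of numeric minority points."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
    # Pick one of the k nearest neighbours at random.
    neighbour = rng.choice(others[:k])
    # Interpolate at a random position on the segment base -> neighbour.
    gap = rng.random()
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


# Hypothetical minority-class points (two numeric features each).
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
print(smote_sample(minority, k=2, rng=random.Random(0)))
```

Repeating the call until the minority class reaches the desired size gives a balanced training set on which the logistic regression can then be fitted.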
