Contact Name
Hasih Pratiwi
Contact Email
hpratiwi@mipa.uns.ac.id
Phone
+6282134673512
Journal Mail Official
ijas@mipa.uns.ac.id
Editorial Address
Study Program of Statistics, Universitas Sebelas Maret, Surakarta 57126, Indonesia
Location
Kota Surakarta,
Jawa Tengah
INDONESIA
Indonesian Journal of Applied Statistics
ISSN : -     EISSN : 2621086X     DOI : https://doi.org/10.13057/ijas
Indonesian Journal of Applied Statistics (IJAS) is a journal published by the Study Program of Statistics, Universitas Sebelas Maret, Surakarta, Indonesia. The journal is published twice a year, in May and November. The editors accept scientific papers reporting research results, scientific studies, and problem-solving research that uses statistical methods. Submitted papers are reviewed to assess the soundness of their content and the quality of their technical writing.
Articles 62 Documents
K-Medoids Clustering and Mean-Value at Risk for Portfolio Optimization of Jakarta Islamic Index Stocks Puspaningsih, Eka Sri; I Maruddani, Di Asih; Tarno, Tarno
Indonesian Journal of Applied Statistics Vol 6, No 1 (2023)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v6i1.79231

Abstract

The portfolio problem is how to choose stocks and determine their weights so as to generate maximum return with minimal risk. Portfolios are formed by selecting stocks with different characteristics. K-Medoids clustering can be used to group data sets that contain outliers. The clustering results are validated with the Davies-Bouldin Index to determine the best number of clusters. Portfolio weights are determined with the Mean-VaR method, which takes the expected return into account while minimizing risk as measured by VaR. Stocks are grouped by Return on Assets, Return on Equity, Debt to Asset Ratio, and Debt to Equity Ratio. Clustering the Jakarta Islamic Index stocks yielded six portfolio constituents, selected by the highest expected return in each cluster: PTBA, ADRO, AKRA, EXCL, PTPP, and UNVR. The optimal Mean-VaR portfolio assigns weights of 0.46536 to PTBA, 0.24018 to AKRA, 0.25421 to EXCL, and 0.25392 to UNVR. ADRO and PTPP receive negative weights of -0.07775 and -0.13593, which indicates short selling. At the 95% confidence level, the portfolio VaR is 5.06%.
Keywords: Clustering; K-Medoids; Davies-Bouldin Index; Portfolio; Mean-VaR
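The reported weights can be sanity-checked with simple arithmetic. The sketch below is only an illustration: it assumes the usual budget constraint that portfolio weights sum to one, which the abstract does not state explicitly, and the variable names are hypothetical.

```python
# Weights as reported in the abstract for the Mean-VaR optimal portfolio.
weights = {
    "PTBA": 0.46536,
    "ADRO": -0.07775,
    "AKRA": 0.24018,
    "EXCL": 0.25421,
    "PTPP": -0.13593,
    "UNVR": 0.25392,
}

# Assumption: a fully invested portfolio, so the weights should sum to 1.
total_weight = sum(weights.values())

# Negative weights correspond to short positions.
short_positions = sorted(t for t, w in weights.items() if w < 0)

# Gross exposure counts long and short positions by absolute size.
gross_exposure = sum(abs(w) for w in weights.values())

print(f"sum of weights : {total_weight:.5f}")
print(f"short positions: {short_positions}")
print(f"gross exposure : {gross_exposure:.5f}")
```

The six weights add up to 0.99999, which is consistent with a sum of one after rounding to five decimals.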
Comparing Monthly Rainfall Prediction in West Sumatra Using SARIMA, ETS, LSTM, and XGBoosting Methods Aslam, Fadhil Muhammad; Afghani, Fadhli Aslama
Indonesian Journal of Applied Statistics Vol 7, No 1 (2024)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v7i1.83187

Abstract

West Sumatra Province, the trading center on the island of Sumatra and home to many attractive tourist destinations, is not immune to high precipitation that leads to hydro-meteorological disasters such as floods and landslides. Accurate prediction of monthly rainfall is therefore crucial to minimizing the impact of high precipitation. This research aims to determine the best method for predicting monthly rainfall using data from 1992 to 2022, a period that can adequately represent the province's climatological conditions. The results indicate that the Extreme Gradient Boosting method outperforms the Seasonal Autoregressive Integrated Moving Average (SARIMA), Exponential Smoothing (ETS), and Long Short-Term Memory (LSTM) methods in West Sumatra Province, as represented by three BMKG weather observation points (Climatology Station of West Sumatra, Maritime Meteorology Station of Teluk Bayur, and Minangkabau Meteorology Station). It exhibits the lowest error values and the strongest correlation between predicted and actual data. This is evident from the Nash-Sutcliffe Efficiency (NSE) values of 0.188214535, 0.613823746, and 0.545734162 (unsatisfactory to satisfactory) and the correlation values of 0.472103386, 0.795586268, and 0.743002591 (moderate to strong). However, the method cannot fully capture outlier values. These outliers arise from unusual conditions, such as natural disasters or climate change, and from atmospheric phenomena like the El Niño-Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD), which lead to exceptionally high or low precipitation.
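For readers unfamiliar with the two evaluation measures, the following is a minimal sketch of the standard Nash-Sutcliffe Efficiency and Pearson correlation formulas. It is not the authors' code; the function names and the rainfall numbers are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt


def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SSE / (variation of observations around their mean)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly rainfall totals (mm), for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print("NSE:", nse(observed, predicted))
print("r  :", pearson_r(observed, predicted))
```

An NSE of 1 means a perfect fit, and an NSE of 0 means the model does no better than always predicting the observed mean, which is why values such as 0.19 are read as unsatisfactory.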
Modeling East Java Province Poverty Cases Using Biresponse Truncated Spline Regression Putri, Rizka Amalia; Wulandari, Nindya; Widyaningrum, Erlyne Nadhilah; Fathan, Morina A.; Safitriani, Nur Rezky
Indonesian Journal of Applied Statistics Vol 8, No 1 (2025)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v8i1.100915

Abstract

Regression is an analytical method for determining the relationship between predictor and response variables. For data that show no identifiable pattern, nonparametric regression is a suitable technique, and the truncated spline is one such technique. Because the truncated spline has mostly been applied with a single response variable, this study employs the biresponse truncated spline, which uses two response variables to produce a better model than single-response modeling. The purpose of the study is to obtain the best model and to identify which variables influence poverty in East Java Province using biresponse truncated spline regression. The optimal knot points were chosen using Generalized Cross Validation (GCV). GCV gives the best model with three knot points and a goodness of fit (R²) of 95.83%. Applying this model to poverty in East Java Province, using 2023 data on the poverty depth index and the percentage of the population living in poverty, reveals that the Labor Force Participation Rate (TPAK), Average Years of Schooling (RLS), and Open Unemployment Rate (TPT) all have a significant effect.
Keywords: biresponse truncated spline; nonparametric regression; poverty
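As background, the textbook form of a truncated spline and of the GCV criterion is sketched below for a single predictor and a single response. The notation (degree $p$, knots $\kappa_1,\dots,\kappa_K$, hat matrix $A$) is an assumption made for illustration and is not taken from the paper; the biresponse model in the abstract would fit one such function for each of the two responses.

```latex
% Truncated spline of degree p with K knots (single predictor x)
f(x) = \beta_0 + \sum_{j=1}^{p} \beta_j x^{j}
       + \sum_{k=1}^{K} \beta_{p+k}\,(x - \kappa_k)_{+}^{p},
\qquad
(x - \kappa_k)_{+}^{p} =
\begin{cases}
(x - \kappa_k)^{p}, & x \ge \kappa_k,\\
0, & x < \kappa_k.
\end{cases}

% Generalized Cross Validation for a candidate knot set \kappa,
% where \hat{y} = A(\kappa)\, y and n is the sample size
\mathrm{GCV}(\kappa) =
\frac{n^{-1} \sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^{2}}
     {\bigl[\, n^{-1}\, \mathrm{tr}\bigl(I - A(\kappa)\bigr) \bigr]^{2}}
```

The knot set with the smallest GCV value is taken as optimal.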
Bayesian Neural Network for Diabetes Prediction: Uncertainty Quantification in Machine Learning Kamila, Sabrina Adnin; Sadik, Kusman; Suhaeni, Cici; Soleh, Agus Mohamad
Indonesian Journal of Applied Statistics Vol 9, No 1 (2026)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v9i1.103994

Abstract

This study evaluates and compares the performance of three machine learning models, random forest (RF), feedforward neural network (FNN), and Bayesian neural network (BNN), for diabetes classification using the Diabetes Health Indicators Dataset from the UCI Machine Learning Repository, which exhibits significant class imbalance. Data preprocessing includes feature normalization using StandardScaler and class imbalance handling through the synthetic minority over-sampling technique (SMOTE). Model performance is evaluated using accuracy and F1-score, supported by the classification report and confusion matrix. The results show that RF achieves high accuracy (0.8493) but a low F1-score (0.3386), indicating poor sensitivity to positive diabetes cases. FNN provides more balanced performance, with an F1-score of 0.4490 after optimal threshold adjustment. BNN achieves an accuracy of 0.8498 and an F1-score of 0.4043, with the additional advantage of uncertainty quantification through Monte Carlo Dropout. FNN is therefore more effective for balanced classification performance, while BNN is better suited to medical applications that require prediction confidence information to support more reliable clinical decision-making.
Keywords: Diabetes prediction, uncertainty quantification, Bayesian neural network, classification imbalance, machine learning.
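The gap between high accuracy and low F1-score is typical of imbalanced data. The sketch below illustrates it with a hypothetical confusion matrix; the counts are invented for illustration and are not the study's results.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; true negatives play no role."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts for 1,000 patients, of whom 90 are truly diabetic.
tp, fp, fn, tn = 20, 30, 70, 880

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
recall = tp / (tp + fn)

print(f"accuracy = {acc:.4f}")
print(f"recall   = {recall:.4f}")
print(f"F1       = {f1:.4f}")
```

Because the many true negatives inflate accuracy but do not enter the F1-score, a model that misses most positive cases can still look accurate.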
Comparative Analysis of Fuzzy Mamdani Method and Fuzzy Sugeno Method in Predicting Household Electricity Consumption Costs Zahra, Luthfia; Mashuri, Mashuri
Indonesian Journal of Applied Statistics Vol 8, No 2 (2025)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v8i2.103436

Abstract

Electricity has become an essential part of daily life. As technology has developed rapidly, many modern activities and devices have become highly dependent on electricity. The more electricity is used, the higher the monthly cost. This cost is influenced by usage patterns and various uncertain factors. Fuzzy logic is one approach that can be used in decision support systems facing this kind of uncertainty. This study applies the Mamdani and Sugeno fuzzy methods, with house building area, number of electronic devices, number of family members, and income as inputs, to determine which method predicts household electricity consumption costs more accurately, as measured by the mean absolute percentage error (MAPE). Data were obtained through questionnaires and interviews with residents of Margorejo Village. Data processing yielded a MAPE of 12.3% for the Mamdani method and 9.9% for the Sugeno method. Because the Sugeno method has the smaller MAPE, it is concluded to be more accurate for predicting household electricity consumption costs in Margorejo Village.
Keywords: Mamdani Method, Sugeno Method, MAPE.
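The comparison rests on MAPE, which can be computed as in the minimal sketch below. The function name and the cost figures are hypothetical and only illustrate the formula; they are not the study's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical monthly electricity costs and predictions, for illustration only.
actual_cost = [100.0, 200.0, 400.0]
predicted_cost = [110.0, 180.0, 400.0]

error_pct = mape(actual_cost, predicted_cost)
print(f"MAPE = {error_pct:.2f}%")
```

A lower MAPE means the predictions deviate less, on average, from the actual costs in relative terms.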
Value at Risk Estimation of Portfolio Affected by the BDS Movement: A Copula Approach Sofawi, Binarvian; Wutsqa, Dhoriva Urwatul
Indonesian Journal of Applied Statistics Vol 9, No 1 (2026)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v9i1.95849

Abstract

This study estimates value at risk (VaR), a measure of the maximum potential loss of an investment portfolio, by applying a copula approach to stocks affected by the Boycott, Divestment, and Sanctions (BDS) movement. The data are daily returns of MAPI (PT Mitra Adiperkasa, Tbk), FAST (PT Fast Food Indonesia, Tbk), and UNVR (PT Unilever Indonesia, Tbk), computed from closing stock prices. The returns are daily simple returns from March 1, 2019, to February 29, 2024, covering 1,130 days. The model used is the ARMA-GARCH copula model. The autoregressive moving average (ARMA) component accounts for time dependence in the estimation, while the generalized autoregressive conditional heteroskedasticity (GARCH) component addresses the high volatility of stock returns. The best copula is selected by maximum likelihood estimation (MLE) from five candidates: the Gaussian, Student's t, Clayton, Frank, and Gumbel copulas. The analysis shows that the Clayton copula is the best model, and that the VaR of the portfolio at the 99%, 95%, and 90% confidence levels is 3.45%, 2.11%, and 1.55%, respectively. These findings suggest that lower tail dependence plays an important role in portfolio risk, indicating the potential for simultaneous extreme losses. Investors are therefore encouraged to consider copula-based risk measurement methods and diversification strategies to minimize potential portfolio losses.
Keywords: ARMA, copula, GARCH, returns, value at risk
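Two ingredients of this abstract can be illustrated briefly: daily simple returns and the lower tail dependence of a Clayton copula. The sketch below uses the standard bivariate Clayton result, lambda_L = 2^(-1/theta), as a simplification; the study itself models three stocks, and the prices and theta value here are hypothetical.

```python
def simple_returns(prices):
    """Simple returns r_t = P_t / P_{t-1} - 1 from a list of closing prices."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def clayton_lower_tail_dependence(theta):
    """Lower tail dependence coefficient of a bivariate Clayton copula (theta > 0)."""
    if theta <= 0:
        raise ValueError("theta must be positive for lower tail dependence")
    return 2.0 ** (-1.0 / theta)


# Hypothetical closing prices, for illustration only.
closing_prices = [100.0, 110.0, 99.0]
returns = simple_returns(closing_prices)

# Hypothetical copula parameter; a larger theta means stronger joint downside moves.
lambda_lower = clayton_lower_tail_dependence(1.0)

print("returns:", returns)
print("lower tail dependence:", lambda_lower)
```

The Clayton copula has lower tail dependence but no upper tail dependence, which is why its selection points to stocks falling together more strongly than they rise together.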
Machine Learning-Based Detection of Polycystic Ovary Syndrome (PCOS): Combining SMOTE, Random Forest, Gradient Boosting, and Bayesian Optimization Alfiryal, Naufalia; Sadik, Kusman; Suhaeni, Cici; Soleh, Agus Mohamad
Indonesian Journal of Applied Statistics Vol 8, No 2 (2025)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v8i2.109931

Abstract

Polycystic ovary syndrome (PCOS) is a common endocrine disorder among women of reproductive age. The condition can lead to ovulatory dysfunction, hormonal imbalance, and insulin resistance, and it increases the risk of cardiovascular disease, obesity, and psychological disorders. Despite its high prevalence, approximately 75% of PCOS cases remain undiagnosed in clinical settings because of the complexity of symptoms and the limitations of current diagnostic methods. To address this issue, this study proposes a machine learning-based approach to improve the accuracy and efficiency of PCOS detection. It compares the performance of two supervised learning algorithms, random forest and gradient boosting, for PCOS prediction. The dataset was obtained from a public repository and contains various clinical features associated with PCOS. To address class imbalance, the synthetic minority over-sampling technique (SMOTE) was applied to the training data. In addition, Bayesian optimization was used to tune the hyperparameters of each model. Model performance was evaluated using several metrics, with the area under the receiver operating characteristic curve (AUC-ROC) as the primary measure. The gradient boosting model achieved the best results, with an AUC of 0.8983 and a recall of 0.95, indicating high sensitivity in identifying positive PCOS cases. These findings indicate that the combination of SMOTE and Bayesian optimization is effective in enhancing predictive accuracy, especially on imbalanced medical datasets. The proposed approach shows promise for integration into clinical decision-support systems to facilitate earlier and more reliable PCOS screening.
Keywords: Bayesian optimization; gradient boosting; PCOS; random forest; SMOTE.
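The core idea of SMOTE is to create synthetic minority-class points by interpolating between a minority point and one of its nearest minority neighbors. The sketch below is a simplified, pure-Python illustration of that single step, not the authors' pipeline; the function name `smote_sample` and the toy data are hypothetical.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority point and one of its k nearest neighbors."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbor = rng.choice(others[:k])
    # Interpolate at a random position along the segment from base to neighbor.
    gap = rng.random()
    return [a + gap * (b - a) for a, b in zip(base, neighbor)]


# Toy minority-class feature vectors, for illustration only.
minority_points = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
synthetic = smote_sample(minority_points, k=2, rng=random.Random(0))
print("synthetic point:", synthetic)
```

As the abstract notes, oversampling is applied to the training data only, so that evaluation still reflects the original class distribution.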
Analysis of the Resilience of Rural Adolescent Girls to Early Marriage in Indonesia Using a Log-logistic Gamma Shared Frailty Survival Model Yulianto, Werri; Usman, Hardius
Indonesian Journal of Applied Statistics Vol 8, No 2 (2025)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v8i2.87185

Abstract

In 2022, Statistics Indonesia (BPS) recorded that the rate of early marriage in rural areas was twice that in urban areas. Rural communities still support early marriage as part of local tradition and culture, even though it is a form of violence against women and a violation of children's rights. This study uses survival analysis with a log-logistic gamma shared frailty model to analyze the variables that influence the resilience of rural Indonesian girls aged 15-24 to early marriage, based on data from the 2022 National Socioeconomic Survey (Susenas). The results show that women's education level, welfare level, and number of household members have the greatest influence on the resilience of rural adolescent girls to early marriage. In addition, unobserved factors (shared frailty) have an effect in each district/city, which shows that this resilience varies between regions. Preventing early marriage requires a holistic approach that empowers women, especially in rural areas, by increasing access to education, economic independence, and support for entrepreneurship programs. Raising public awareness through social outreach, institutional cooperation, and stronger sex education and reproductive health education are also important steps to reduce early marriage.
Keywords: Early marriage, survival analysis, shared frailty.
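For orientation, a common textbook formulation of a log-logistic model with gamma shared frailty is sketched below. The symbols ($\alpha_{ij}$, $\gamma$, $z_i$, $\theta$) and the parameterization are assumptions for illustration; the paper may use a different but equivalent form. Here $i$ indexes the district/city and $j$ the individual.

```latex
% Log-logistic survival function for individual j in district/city i
S_{ij}(t) = \frac{1}{1 + \left( t / \alpha_{ij} \right)^{\gamma}},
\qquad \alpha_{ij} > 0,\ \gamma > 0

% Shared frailty: one multiplicative random effect z_i per district/city
h_{ij}(t \mid z_i) = z_i \, h_{ij}(t),
\qquad
S_{ij}(t \mid z_i) = S_{ij}(t)^{\,z_i},
\qquad
z_i \sim \mathrm{Gamma}\ \text{with}\ \mathrm{E}[z_i] = 1,\ \mathrm{Var}[z_i] = \theta

% Marginal survival after integrating out the gamma frailty
S_{ij}^{\text{marg}}(t) = \bigl[\, 1 - \theta \ln S_{ij}(t) \,\bigr]^{-1/\theta}
```

A frailty variance $\theta$ that differs significantly from zero indicates unobserved heterogeneity between districts/cities, which matches the regional variation the abstract reports.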
Forecasting the U.S. Treasury Yield Curve Using the Hybrid Dynamic Nelson-Siegel and Long Short-Term Memory (LSTM) Method Firdausanti, Neni Alya; Shafira, Alya
Indonesian Journal of Applied Statistics Vol 9, No 1 (2026)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v9i1.114966

Abstract

U.S. Treasury (UST) securities are widely regarded as safe-haven assets and serve as global financial benchmarks, making the U.S. Treasury yield curve a key indicator of market expectations and economic risks, including recession probabilities. For Indonesia, where foreign exchange reserves are partly allocated to UST securities, accurate yield curve forecasts are essential for effective reserve management and monetary policy formulation. This study proposes a hybrid forecasting framework that integrates the dynamic Nelson–Siegel (DNS) model with long short-term memory (LSTM) networks to improve the accuracy and stability of U.S. Treasury yield curve forecasts. The decay parameter in the DNS model is estimated using the Newton–Raphson method, while the remaining parameters are estimated using ordinary least squares (OLS). The resulting DNS latent factors are subsequently used as input features for the LSTM model under various hyperparameter configurations. Forecasting performance is evaluated using the root mean squared error (RMSE) and benchmarked against a DNS–ARIMA model. The empirical results demonstrate that the proposed DNS-LSTM approach consistently outperforms DNS-ARIMA across all maturities, yielding lower forecasting errors and greater flexibility in capturing yield curve dynamics, particularly during the post-pandemic period. Overall, the DNS-LSTM model offers a more robust and data-driven alternative to traditional yield curve forecasting methods. These findings have practical implications for foreign reserve management, exchange rate stabilization, and investment decision-making. Future research may extend this framework by incorporating macroeconomic variables and exploring longer forecast horizons.
Keywords: Bonds, yield, dynamic Nelson-Siegel, long short-term memory, U.S. Treasury
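The Nelson-Siegel curve behind the DNS model expresses a yield at maturity tau through three latent factors (level, slope, curvature) and a decay parameter lambda. The sketch below shows the standard functional form only; the factor values and the decay value 0.0609 (a figure commonly used with maturities in months) are assumptions for illustration, not estimates from the paper.

```python
from math import exp, expm1


def nelson_siegel_yield(tau, level, slope, curvature, lam):
    """Nelson-Siegel yield at maturity tau > 0 for the given factors and decay lam > 0."""
    x = lam * tau
    slope_loading = -expm1(-x) / x          # (1 - e^{-x}) / x, computed stably
    curvature_loading = slope_loading - exp(-x)
    return level + slope * slope_loading + curvature * curvature_loading


# Hypothetical factor values (in percent) and decay parameter, for illustration only.
level, slope, curvature, lam = 4.0, -2.0, 1.0, 0.0609

for months in (3, 12, 60, 120, 360):
    y = nelson_siegel_yield(months, level, slope, curvature, lam)
    print(f"{months:>3} months: {y:.3f}%")
```

With lambda fixed, the yield is linear in the three factors, which is why they can be estimated by OLS at each date and then passed to a time-series model such as an LSTM.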
Spatial Autoregressive Confused Modeling of the Prevalence of Undernourishment in Nusa Tenggara in 2023 Baharuddin, Baharuddin; Agusrawati, Agusrawati; Yahya, Irma
Indonesian Journal of Applied Statistics Vol 8, No 2 (2025)
Publisher : Universitas Sebelas Maret

DOI: 10.13057/ijas.v8i2.103618

Abstract

A regression analysis of the prevalence of undernourishment (PoU) in the provinces of Nusa Tenggara Barat and Nusa Tenggara Timur indicates spatial autocorrelation in both the response variable and the error component. This condition violates the assumptions of linear regression. This study addresses both forms of spatial autocorrelation by employing the spatial autoregressive confused (SAC) model. The 2023 PoU data by regency/city in Nusa Tenggara were sourced from the National Food Agency, while the explanatory variables were obtained from BPS-Statistics Indonesia publications. The results show that the SAC model provides more accurate parameter estimates than the linear regression model. The factors that significantly influence the PoU in a given regency/city are per capita rice production, per capita realization of food social assistance, the PoU in neighboring regions, and the error terms in neighboring regions.
Keywords: distance-based spatial weights; prevalence of undernourishment; spatial regression
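A model that handles spatial autocorrelation in both the response and the error term is usually written as below. This is the standard form, given here as a sketch; it assumes the same spatial weight matrix $W$ in both equations, which the abstract does not specify.

```latex
% Spatial lag in the response and spatial autoregression in the errors
y = \rho W y + X\beta + u,
\qquad
u = \lambda W u + \varepsilon,
\qquad
\varepsilon \sim N\!\left(0, \sigma^{2} I\right)
```

Here $\rho$ measures the influence of the PoU in neighboring regions and $\lambda$ the influence of neighboring error terms, the two spatial effects the abstract reports as significant. Setting $\rho = \lambda = 0$ recovers ordinary linear regression.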