Contact Name
Dania Siregar
Contact Email
jsamtk.unj@gmail.com
Phone
+6281316044605
Journal Mail Official
jsa@unj.ac.id
Editorial Address
Kampus A Universitas Negeri Jakarta, Lt.6 Gd. Dewi Sartika Jalan Rawamangun Muka, Jakarta Timur.
Location
Kota Adm. Jakarta Timur,
DKI Jakarta
INDONESIA
Jurnal Statistika dan Aplikasinya
ISSN : -     EISSN : 26208369     DOI : https://doi.org/10.21009/JSA.041
Jurnal Statistika dan Aplikasinya (JSA) is dedicated to statisticians who want to publish articles about statistics and its applications. JSA covers every subject that uses or relates to statistics.
Articles 180 Documents
COMPARISON OF OVERSAMPLING, UNDERSAMPLING, AND SMOTE TECHNIQUES FOR MULTICLASS BALANCE DATA HANDLING IN RANDOM FOREST AND MULTINOMIAL LOGISTIC REGRESSION Fadjryani; Asfar; Nazwa; Tokandari, Allin Floria; Lestari, Tri Andayani; Ghani, Muhammad Azi Zarir
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09207

Abstract

Class imbalance in multiclass classification is an important challenge in applied machine learning, particularly in medical applications such as predicting patient discharge status. Although various studies have demonstrated the effectiveness of resampling techniques, the best combination of classification algorithm and balancing method for highly imbalanced multiclass hospital data has rarely been studied. This study compares the performance of the Random Forest (RF) and Multinomial Logistic Regression (MLR) algorithms in handling class imbalance under three resampling techniques: Random Oversampling (ROS), Random Undersampling (RUS), and the Synthetic Minority Oversampling Technique (SMOTE). The dataset comprises 1,032 inpatients with Non-Insulin-Dependent Diabetes Mellitus (NIDDM) at Undata Hospital, Central Sulawesi, for the period January 2021 to December 2023. Data preprocessing included coding, normalization, and a stratified 80:20 data split. Feature selection was conducted using Recursive Feature Elimination (RFE), and the models were evaluated with 5-fold cross-validation using accuracy, recall, F1-score, and the Matthews correlation coefficient (MCC). The results showed that the combination of RF and ROS performed best, with an accuracy of 93.65%, a macro F1 of 0.935, and a balanced accuracy of 0.95. This combination recognized minority classes well without sacrificing overall accuracy. In contrast, MLR showed the lowest performance, especially with RUS, which causes the loss of important data. Although SMOTE produced competitive results, it remained below ROS in this context. This study was limited to structured clinical data and compared only two types of classification models. Future work could explore deep learning-based approaches or advanced ensembles.
The novelty of this study lies in its thorough evaluation of combinations of balancing techniques and classical classification algorithms for medical prediction with extremely imbalanced multiclass data.
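To make the two random resampling schemes named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the function name `random_resample`, the toy class labels, and the class counts are all hypothetical, and SMOTE (which synthesizes new points rather than duplicating rows) is not shown.

```python
import random
from collections import Counter, defaultdict

def random_resample(rows, labels, mode="over", seed=0):
    """Balance classes by duplicating (over) or dropping (under) rows at random."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    sizes = [len(members) for members in by_class.values()]
    # ROS grows every class to the largest size; RUS shrinks to the smallest.
    target = max(sizes) if mode == "over" else min(sizes)
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if mode == "over":
            # Keep every original row, then draw extras with replacement.
            chosen = members + rng.choices(members, k=target - len(members))
        else:
            # Keep a random subset, drawn without replacement.
            chosen = rng.sample(members, target)
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels

# Hypothetical toy data: three classes with 6, 3, and 1 rows.
labels = ["class_a"] * 6 + ["class_b"] * 3 + ["class_c"] * 1
rows = [[i] for i in range(len(labels))]

over_rows, over_labels = random_resample(rows, labels, mode="over")
under_rows, under_labels = random_resample(rows, labels, mode="under")
print(Counter(over_labels))
print(Counter(under_labels))
```

In this toy case undersampling keeps only one row per class, which illustrates the information loss the abstract attributes to RUS. In practice, resampling is usually applied to the training portion only, so that the held-out data keep their original class proportions.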
COMPARISON OF SIMPLEX AND NELDER-MEAD OPTIMIZATION METHODS IN QUANTILE REGRESSION FOR BOGOR CITY RAINFALL ANALYSIS Erira, Salsa Rifda; Audina, Delia Fitri; Virgie, Meriza Immanuela; Suhaeri, Bulan Cahyani; Abyan, Muhammad Fatih; Akbar Rizki; Sartono, Bagus
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09203

Abstract

Predicting extreme rainfall is crucial for supporting planning in the agricultural sector, infrastructure development, and disaster mitigation in the city of Bogor. However, the asymmetric distribution of daily rainfall and the presence of outliers make linear regression methods less suitable. Quantile regression offers an alternative that captures the influence of explanatory variables across different parts of the data distribution, particularly in the extreme regions. This study compares the Simplex and Nelder-Mead methods for estimating quantile regression parameters on extreme rainfall data in Bogor. Daily rainfall data were obtained from the West Java BMKG Climate Station for the period from May 2024 to April 2025, comprising 365 observations, with four explanatory variables: average temperature, average humidity, sunshine duration, and average wind speed. Modeling was conducted at the 0.75, 0.85, and 0.95 quantiles to represent extreme rainfall. The results show that the Simplex method outperformed Nelder-Mead, as indicated by lower Pinball Loss and Mean Absolute Error (MAE) values at most quantiles. Humidity and average wind speed had a significantly positive effect on extreme rainfall intensity, while average temperature had a negative effect. Sunshine duration showed less consistent effects. Overall, the Simplex method is recommended for quantile regression optimization in extreme rainfall data due to its greater stability and accuracy in generating model parameters. However, this study is limited by the number of explanatory variables and the relatively short observation period. Incorporating additional variables such as air pressure, ENSO index, or topographical data, along with extending the observation period, could improve model accuracy and generalizability in future research.
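The pinball loss used above to compare the two optimizers can be written in a few lines. The sketch below is illustrative only: the function names and the toy rainfall values are assumptions, and it shows the evaluation metric, not the Simplex or Nelder-Mead fitting itself.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss at quantile level tau, with 0 < tau < 1."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff >= 0) is weighted by tau,
        # over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between observations and predictions."""
    return sum(abs(y - q) for y, q in zip(y_true, y_pred)) / len(y_true)

# Hypothetical daily rainfall (mm) and a constant predicted quantile.
rain = [0.0, 10.0, 50.0]
pred = [10.0, 10.0, 10.0]

for tau in (0.75, 0.85, 0.95):
    print(tau, pinball_loss(rain, pred, tau))
print("MAE", mean_absolute_error(rain, pred))
```

At a high quantile such as 0.95, missing a heavy-rain day from below costs far more than overshooting a dry day, which is why this loss suits the extreme quantiles studied here. At tau = 0.5 the pinball loss equals half the MAE.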
APPLICATION OF MARKOV CHAIN IN MONTHLY RAINFALL PREDICTION IN AMBON CITY Rumeon, Sahril G.; Aulya, Nurul; Telussa, Silvia W.; Patty, Christi A.; Sopaliu, Fera F.; Rumalean, Fadila; Rumangun, Chelsy T.; Tuankotta, Winda; Yudistira
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09204

Abstract

Asia has a tropical climate with two main seasons influenced by monsoons, namely the rainy season and the dry season. However, in recent years, seasonal patterns have shifted due to climate change, making weather, including rainfall, difficult to predict. Ambon City, one of the regions with high and variable rainfall in eastern Indonesia, is highly dependent on weather conditions, especially since most of its inhabitants work as fishermen and farmers. Rainfall prediction is therefore important to support appropriate decision-making in the marine and agriculture sectors and in hydrometeorological disaster risk mitigation. This study aims to model and predict the monthly rainfall state in Ambon City in 2025 using a first-order Markov chain, a probability-based approach that describes transitions between states based on historical data, in which the probability of the next state depends only on the current state. The data are monthly rainfall records from 2015 to 2024 obtained from the Pattimura–Ambon Meteorological Station. The data were classified into four rainfall categories (light, moderate, high, and very high), which were then used to build a one-step transition probability matrix. The results showed that the steady-state distribution of rainfall in Ambon City was concentrated in the moderate category (47.90%), followed by very high (26.5%), light (20.17%), and high (5.88%). The rainfall prediction for 2025 shows a transition pattern that approaches the steady state, with a stable trend from month to month. With this information, fishermen can better determine safe times to go to sea, and the government can design climate change adaptation and mitigation policies more effectively.
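The two computational steps described above, building a one-step transition matrix from a categorized series and finding its steady-state distribution, can be sketched as follows. The monthly state sequence is invented for illustration (0 = light, 1 = moderate, 2 = high, 3 = very high) and is not the Ambon data. The uniform fallback for unobserved states and the use of power iteration are assumptions of this sketch.

```python
def transition_matrix(states, n_states):
    """Estimate one-step transition probabilities from a state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state with no observed transitions gets a uniform row.
        matrix.append([c / total for c in row] if total else [1 / n_states] * n_states)
    return matrix

def steady_state(matrix, iters=1000):
    """Approximate the stationary distribution by repeated multiplication."""
    n = len(matrix)
    dist = [1 / n] * n
    for _ in range(iters):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist

# Hypothetical sequence of monthly rainfall states.
months = [0, 1, 1, 2, 1, 0, 1, 1, 3, 1, 2, 1]
P = transition_matrix(months, 4)
pi = steady_state(P)
print([round(p, 4) for p in pi])
```

Power iteration converges here because the toy chain is irreducible and aperiodic. For a chain without those properties the stationary distribution may not be unique, and solving the linear system directly would be the safer choice.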
PERFORMANCE EVALUATION OF WORD EMBEDDING TECHNIQUES IN TWITTER SENTIMENT ANALYSIS USING LSTM Ladayya, Faroh; Rahayu, Widyanti; Rohimah, Siti Rohmah; Saputra, Ferdiansyah Rizki; Maulana, Thoriq Akbar; Madinah, Najwa Nur
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09206

Abstract

Opinions expressed on social media can serve as feedback on products, both goods and services. Sentiment analysis is used to analyze the opinions that the public expresses on social media. The sentiment in an opinion can be positive, negative, or neutral. This study compares the performance of three word embedding techniques (Word2Vec, GloVe, and FastText) when combined with a Long Short-Term Memory (LSTM) model for sentiment classification of Indonesian Twitter data. LSTM was selected for its ability to model sequential text data and capture the long-term contextual dependencies that are often present in natural language. Because LSTM requires numerical input, each word embedding technique was used to convert the text into vectors, which then served as input to the LSTM. All embeddings were evaluated under the same preprocessing pipeline and LSTM architecture to ensure a fair comparison. Model performance was assessed using accuracy, precision, recall, F1-score, and ROC/AUC metrics. The results indicate that the LSTM model effectively captures sentiment patterns in Indonesian tweets, with Word2Vec achieving the best overall performance, followed by GloVe and FastText. These findings suggest that domain-adapted word embeddings remain highly effective for sentiment analysis in Indonesian social media contexts.
CLUSTERING OF COUNTRIES BASED ON WORLD HAPPINESS INDICATORS USING K-MEANS Adylla, Fahira Puti; Lestari, Dian; Widyaningsih, Yekti
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09209

Abstract

Happiness is a multidimensional concept encompassing emotional well-being, life satisfaction, and perceived quality of life. The increasing use of happiness indicators as complementary measures of development beyond economic growth has attracted growing attention in statistical and applied research. This study aims to classify countries based on a comprehensive set of world happiness indicators using the K-Means clustering method. The indicators include the Happiness Index (subjective), gross domestic product (GDP) per capita, social support, healthy life expectancy, freedom to make life choices, generosity, negative perceptions of corruption, crime index, and cost of living. The optimal number of clusters is determined using the Silhouette Index, while Biplot analysis is employed to visualize cluster characteristics and relationships among indicators. The results identify three distinct clusters. Cluster 1 is dominated by countries with low happiness levels, Cluster 2 represents countries with moderate happiness profiles, and Cluster 3 consists of countries with high happiness levels. The findings demonstrate the effectiveness of multivariate clustering techniques in revealing structural patterns in happiness data and provide empirical evidence that may support comparative statistical analysis and policy-oriented applications.
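Choosing the number of clusters with the Silhouette Index, as described above, can be sketched with a small pure-Python K-Means. The two-dimensional toy points stand in for standardized country indicators and are invented. The deterministic farthest-point seeding is an assumption made so the example is reproducible, not the initialization used in the study.

```python
import math

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with farthest-point seeding (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                            # cohesion
        b = min(sum(d) / len(d) for d in dists.values())   # separation
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
scores = {k: silhouette(points, kmeans(points, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The candidate with the highest mean silhouette is taken as the number of clusters. With real indicators on different scales, such as GDP per capita next to a 0 to 10 happiness score, the variables would normally be standardized before computing distances.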
MODELING DISASTER RISK IN INDONESIA: A LATENT VARIABLE MODELING APPROACH TO HEVA ASSESSMENT Herliansyah, Riki; Fitria, Irma; Rauf, Nurul Maqfirah; Achmad, Adha Karamina
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09201

Abstract

Indonesia, as the world's largest archipelagic nation, faces significant disaster risks due to its position at the convergence of three major tectonic plates. This study employs Generalized Linear Latent Variable Models (GLLVM) to analyze relationships among 12 Hazard, Exposure, and Vulnerability Assessment (HEVA) indicators across 34 Indonesian provinces. The HEVA dataset used in this study was obtained from the United Nations University – Institute for Environment and Human Security (UNU-EHS), which provides harmonized global risk indicators for hazard intensity, exposure levels, and socioeconomic–environmental vulnerability. Unlike conventional approaches that assume variable independence, GLLVM captures complex dependency structures through latent variables, providing deeper insights into multidimensional disaster risk patterns. Model-based ordination analysis reveals distinct spatial risk patterns. Eastern provinces (Papua, Maluku) demonstrate high physical vulnerability and exposure despite lower hazard levels, while Java provinces show moderate hazards but lower vulnerability due to better infrastructure and governance. A notable negative correlation (r < -0.70) between hazard levels and vulnerability indicators suggests that regions frequently exposed to disasters develop stronger adaptation capacity. Conversely, vulnerability indicators show very strong positive correlations with one another (r > 0.90), indicating interconnections that require holistic interventions. Incorporating geographical covariates such as population, number of islands, and provincial area reveals significant relationships with HEVA indicators. Population shows negative associations with physical and environmental vulnerability but positive associations with climate and geophysical hazards (the corresponding 95% CIs do not contain zero), reflecting urbanization's dual nature.
The number of islands positively correlates with multiple vulnerability indicators, highlighting structural challenges in archipelagic disaster management, including limited accessibility and infrastructure connectivity. Provincial areas demonstrate positive relationships with vulnerability indicators but negative associations with economic exposure, indicating concentrated economic activities in urban centers. These findings emphasize differentiated spatial approaches for disaster mitigation.
APPLICATION OF THE GUSTAFSON–KESSEL ALGORITHM FOR IDENTIFYING SPATIAL PATTERNS OF NATURAL DISASTERS IN EAST NUSA TENGGARA Nufus, Mitha Rabiyatul; Chandrawati; Widyaningrum, Erlyne Nadhilah
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09205

Abstract

This study examines spatial patterns of disaster vulnerability across districts and cities in East Nusa Tenggara Province, one of Indonesia’s most disaster-prone regions. Although previous studies have highlighted the province’s exposure to multiple hazards, limited attention has been given to clustering methods capable of capturing non-homogeneous and elliptical data structures. This research aims to classify regional disaster vulnerability based on the characteristics of disaster occurrences and to provide empirical support for more targeted mitigation strategies. Secondary data on floods, forest fires, hurricanes, and landslides recorded in 2023 were analyzed using the adaptive Gustafson–Kessel clustering algorithm. The optimal number of clusters was determined using the Silhouette validity index. The results identify three distinct vulnerability groups: regions highly prone to multiple types of disasters, regions predominantly affected by a single hazard, and regions with relatively low disaster risk. The resulting spatial patterns reveal clear differences in disaster intensity and complexity among regions, emphasizing the need for location-specific disaster management policies. This study contributes to disaster risk analysis by demonstrating the applicability of the Gustafson–Kessel algorithm in capturing complex spatial vulnerability patterns that are often overlooked by conventional clustering approaches.
FORECASTING THE PRICE OF CURLY RED CHILI PEPPERS IN EAST JAVA PROVINCE USING ARIMA MODEL WITH ITERATIVE OUTLIER DETECTION PROCEDURE Erdien, Fareka; Rahayu, Widyanti; Sumargo, Bagus; Wulansari, Ika Yuni; Ali, Didiq Rosadi
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09208

Abstract

Curly red chili is a vegetable with high economic value because it supports the food industry and meets domestic demand. Its price can fluctuate at any time, so forecasting is needed to prevent losses for economic actors. This research aims to identify the best forecasting model and determine its accuracy in forecasting the price of curly red chili. The Autoregressive Integrated Moving Average (ARIMA) model is one forecasting method, with the limitation that it requires stationary data. Outliers affect the autocorrelation structure of a time series, biasing the estimated Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) and making ARIMA forecasts less accurate. Outliers therefore need to be handled through outlier detection, for example with an iterative procedure. The study found that the ARIMA(0,2,3) model with outlier detection was the best forecasting model. The forecasts tend to show a downward trend, with a MAPE of 4.612, which indicates that the model forecasts very well.
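Two quantities in this abstract are simple to illustrate: the differencing implied by d = 2 in ARIMA(0,2,3), and the MAPE used to judge accuracy. The sketch below uses invented numbers and hypothetical function names. It does not fit an ARIMA model or perform the iterative outlier detection.

```python
def difference(series, d=1):
    """Apply first differencing d times, as the 'I' part of ARIMA does."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

# Hypothetical series with a quadratic trend: differencing twice leaves a constant.
trend = [1, 4, 9, 16, 25]
print(difference(trend, d=2))

# Hypothetical prices and forecasts.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mape(actual, forecast))
```

A commonly cited rule of thumb treats a MAPE below 10% as highly accurate forecasting, which is consistent with the abstract's reading of 4.612, assuming that figure is expressed in percent.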
Front Matter Jurnal Statistika dan Aplikasinya Vol. 9 No. 2, December 2025 JSA, Journal Editor
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09200

Back Matter Jurnal Statistika dan Aplikasinya Vol. 9 No. 2, December 2025 JSA, Journal Editor
Jurnal Statistika dan Aplikasinya Vol. 9 No. 2 (2025): Jurnal Statistika dan Aplikasinya
Publisher : LPPM Universitas Negeri Jakarta

DOI: 10.21009/JSA.09299
