p-Index From 2021 - 2026
7.537
P-Index
This author has published in these journals
All Journal FORUM STATISTIKA DAN KOMPUTASI Media Statistika Statistika JURNAL MATEMATIKA STATISTIKA DAN KOMPUTASI IPTEK The Journal for Technology and Science CAUCHY: Jurnal Matematika Murni dan Aplikasi Sosioinforma JUITA : Jurnal Informatika Jurnal Pengelolaan Sumberdaya Alam dan Lingkungan (Journal of Natural Resources and Environmental Management) International Journal of Advances in Intelligent Informatics Scientific Journal of Informatics JOIN (Jurnal Online Informatika) Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Indonesian Journal of Applied Statistics Jurnal Penelitian Pertanian Tanaman Pangan BAREKENG: Jurnal Ilmu Matematika dan Terapan JOURNAL OF APPLIED INFORMATICS AND COMPUTING SINTECH (Science and Information Technology) Journal MIND (Multimedia Artificial Intelligent Networking Database) Journal JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika) Jurnal Aplikasi Statistika & Komputasi Statistik FIBONACCI: Jurnal Pendidikan Matematika dan Matematika Inferensi International Journal of Advances in Data and Information Systems InPrime: Indonesian Journal Of Pure And Applied Mathematics ESTIMASI: Journal of Statistics and Its Application Majalah Ilmiah Matematika dan Statistika (MIMS) Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika Journal of Applied Data Sciences Enthusiastic : International Journal of Applied Statistics and Data Science Prosiding Seminar Nasional Official Statistics Jurnal Natural Eduvest - Journal of Universal Studies Xplore: Journal of Statistics PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND OFFICIAL STATISTICS Parameter: Jurnal Matematika, Statistika dan Terapannya Scientific Journal of Informatics Journal of Mathematics, Computation and Statistics (JMATHCOS) Advance Sustainable Science, Engineering and Technology (ASSET) Indonesian Journal of Statistics and Its Applications Journal on Mathematics Education
Articles

Sentiment Analysis of Tokopedia Customer Reviews Using BiLSTM and IndoBERT with Comparative Analysis of Preprocessing and Labeling Methods
Anadra, Rahmi; Wijayanto, Hari; Sadik, Kusman
International Journal of Advances in Data and Information Systems Vol. 6 No. 3 (2025): December 2025 - International Journal of Advances in Data and Information Syste
Publisher : Indonesian Scientific Journal

DOI: 10.59395/ijadis.v6i3.1458

Abstract

This study addresses key challenges in Indonesian sentiment analysis related to preprocessing, labeling strategies, and class imbalance. It compares the performance of BiLSTM and IndoBERT using user reviews collected from Tokopedia. The dataset was manually and automatically labeled, then processed under three preprocessing schemes. Both models were trained with tuned hyperparameters and imbalance-handling techniques and evaluated through twenty rounds of stratified five-fold cross-validation. Performance was assessed using balanced accuracy and F1-score. IndoBERT achieved the highest results, with balanced accuracy up to 0.85 and F1-scores up to 0.83, while BiLSTM reached balanced accuracy up to 0.78 and F1-scores up to 0.76. Applying class weight and focal loss improved model performance by approximately 2% to 11% over the baseline. BiLSTM demonstrated greater training efficiency, requiring only 1 to 2.5 minutes per epoch, compared with IndoBERT’s 2.6 to 3.6 minutes. Although manual labeling remained superior in capturing contextual nuance and emotional cues, GPT-based labeling showed strong agreement with the human annotations. A four-way ANOVA revealed that all main factors and several interactions significantly influenced classification outcomes. Overall, BiLSTM provides faster training efficiency, whereas IndoBERT delivers higher predictive accuracy.
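Balanced accuracy, one of the two metrics named in the abstract above, can be illustrated with a minimal sketch. It is the mean of the per-class recalls, so a majority class cannot dominate the score. The labels and predictions below are invented for illustration and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    total = defaultdict(int)    # number of true examples per class
    correct = defaultdict(int)  # number of correctly predicted examples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced sample: four positive reviews, two negative reviews.
y_true = ["pos", "pos", "pos", "pos", "neg", "neg"]
y_pred = ["pos", "pos", "pos", "neg", "neg", "pos"]

score = balanced_accuracy(y_true, y_pred)
print(score)
```

In this toy sample the recall is 3/4 for the positive class and 1/2 for the negative class, so the balanced accuracy is lower than the plain accuracy of 4/6.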
The Impact of the L1/L2 Ratio on Selection Stability and Solution Sparsity along the Elastic Net Regularization Path in High-Dimensional Genomic Data
Fahira, Fani; Sadik, Kusman; Suhaeni, Cici; M Soleh, Agus
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.12059

Abstract

High-dimensional genomic datasets (p>n) pose persistent challenges for predictive modeling and biomarker-oriented feature selection due to multicollinearity and instability of selected feature sets under resampling. Although Elastic Net is widely used to address correlated predictors via combined L1/L2 regularization, the practical role of the L1/L2 mixing ratio (α) is often treated as a secondary tuning choice driven primarily by predictive accuracy. This study investigates how varying α shapes the trade-off among selection stability, solution sparsity, and predictive performance along the Elastic Net regularization path. Experiments were conducted using the publicly available METABRIC breast cancer cohort (n = 1,964) with 21,113 gene expression features and a binary overall survival status outcome. Logistic regression with Elastic Net penalty was fitted across a grid of α values, with the regularization strength (λ) selected by cross-validation. Feature selection stability was evaluated under repeated resampling using the Jaccard index, Dice coefficient, and Adjusted Rand Index (ARI), while sparsity was summarized by the average number of non-zero coefficients; predictive performance was assessed using AUC, accuracy, and F1-score. Results show a monotonic decline in stability as α increases: α = 0.2 yields the highest stability (Jaccard 0.324, Dice 0.487, ARI 0.434), whereas LASSO (α = 1.0) produces the lowest stability (Jaccard 0.278, Dice 0.431, ARI 0.400). In contrast, predictive performance varies only marginally across α (AUC 0.696–0.704; accuracy 0.666–0.671; F1-score 0.738–0.742), while sparsity changes substantially (average selected features 110–204). Coefficient path analyses further illustrate abrupt shrinkage under LASSO versus smoother, group-preserving shrinkage under Elastic Net, consistent with improved reproducibility under lower-to-moderate α. 
Frequency-of-selection analysis highlights genes repeatedly selected across resampling, supporting interpretability of stable configurations without claiming causal biomarker validity. Overall, the findings demonstrate that α is a substantive modeling choice that materially affects stability and sparsity even when accuracy is similar, motivating stability-aware tuning for high-dimensional genomic prediction and reproducible feature discovery.
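The stability measures mentioned in the abstract above can be sketched as follows, assuming stability is summarized as the mean pairwise similarity between the feature sets selected on different resamples. The abstract does not state the exact aggregation used, so that choice is an assumption, and the gene names are hypothetical placeholders.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    denom = len(a) + len(b)
    return 2 * len(a & b) / denom if denom else 1.0


def mean_pairwise(metric, selections):
    """Average a set-similarity metric over all pairs of selected feature sets."""
    pairs = list(combinations(selections, 2))
    return sum(metric(a, b) for a, b in pairs) / len(pairs)


# Hypothetical features selected on three resamples of the same data.
selections = [
    {"g1", "g2", "g3"},
    {"g2", "g3", "g4"},
    {"g1", "g2", "g3", "g4"},
]

print(mean_pairwise(jaccard, selections))
print(mean_pairwise(dice, selections))
```

Values closer to 1 mean that the same features are selected again and again across resamples; values near 0 mean the selected sets barely overlap.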
Household Climate Resilience Index and Its Determinants: An Empirical Study in DKI Jakarta
Sundari, Marta; Sadik, Kusman; Wigena, Aji Hamim; Fitrianto, Anwar; Boer, Rizaldi
Jurnal Pengelolaan Sumberdaya Alam dan Lingkungan (Journal of Natural Resources and Environmental Management) Vol 16 No 2 (2026): Jurnal Pengelolaan Sumberdaya Alam dan Lingkungan (JPSL)
Publisher : Pusat Penelitian Lingkungan Hidup, IPB (PPLH-IPB) dan Program Studi Pengelolaan Sumberdaya Alam dan Lingkungan, IPB (PS. PSL, SPs. IPB)

DOI: 10.29244/jpsl.16.2.162

Abstract

Climate change has intensified environmental pressures in urban coastal areas, particularly in DKI Jakarta, where recurrent flooding, tidal inundation, and heat extremes threaten urban sustainability. This study developed a Household Climate Resilience Index (HCRI) to assess the resilience of urban households to climate-related hazards using a robust principal component analysis (RPCA) framework. The analysis was based on household survey data from 221 respondents across 17 urban villages in Jakarta, encompassing four resilience dimensions: exposure, sensitivity, incremental adaptation, and transformational adaptation. RPCA with a minimum covariance determinant estimator was applied to minimize the influence of outliers and ensure stable component estimation. The results reveal clear spatial heterogeneity in resilience, characterized by a distinct north–south gradient: northern coastal areas such as Kamal, Koja, and Pluit show the lowest resilience due to high flood exposure and land subsidence, whereas central and southern areas exhibit stronger adaptive capacity. The key determinants of resilience include flood frequency, household education levels, per-family expenditure, and proactive adaptation behaviors. The Kendall correlation test (τ = 0.518, p = 0.015) confirmed a significant positive association between flood occurrence and low resilience levels. The HCRI provides a robust, data-driven framework to support targeted climate adaptation policies and urban resilience planning in Jakarta, Indonesia. Together with the identified key determinants, HCRI outputs can guide the prioritization of urban environmental management and adaptation investments in the most vulnerable urban villages, including drainage upgrading, land subsidence control, and coastal protection.
DETECTION OF ADULTERATION IN COCONUT MILK USING CUCKOO SEARCH-OPTIMIZED XGBOOST ON HIGH-DIMENSIONAL FTIR SPECTRAL DATA
Sentana Putra, I Gusti Ngurah; Sadik, Kusman; Soleh, Agus Mohamad; Suhaeni, Cici
JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika) Vol 10, No 3 (2025)
Publisher : STKIP PGRI Tulungagung

DOI: 10.29100/jipi.v10i3.8376

Abstract

Coconut milk adulteration is an important issue because it can reduce food quality and endanger consumers. This study aims to develop a rapid and accurate method for detecting coconut milk adulteration by combining FTIR spectroscopy with the XGBoost machine learning algorithm, optimized using the Cuckoo Search Algorithm (CSA). FTIR spectral data from traditional and instant coconut milk samples were preprocessed with Standard Normal Variate (SNV) and Savitzky-Golay (SG) filtering to reduce noise and clarify spectral features. The hyperparameters of the XGBoost model were then tuned with CSA. The results showed that the combined SNV+SG preprocessing raised the model accuracy to 84.44%, with a precision of 92.73% and an F1-score of 79.94%. In addition, CSA optimization provided a 19.7% increase in accuracy compared to the model without tuning. These findings demonstrate the effectiveness of the CSA-XGBoost approach for analyzing high-dimensional spectral data and indicate that it is a potential solution for efficiently verifying the authenticity of coconut milk. In conclusion, this approach has the potential to be widely applied to test the authenticity of other food products quickly, non-destructively, and accurately.
Perbandingan Performa MSGARCH, LSTM, dan Hybrid MSGARCH-LSTM pada Peramalan Data Deret Waktu yang Mengandung Heteroskedastisitas
Freya, Wa Ode Rona; Sadik, Kusman; Susetyo, Budi
ESTIMASI: Journal of Statistics and Its Application Vol. 7, No. 1, Januari, 2026 : Estimasi
Publisher : Hasanuddin University

DOI: 10.20956/ejsa.v7i1.45934

Abstract

Volatility forecasting is crucial for estimating potential portfolio losses, particularly in cryptocurrency markets like Bitcoin, which exhibit high and irregular price fluctuations. Models from the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) family, including Markov Switching GARCH (MSGARCH), are widely used to handle heteroscedastic data and capture regime changes. Meanwhile, Long Short-Term Memory (LSTM) is effective for modeling nonlinear and complex patterns in financial time series. This study proposes a hybrid MSGARCH-LSTM model by incorporating MSGARCH predictions as additional input to the LSTM. The model is evaluated using simulated data resembling Bitcoin's characteristics, with Heteroscedasticity Mean Absolute Error (HMAE) as the primary metric, and analyzed using ANOVA and Tukey's post-hoc test. The results identify four superior hybrid configurations, all of which significantly outperform the standalone MSGARCH and LSTM models. Based on the characteristics of Bitcoin data, the MSGARCH (2-regime with sged error distribution)-LSTM model is selected for empirical analysis. This model achieved an HMAE of 0.3197 and an HMSE of 0.2088, with accuracy improvements of 61.20% and 83.50% compared to the standalone MSGARCH model. These findings indicate that the hybrid MSGARCH-LSTM model improves volatility forecasting accuracy in highly volatile cryptocurrency markets.
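The abstract above does not define HMAE and HMSE. The sketch below assumes the commonly used heteroscedasticity-adjusted forms, in which each error is the deviation of the forecast-to-realized volatility ratio from 1; the paper's exact definition may differ. The function names and the numbers are invented for illustration.

```python
def hmae(realized, forecast):
    """Heteroscedasticity-adjusted MAE: mean of |1 - forecast / realized|."""
    errors = [abs(1 - f / r) for r, f in zip(realized, forecast)]
    return sum(errors) / len(errors)


def hmse(realized, forecast):
    """Heteroscedasticity-adjusted MSE: mean of (1 - forecast / realized) ** 2."""
    errors = [(1 - f / r) ** 2 for r, f in zip(realized, forecast)]
    return sum(errors) / len(errors)


# Hypothetical realized and forecast volatilities (must be strictly positive).
realized = [1.0, 2.0, 4.0]
forecast = [1.5, 2.0, 3.0]

print(hmae(realized, forecast))
print(hmse(realized, forecast))
```

Because each error is scaled by the realized value, periods of high volatility do not dominate the score the way they would under an unscaled MAE or MSE.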
Technical Analysis of the Indonesian Stock Market with Gated Recurrent Unit and Temporal Convolutional Network
Siti Aisyah; Yenni Angraini; Kusman Sadik; Bagus Sartono; Gerry Alfa Dito
JUITA: Jurnal Informatika JUITA Vol. 12 No. 2, November 2024
Publisher : Department of Informatics Engineering, Universitas Muhammadiyah Purwokerto

DOI: 10.30595/juita.v12i2.23464

Abstract

Big data is essential in the Industry 4.0 era because it forms the basis of decision making. In recent years, deep learning has proven effective at capturing complex patterns in big data, especially in the finance sector. The rapid growth of the Indonesian stock market over the last 20 years, driven by globalization, has been accompanied by fluctuations in the Bursa Efek Jakarta (JKSE), which is influenced by stock prices, commodity prices, and exchange rates. This study identifies the main indicators of Indonesian stock market crises and applies and compares deep learning models, specifically the Gated Recurrent Unit (GRU) and the Temporal Convolutional Network (TCN), for predicting stock prices. The study identified 20 JKSE crisis points in the 2002-2023 period, with an average return of around -6%. All variables correlated positively with JKSE, with SET.BK the most highly correlated variable at lag 0. The American and European stock markets, commodity prices, and exchange rates tend to show a pattern opposite to the JKSE crisis. The predictor variables STI, HIS, KLSE, KS11, SET.BK, PSEI.PS, RUT, and USDIDR were chosen based on significant cross-correlations and average return plots. Hyperparameter tuning and cross-validation within a 3-year window led to the conclusion that the GRU model is accurate and efficient, with an RMSE of 43.35568 and an MAE of 33.66909 on the validation data.
PEMODELAN DATA TERSENSOR KANAN MENGGUNAKAN ZERO INFLATED NEGATIVE BINOMIAL DAN HURDLE NEGATIVE BINOMIAL
Rumahorbo, Kusni Rohani; Susetyo, Budi; Sadik, Kusman
Indonesian Journal of Statistics and Applications Vol 3 No 2 (2019)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v3i2.247

Abstract

Health is very important for humanity. One way to assess a person's health condition is through the number of unhealthy days, which can also indicate the productivity of the community in a region. The number of unhealthy days is an example of count data and can be modeled using Poisson regression. Problems often encountered with count data are overdispersion and excess zeros, and Poisson regression cannot be applied to data that exhibit both. Zero-Inflated Negative Binomial and Hurdle Negative Binomial models were fitted to data under two conditions, uncensored and censored. The explanatory variables are gender, age, marital status, education level, home ownership status, and rural-urban status. According to the AIC and RMSE results, the Zero-Inflated Negative Binomial model on censored data performed best for estimating the number of unhealthy days.
KAJIAN REGRESI KEKAR MENGGUNAKAN METODE PENDUGA-MM DAN KUADRAT MEDIAN TERKECIL
Khotimah, Khusnul; Sadik, Kusman; Rizki, Akbar
Indonesian Journal of Statistics and Applications Vol 4 No 1 (2020)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v4i1.502

Abstract

Regression is a statistical method used to describe the relationship between two or more variables in the form of a regression line equation. This equation is usually estimated by ordinary least squares (OLS). However, OLS is highly sensitive to outliers. One solution to the outlier problem in regression analysis is robust regression. This study used least median of squares (LMS) and multi-stage method (MM) robust regression to analyze data containing outliers. The analysis was carried out on simulated data and on actual data. The simulation results across various scenarios show that the LMS and MM methods perform better than OLS on data containing outliers. The MM method has the lowest average parameter estimation bias, followed by LMS and then OLS. The LMS has the smallest average root mean squared error (RMSE) and the highest average R², followed by MM and then OLS. In the comparison of the three methods on 2017 Indonesian rice production data, which contain 10% outliers, LMS was the best method, with the smallest RMSE (4.44) and the highest R² (98%). The MM method ranked second, with an RMSE of 6.78 and an R² of 96%. OLS produced the largest RMSE and the lowest R², at 23.15 and 58% respectively.
Simulation Study of Robust Geographically Weighted Empirical Best Linear Unbiased Predictor on Small Area Estimation: Simulasi Metode Prediksi Tak Bias Linier Terbaik Empiris Terboboti Geografis Kekar pada Pendugaan Area Kecil
Rakhsyanda, Naima; Sadik, Kusman; Indahwati, Indahwati
Indonesian Journal of Statistics and Applications Vol 5 No 1 (2021)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v5i1p50-60

Abstract

Small area estimation can be used to predict population parameters from small sample sizes. In some cases, population units that are spatially close may be more related than units that are further apart. This research studies the use of spatial information such as geographic coordinates. Outlier contamination can also affect small area estimates. The study was conducted by simulation on generated data under six scenarios, which combine spatial effects (spatially stationary and spatially non-stationary) with outlier contamination (no outliers, symmetric outliers, and non-symmetric outliers). The purpose was to compare the geographically weighted empirical best linear unbiased predictor (GWEBLUP) and robust GWEBLUP (RGWEBLUP) with the direct estimator, EBLUP, and REBLUP using simulated data. Predictor performance was evaluated using the relative root mean squared error (RRMSE). The simulation results showed that geographically weighted predictors have the smallest RRMSE values in the spatially non-stationary scenarios and therefore offer better prediction. In scenarios with outliers, robust predictors have smaller RRMSE values and are more efficient than non-robust predictors.
On the Use of Zero-Inflated Mixed Models for Count Data: A Simulation and Empirical Evidence
Anang Kurnia; Zafira Fakhriyah; Kusman Sadik; Dian Handayani
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1303

Abstract

This paper evaluates the performance of classical count regression models (Poisson, Negative Binomial, Generalized Poisson), zero-inflated models (Zero-Inflated Poisson/ZIP, Zero-Inflated Negative Binomial/ZINB, Zero-Inflated Generalized Poisson/ZIGP), and zero-inflated mixed models (ZIPMM, ZINBMM, ZIGPMM) for over-dispersed count data, particularly due to excess zeros and unobserved heterogeneity. Using simulation and empirical studies, we evaluated the performance of the models based on their predictive capability and their ability to yield valid inferences through hypothesis testing. The simulation, replicated 1000 times, involves 27 scenarios that combine various sample sizes, proportions of zero counts, and response variable distributions. Our findings indicate that ZIGPMM and ZINBMM provide the smallest root mean square error (RMSE) values. Although the Poisson model yields a relatively small RMSE, it does not adequately account for overdispersion, leading to underestimated standard errors and potentially misleading significance tests. The negative binomial model yields dispersion estimates closest to 1, indicating good performance, whereas ZIGP, ZINB, ZIGPMM, and ZINBMM perform better when zero counts are extremely high. Empirical analysis of data on under-five mortality due to pneumonia in Java Island, Indonesia, indicates that ZINB, ZINBMM, and ZIGPMM have the smallest Akaike Information Criterion (AIC), making them the most suitable models. These models show that exclusive breastfeeding and vitamin A have no significant effect on under-five child mortality due to pneumonia, while severe malnutrition has a statistically significant impact (α=0.05).
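To make the idea of excess zeros concrete, here is a minimal sketch of the zero-inflated Poisson (ZIP) probability mass function, the simplest of the zero-inflated models listed in the abstract above. With probability `pi` an observation is a structural zero; otherwise it follows a Poisson distribution with mean `lam`. The parameter values are illustrative assumptions and are not estimates from the paper.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson with rate lam and zero-inflation pi."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson component.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


lam, pi = 2.0, 0.3  # hypothetical parameter values

p_zero_zip = zip_pmf(0, lam, pi)
p_zero_poisson = math.exp(-lam)
zip_mean = sum(k * zip_pmf(k, lam, pi) for k in range(100))

print(p_zero_zip, p_zero_poisson, zip_mean)
```

The ZIP model puts more mass at zero than a plain Poisson with the same rate, and its mean shrinks to `(1 - pi) * lam`, which is why a plain Poisson fit to such data tends to understate the variability.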
Co-Authors . Erfiani . Indahwati A.Tuti Rumiati Aam Alamudi Abdullah, Adib Roisilmi Achmad Fauzan Agus Mohamad Soleh Ahmad Rifai Nasution Aji Hamim Wigena Akbar Rizki Akbar Rizki Akmala Firdausi Alfiryal, Naufalia Amalia, Rahmatin Nur Anadra, Rahmi Ananda Shafira Anang Kurnia Andespa, Reyuli Andi Okta Fengki ASEP SAEFUDDIN Astari, Reka Agustia Astari, Reka Agustia Aulya Permatasari Azka Ubaidillah Bagus Sartono Budi Susetyo Budi Susetyo Cici Suhaeni Cici Suhaeni Dian Handayani Dito, Gerry Alfa Dwi Agustin Nuriani Sirodj Efriwati Efriwati Embay Rohaeti Eminita, Viarti Evita Purnaningrum Fahira, Fani FARDILLA RAHMAWATI Farit Mochamad Afendi Fitrianto, Anwar Freya, Wa Ode Rona Gerry Alfa Dito Haikal, Husnul Aris Hari Wijayanto Hasnataeni, Yunia Hazan Azhari Zainuddin Hermawati, Neni I Gusti Ngurah, Sentana Putra I Made Sumertajaya I Wayan Mangku Indahwati Indahwati Indahwati Intan Arassah, Fradha Iqbal, Teuku Achmad Isnanda, Eriski Kamila, Sabrina Adnin Khairi A N Khairil Anwar Notodiputro Khikmah, Khusnia Nurul khusnul khotimah Khusnul Khotimah Kusni Rohani Rumahorbo Latifah, Leli Lili Puspita Rahayu Logananta Puja Kusuma M Soleh, Agus Mochamad Ridwan Mochamad Ridwan, Mochamad Mohammad Masjkur Muh Nur Fiqri Adham Muhammad Yusran Mulianto Raharjo Naima Rakhsyanda Nisrina Az-Zahra, Putri Nur Khamidah NURADILLA, SITI Nusar Hajarisman Pangestika, Dhita Elsha Parwati Sofan, Parwati Purnama Sari Rakhsyanda, Naima Rifqi Aulya Rahman Rita Rahmawati Rizaldi Boer Rizki, Akbar Rizqi, Tasya Anisah ROCHYATI ROCHYATI Rumahorbo, Kusni Rohani Sahamony, Nur Fitriyani Saleh, Agus Muhammad Satriyo Wibowo Sentana Putra, I Gusti Ngurah Siregar, Jodi jhouranda Siti Aisyah Siti Raudlah Sitti Nurhaliza Soleh, Agus M Suhaeni, Cici Sundari, Marta Supriatin, Febriyani Eka Tendi Ferdian Diputra Titin Suhartini Titin Suhartini, Titin Tri Wahyuni Uswatun Hasanah Utami Dyah Syafitri Viarti Eminita Widhiyanti Nugraheni Yenni Angraini Yenni Kurniawati Yuli Eka Putri Zafira Fakhriyah