Articles

Found 16 Documents

Two-stage Gene Selection and Classification for a High-Dimensional Microarray Data
Rochayani, Masithoh Yessi; Sa'adah, Umu; Astuti, Ani Budi
JOIN (Jurnal Online Informatika) Vol 5 No 1 (2020)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

DOI: 10.15575/join.v5i1.569

Abstract

Microarray technology has benefited cancer diagnosis and classification. However, classifying cancer from microarray data is difficult because the datasets are high-dimensional. One strategy for dealing with the dimensionality problem is to perform feature selection before modeling. Lasso is a common regularization method for reducing the number of features or predictors. However, Lasso still retains too many features at the optimum regularization parameter, so feature selection can be continued in a second stage. We propose Classification and Regression Tree (CART) for second-stage feature selection, which also produces a classification model. We used a dataset comparing gene expression in breast tumor tissues and other tumor tissues. This dataset has 10,936 predictor variables and 1,545 observations. The proposed method selected only a small number of genes while achieving high accuracy. The model is also in line with oncogenomics theory: GATA3, an important marker for breast tumors, was selected to split the root node of the decision tree.
Geographically Weighted Random Forest Model for Addressing Spatial Heterogeneity of Monthly Rainfall with Small Sample Size
Damayanti, Rismania Hartanti Putri Yulianing; Astutik, Suci; Astuti, Ani Budi
CAUCHY: Jurnal Matematika Murni dan Aplikasi Vol 10, No 1 (2025): CAUCHY: JURNAL MATEMATIKA MURNI DAN APLIKASI
Publisher : Mathematics Department, Universitas Islam Negeri Maulana Malik Ibrahim Malang

DOI: 10.18860/cauchy.v10i1.32161

Abstract

Rainfall modeling often involves complex spatial patterns that vary across locations. Traditional spatial models such as Geographically Weighted Regression (GWR) assume linear relationships and may fall short in capturing nonlinear interactions among predictors, and with a small sample size their assumptions are harder to satisfy. To address this limitation, this study applies the Geographically Weighted Random Forest (GWRF) method, a hybrid approach that integrates Random Forest (RF), a non-parametric machine learning algorithm, with geographically weighted modeling. GWRF is advantageous because it accommodates both spatial heterogeneity and nonlinear relationships, making it suitable for modeling monthly rainfall, which is inherently spatially varied and influenced by complex factors. This study aims to implement and evaluate the performance of the GWRF model in monthly rainfall prediction across East Java. The model is tested with various numbers of trees to determine the optimal structure, and its performance is assessed using Root Mean Square Error (RMSE), Akaike Information Criterion (AIC), and corrected AIC (AICc). Results indicate that the model tends to overestimate the Out-of-Bag (OOB) error at all tree counts, with the smallest RMSE (85.68) achieved at 750 trees. Based on variable importance analysis, humidity emerges as the most influential variable in predicting monthly rainfall in the region.
Geographically Weighted Poisson Regression Modeling Using Adaptive Gaussian Kernel Weighting For Mapping Maternal Mortality Rates In East Java
Ngoro, Inayati; Pramoedyo, Henny; Astuti, Ani Budi
Jambura Journal of Biomathematics (JJBM) Volume 6, Issue 4: December 2025
Publisher : Department of Mathematics, Universitas Negeri Gorontalo

DOI: 10.37905/jjbm.v6i4.30411

Abstract

Maternal Mortality Rate (MMR) is a key public health indicator that varies spatially across districts in East Java. This study aims to model the spatial distribution of MMR using Geographically Weighted Poisson Regression (GWPR) with an adaptive Gaussian kernel weighting function. Secondary data were obtained from the 2022 East Java Provincial Health Profile, covering 38 districts and municipalities. The results indicate that GWPR outperforms classical Poisson regression. The intercept β = 2.889 (exp(β) = 17.95) suggests an average of 18 maternal deaths in the absence of predictor effects. The coverage of the fourth antenatal care visit (K4) has a significant negative effect (β = -0.027; exp(β) = 0.973), indicating that a 1% increase in K4 coverage reduces MMR by approximately 2.7%. Conversely, obstetric complications managed by midwives show a significant positive effect (β = 0.0173; exp(β) = 1.017), meaning that a 1% increase in complications raises MMR by 1.7%. The other predictors (first antenatal care visit (K1), iron-folic acid (IFA) supplementation, and number of health workers) are not statistically significant. This study underscores the importance of expanding K4 coverage and strengthening complication management as priority strategies to reduce maternal mortality. Furthermore, GWPR-based mapping enables more targeted maternal health interventions tailored to local characteristics.
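The percentage interpretations quoted in this abstract follow from the log link of a Poisson model, where a one-unit increase in a predictor multiplies the expected count by exp(β). The short sketch below reproduces that arithmetic from the two significant slopes reported above; the function names are illustrative and are not taken from the paper.

```python
import math

# Slope coefficients as reported in the abstract.
slopes = {
    "K4 coverage": -0.027,
    "complications managed by midwives": 0.0173,
}

def rate_ratio(beta):
    """Multiplicative change in the expected count per one-unit increase."""
    return math.exp(beta)

def percent_change(beta):
    """Percentage change in the expected count per one-unit increase."""
    return (math.exp(beta) - 1.0) * 100.0

for name, beta in slopes.items():
    print(f"{name}: exp(beta) = {rate_ratio(beta):.3f}, "
          f"change = {percent_change(beta):+.1f}%")
```

For small coefficients, exp(β) - 1 is close to β itself, which is why the slopes and the quoted percentage changes nearly coincide.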
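ALGORITMA DBSCAN DAN SHARED NEAREST NEIGHBOR DALAM PENGELOMPOKKAN SPASIAL PRODUKTIVITAS JERUK SIAM DI INDONESIA [DBSCAN and Shared Nearest Neighbor Algorithms for Spatial Clustering of Siam Orange Productivity in Indonesia]
Oktavia, Nur Sofi Sely; Iriany, Atiek; Astuti, Ani Budi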
MATHunesa: Jurnal Ilmiah Matematika Vol. 13 No. 3 (2025)
Publisher : Universitas Negeri Surabaya

DOI: 10.26740/mathunesa.v13n3.p45-58

Abstract

This study applies two density-based clustering algorithms, DBSCAN and Shared Nearest Neighbor (SNN), integrated with Principal Component Analysis (PCA), to Siam orange productivity in Indonesia in 2023. The variables comprise seven Siam orange indicators: productivity (tons/tree), production (tons), production growth (%), harvested area (hectares), average monthly temperature (°C), average monthly humidity (%), and average monthly rainfall (mm). The data are secondary data for 2023 obtained from publications of BPS and the Ministry of Agriculture of the Republic of Indonesia. The results show that SNN gives more stable clustering than DBSCAN, and that applying PCA improves the performance of both DBSCAN and SNN. The best model was obtained from SNN on the PCA data with three principal components (PC3), with a Silhouette Coefficient of 0.872. This algorithm produced three clusters: Cluster 0 covers 32 provinces with large-scale production and diverse agroclimatic conditions; Cluster 1 consists of 3 provinces with small production but very high growth, which separates them from the main production centers; and Cluster 2 covers 3 provinces with unique local characteristics and low to zero production.
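The abstract does not spell out how SNN measures closeness. As a rough illustration only, the sketch below uses a common Jarvis-Patrick-style definition: two points are similar in proportion to how many of their k nearest neighbours they share, counted only when each is in the other's neighbour list. The function names, the toy points, and the choice k = 2 are assumptions for the example, not details from the paper.

```python
import math

def k_nearest(points, k):
    """For each point, return the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbours.append(set(others[:k]))
    return neighbours

def snn_similarity(points, k):
    """Shared-nearest-neighbour similarity matrix (assumed definition).

    sim[i][j] is the number of neighbours shared by i and j, counted only
    when i and j appear in each other's k-nearest-neighbour lists.
    """
    nn = k_nearest(points, k)
    n = len(points)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j in nn[i] and i in nn[j]:
                sim[i][j] = sim[j][i] = len(nn[i] & nn[j])
    return sim

# Two well-separated toy groups (hypothetical data).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_sim = snn_similarity(toy_points, k=2)
```

A density-based step such as DBSCAN can then be run on this similarity instead of raw distances, which is one reason SNN tends to be less sensitive to differences in local density.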
Bayesian IGARCH Modeling of Jakarta Composite Index Volatility Using Hamiltonian Monte Carlo Algorithm
Maulana, Eka Dani; Sumarminingsih, Eni; Nurjannah; Astuti, Ani Budi; Astutik, Suci
Science and Technology Indonesia Vol. 11 No. 1 (2026): January
Publisher : Research Center of Inorganic Materials and Coordination Complexes, FMIPA Universitas Sriwijaya

DOI: 10.26554/sti.2026.11.1.261-279

Abstract

Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models are time series models for volatility in financial data, especially stock market indices such as the Jakarta Composite Index (JCI). Following the ratification of the revised Armed Forces Law in March 2025, the JCI experienced increasing volatility that appears to be persistent. This calls for a time series model that can capture persistent volatility, namely the Integrated Generalized Autoregressive Conditional Heteroskedasticity (IGARCH) model. Parameter estimation for IGARCH models generally uses Maximum Likelihood Estimation (MLE), which has limitations in handling parameter uncertainty. The Bayesian approach can address parameter uncertainty through Markov Chain Monte Carlo (MCMC) methods. Among these, Hamiltonian Monte Carlo (HMC) is more efficient than Metropolis-Hastings and Gibbs sampling, particularly in exploring complex posterior distributions. This study uses daily closing prices of the JCI as the main observation variable, observed from April 3, 2023, to April 9, 2025, and aims to construct a volatility model for the JCI using a Bayesian IGARCH model with an HMC algorithm. Only the IGARCH(1,1) model is considered. The model captures the JCI's volatility structure well, and its point forecasts are stable. However, the credible intervals reveal the level of uncertainty, so JCI volatility may decrease or increase.
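For readers unfamiliar with IGARCH(1,1), the sketch below shows the conditional variance recursion in its usual textbook form, where the ARCH and GARCH coefficients are constrained to sum to one. It covers only the recursion, not the Bayesian HMC estimation used in the paper; the function name, argument names, and the assumption that returns are already mean-adjusted are illustrative.

```python
def igarch11_variance(returns, omega, alpha, sigma2_0):
    """Conditional variances from an IGARCH(1,1) recursion.

    sigma2[t] = omega + alpha * returns[t-1]**2 + (1 - alpha) * sigma2[t-1]

    The GARCH coefficient is fixed at 1 - alpha, which is the unit-persistence
    constraint that defines IGARCH. `returns` are assumed mean-adjusted.
    The result has len(returns) + 1 entries; the last one is the
    one-step-ahead variance forecast.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if omega < 0.0 or sigma2_0 <= 0.0:
        raise ValueError("omega must be >= 0 and sigma2_0 must be > 0")
    beta = 1.0 - alpha
    sigma2 = [sigma2_0]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2
```

Because the coefficients sum to one, a shock to the variance does not decay toward a long-run level, which is the sense in which the model captures persistent volatility.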
Flood Prediction Using Modeling Extreme Rainfall in East Java, Indonesia
Irsandy, Diego; Astutik, Suci; Astuti, Ani Budi
Plantropica: Journal of Agricultural Science Vol. 11 No. 1 (2026): Februari
Publisher : Department of Agronomy, Faculty of Agriculture, Brawijaya University

DOI: 10.21776/ub.jpt.2026.011.1.2

Abstract

Extreme value theory (EVT) is a statistical method concerned with the analysis of the extreme values of a distribution. EVT is often used to model the behavior of rare and extreme events, such as floods caused by extreme rainfall. There are two methods for identifying extreme values, namely Block Maxima (BM) and Peaks over Threshold (POT). The Generalized Extreme Value (GEV) distribution has three parameters and is used to model the distribution of extreme values under the BM method. However, the classical EVT approach does not capture uncertainty in the data. The Bayesian method is a statistical method that can use information from both data and prior knowledge. This research aims to model EVT-BM using a Bayesian approach for rainfall data at eleven weather stations in East Java. The results show that all rainfall distributions at different weather conditions have a shape parameter equal to 0, which implies a Weibull distribution. This paper also provides return levels for 6 months and for 2, 5, and 10 years.
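A return level is the value expected to be exceeded once every T blocks on average, and it can be computed in closed form from the three GEV parameters. The sketch below uses the standard GEV parameterization (location mu, scale sigma, shape xi) with the limiting form as xi approaches 0; the function name is illustrative, the return period is measured in blocks (an assumption, since the abstract does not state the block length), and no parameter values from the paper are used.

```python
import math

def gev_return_level(mu, sigma, xi, return_period):
    """Return level of a GEV(mu, sigma, xi) fitted to block maxima.

    `return_period` is measured in blocks and must exceed 1, so that the
    per-block exceedance probability p = 1 / return_period is below 1.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if return_period <= 1.0:
        raise ValueError("return_period must be greater than 1 block")
    p = 1.0 / return_period
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-9:
        # Limiting form of the GEV quantile as xi -> 0.
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)
```

With monthly blocks, for example, a 2-year return level would correspond to `return_period=24`.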