Articles

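Mendeteksi Unsur Depresi pada Unggahan Media Sosial Menggunakan Metode Machine Learning dengan Optimasi Berbasis Inspirasi Alam [Detecting Signs of Depression in Social Media Posts Using Machine Learning Methods with Nature-Inspired Optimization] Santoso, Zein Rizky; Wigena, Aji Hamim; Kurnia, Anang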
ESTIMASI: Journal of Statistics and Its Application Vol. 6, No. 2, Juli, 2025 : Estimasi
Publisher : Hasanuddin University

DOI: 10.20956/ejsa.v6i2.45516

Abstract

Social media has now become an inseparable part of everyday life, including in expressing emotions and mental states. One popular platform is X (formerly Twitter), where many users indirectly share signs of depression. This study develops a classification model to detect indications of depression in social media posts, using machine learning algorithms and feature selection techniques based on nature-inspired algorithms. The classification algorithms used include Naïve Bayes, k-Nearest Neighbors (k-NN), Decision Tree, Random Forest, and XGBoost. Each algorithm is combined with feature selection techniques using Particle Swarm Optimization (PSO), Bat Algorithm (BA), and Flamingo Search Algorithm (FSA). The models are evaluated based on accuracy, precision, recall, F1-score, and the number of features used. The results show that the combination of the Random Forest method with FSA-based feature selection (RF-FSA) delivers the best performance, with an accuracy of 82.2%, balanced precision and recall, and efficient feature usage. Another strong alternative is XGBoost with FSA (XGB-FSA), although it requires more features and longer computational time. This study demonstrates that selecting the right feature selection algorithm, particularly FSA, can significantly improve both the accuracy and efficiency of depression text classification models. The resulting model is expected to serve as a useful tool for early detection of depression symptoms from social media posts, allowing for quicker and more targeted interventions.
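
The evaluation criteria named in this abstract (accuracy, precision, recall, F1-score) can be illustrated with a minimal sketch. The function name and the confusion-matrix counts below are hypothetical and are not taken from the paper; they only show how the four metrics relate for a binary depressed / not-depressed classifier.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

When precision and recall are equal, as in this toy case, F1 takes the same value, which is one way to read the "balanced precision and recall" wording in the abstract.
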
D-OPTIMAL DESIGNS FOR SPLIT-PLOT MIXTURE PROCESS VARIABLE DESIGNS OF THE STEEL SLAG EXPERIMENT Arina, Faula; Wigena, Aji Hamim; Sumertajaya, I Made; Syafitri, Utami
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 16 No 1 (2022): BAREKENG: Jurnal Ilmu Matematika dan Terapan
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol16iss1pp303-312

Abstract

The steel slag concrete experiment followed a mixture process variable (MPV) design. In this study, the concrete was composed of five mixture components (cement, fine aggregate, coarse aggregate, the percentage of steel slag replacing the fine aggregate, and water), and the process variable was the size of the steel slag. Because of constraints on the components, the experimental region was not a simplex. The standard MPV design for a quadratic model requires a large number of experimental runs. In this paper, a D-optimal design with a split-plot MPV approach is proposed. The five mixture components were assigned as the subplot factors and the process variable was assigned as the whole-plot factor. The main contribution is a modified point-exchange algorithm developed to generate the D-optimal design. In addition, the paper investigates a related issue, namely the estimation of the covariance matrix in the MPV split-plot design. The final design consisted of 18 whole plots, each of size 2, for a total of 36 observations.
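
For orientation, the D-optimality criterion for a split-plot design is commonly written as follows. This is the standard textbook formulation, given here as an assumption about the setting; the abstract does not state the exact form used in the paper. The symbols $X$, $Z$, $\eta$, and $V$ are introduced here for illustration, with $n = 36$ runs and $b = 18$ whole plots as reported above.

```latex
% Split-plot linear model: n runs grouped into b whole plots.
\[
  y = X\beta + Z\gamma + \varepsilon, \qquad
  \gamma \sim N(0, \sigma_\gamma^2 I_b), \quad
  \varepsilon \sim N(0, \sigma_\varepsilon^2 I_n)
\]
% Covariance of the responses, with variance ratio
% \eta = \sigma_\gamma^2 / \sigma_\varepsilon^2.
\[
  V = \sigma_\varepsilon^2 \left( I_n + \eta\, Z Z^{\top} \right)
\]
% A D-optimal design maximizes the determinant of the information matrix.
\[
  \max_{\text{design}} \; \left| X^{\top} V^{-1} X \right|
\]
```

Here $Z$ is the $n \times b$ indicator matrix assigning runs to whole plots. A point-exchange algorithm searches for the design by swapping candidate points in and out while tracking this determinant.
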
THE PROMINENCE OF VECTOR AUTOREGRESSIVE MODEL IN MULTIVARIATE TIME SERIES FORECASTING MODELS WITH STATIONARY PROBLEMS Rohaeti, Embay; Sumertajaya, I Made; Wigena, Aji Hamim; Sadik, Kusman
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 16 No 4 (2022): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol16iss4pp1313-1324

Abstract

One of the problems in modelling multivariate time series is stationarity. Stationarity tests do not always find every variable to be stationary; a mix of stationary and non-stationary variables is possible. When stationarity problems arise in multivariate time series modelling, the model's performance must be evaluated under various stationarity conditions to obtain the best forecasting model. This study aims to identify a superior multivariate time series forecasting model based on goodness of fit under various stationarity conditions. Model performance is first evaluated on simulated data and then on actual data with a stationarity problem, namely Bogor City inflation data. In the simulation, the best model is chosen by the stability of RMSE and MAD over 100 replications; the VAR model is the best under all stationarity conditions considered. For the actual data, the best model is chosen by 4-fold evaluation of fitting power and forecasting power. Modelling the Bogor City inflation data, which has mixed stationarity, gave VAR(1) as the best model. This means the VAR model is good enough to be used as a forecasting model under mixed stationarity conditions. Overall, across the two modelling scenarios and the various stationarity conditions, the VAR model was superior to the VARD and VECM models.
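
The two error measures used to rank the models in the simulation can be sketched as below. MAD is read here as the mean absolute deviation between actual and forecast values, which is an assumption since the abstract does not expand the acronym. The series are made-up numbers for illustration only.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mad(actual, forecast):
    """Mean absolute deviation between two equal-length sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical series, not data from the paper.
actual = [2.0, 4.0, 6.0, 8.0]
forecast = [1.0, 5.0, 6.0, 10.0]

print(rmse(actual, forecast), mad(actual, forecast))
```

In a replication study, these values would be computed once per replication and per model, and the spread across replications would indicate how stable each model is.
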
APPLICATION OF PENALIZED SPLINE-SPATIAL AUTOREGRESSIVE MODEL TO HIV CASE DATA IN INDONESIA Pigitha, Nindi; Djuraidah, Anik; Wigena, Aji Hamim
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 1 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss1pp0527-0534

Abstract

Spatial regression analysis is a statistical method for modeling data while accounting for spatial effects. Spatial models generally use a parametric approach that assumes a linear relationship between the explanatory and response variables. Nonparametric regression is better suited to data with a nonlinear relationship because it does not require the linearity assumption. One nonparametric regression method is penalized spline regression (P-Spline). The P-Spline has a simple mathematical relationship with the linear mixed model, which allows it to be combined with other statistical models. PS-SAR combines the P-Spline with the spatial autoregressive (SAR) model so that spatial data can be analyzed with a semiparametric approach. Based on 2018 monitoring data on the HIV situation, the number of HIV cases in Indonesia shows a clustered pattern that indicates spatial dependence. In addition, the relationship between the number of positive cases and the explanatory factors tends to be nonlinear. Therefore, this study applies the PS-SAR model to HIV case data in Indonesia. The resulting model is evaluated using the estimated spatial autoregressive coefficient, MSE, MAPE, and pseudo-R². The results show that the PS-SAR model has a spatial autoregressive coefficient similar to that of the SAR model, with smaller MSE and MAPE.
PRE-PROCESSING DATA ON MULTICLASS CLASSIFICATION OF ANEMIA AND IRON DEFICIENCY WITH THE XGBOOST METHOD Nurrahman, Fathu; Wijayanto, Hari; Wigena, Aji Hamim; Nurjanah, Nunung
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 2 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss2pp0767-0774

Abstract

Anemia and iron deficiency are health problems in Indonesia and globally. Multiclass classification often suffers from data problems such as missing values, too many variables, and class imbalance. In this study, the data are pre-processed using MissForest imputation, Boruta feature selection, and SMOTE to improve the classification model's ability to predict each class. After pre-processing, the classification model is built with the XGBoost algorithm. Pre-processing was found to improve the model's performance in the multiclass classification of anemia and iron deficiency among women in Indonesia, reaching an accuracy of 0.815 and an AUC of 0.9693.
SMALL AREA ESTIMATION WITH HIERARCHICAL BAYES FOR CROSS-SECTIONAL AND TIME SERIES SKEWED DATA Yuniarty, Titin; Indahwati, Indahwati; Wigena, Aji Hamim
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 18 No 1 (2024): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol18iss1pp0493-0506

Abstract

Small Area Estimation (SAE) is a model-based method for estimating small-area parameters that uses the Linear Mixed Model (LMM) as its basis. It is conventionally solved with the Empirical Best Linear Unbiased Prediction (EBLUP). The main requirement for the LMM to produce high-precision estimates is that the data are normally distributed. In this study, the observation units are food crop farmer households in Sulawesi Tenggara Province, and SAE is used to estimate food and non-food per capita expenditure, which is positively skewed, at the district/city level. Applying EBLUP to positively skewed data gives less accurate estimates, while transforming the data can give biased estimates. Therefore, the problems of skewed data and small area level were addressed with a Hierarchical Bayes (HB) approach that combines cross-sectional and time series data under a skew-normal distribution assumption. The skew-normal SAE HB model significantly reduced the Relative Root Mean Squared Error (RRMSE) compared with direct estimation, which indicates that SAE modeling provides a shrinkage effect on the direct estimates. However, the interpretation differs slightly between direct estimation and the skew-normal SAE HB model. This may be because the model assumes that the autocorrelation coefficient equals 1, known as the random walk effect. In reality, Susenas is not panel data, so the observation units may differ from one time period to the next. Further research should therefore compare this model with a skew-normal or other skewed-distribution model in which the autocorrelation coefficient is treated as unknown and estimated within the model.
Biclustering-Based Analysis to Identify Fruit Production Potential in Indonesia Using Plaid Model Algorithm Alwani, Nadira Nisa; Sumertajaya, I Made; Wigena, Aji Hamim
Scientific Journal of Informatics Vol. 12 No. 3: August 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i3.25054

Abstract

Purpose: Biclustering with the plaid model is applied to simultaneously identify grouping patterns of provinces and fruit types in Indonesia. The performance of the plaid model algorithm is evaluated to assess its ability to discover optimal biclusters that represent the relationship between regions and fruit types with similar production characteristics. Methods: The plaid model algorithm produces optimal biclusters by configuring parameter scenarios such as model selection, the number of layers, and threshold values for rows and columns. The average Mean Square Residue (MSR) and the number of biclusters that provide the most relevant information are used to select the optimal parameters. Result: The plaid model algorithm effectively grouped provinces and fruit varieties into multiple biclusters. The row-constant model was chosen based on an average MSR of 2.0537 and formed five overlapping biclusters across provinces and fruit types. Several provinces, such as Central Java and West Java, demonstrated high potential for rose apples, breadfruit, and salak. Other provinces showed comparatively moderate levels of production. Novelty: This study presents a novel application of the plaid model biclustering algorithm to data on fruit varieties across Indonesian provinces. Rarely used in horticulture, this method offers an alternative perspective on structured commodity mapping, especially for identifying specific patterns between fruit varieties and geographic distribution.
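
The Mean Square Residue used as the selection criterion can be illustrated with a short sketch. It follows the usual Cheng and Church definition of the residue of a bicluster; whether the paper computes it in exactly this way is an assumption, and the small matrices below are invented for illustration.

```python
def mean_squared_residue(sub):
    """MSR of a bicluster given as a list of equal-length rows.

    Each cell's residue is value - row mean - column mean + overall mean;
    the MSR is the average of the squared residues.
    """
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# A perfectly additive pattern (row effect + column effect) has MSR 0.
additive = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
# A block with an interaction between rows and columns does not.
noisy = [[1.0, 2.0], [2.0, 5.0]]

print(mean_squared_residue(additive), mean_squared_residue(noisy))
```

A lower MSR indicates a more coherent bicluster. An average MSR over several biclusters would presumably be the mean of the per-bicluster values.
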
The Impact of Using A Linear Model for the Ordinal Response of Mixture Experiments Syafitri, Utami Dyah; Erfiani, Erfiani; Soleh, Agus M; Wigena, Aji Hamim
ZERO: Jurnal Sains, Matematika dan Terapan Vol 9, No 2 (2025): Zero: Jurnal Sains Matematika dan Terapan
Publisher : UIN Sumatera Utara

DOI: 10.30829/zero.v9i2.25760

Abstract

In a sensory test, the response is measured on a Likert scale, which is an ordinal scale. An ordinal response can be analyzed with a linear model approach; however, this approach can be misleading. This research compares three methods for ordinal responses: the average score, the second-order Scheffé model, and the ordinal logistic model. The case study focused on the taste response to cookies produced in a mixture experiment, a type of experimental design commonly used for product formulation. The research involved three ingredients with different lower bounds. The D-optimal design, which coincides with the {3,2} simplex-lattice design, was chosen for the experiment. All three methods were applied, and they yielded the same optimum composition; however, the ordinal model provided more information about the data's characteristics. The optimal formulation of the three ingredients was 10%, 20%, and 70%.
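
For reference, the models compared here have the following standard forms for a three-component mixture. These are textbook formulations offered as a sketch; the exact parameterization used in the paper is not given in the abstract, and the symbols $L_i$, $\theta_k$, and $\eta(x)$ are introduced here for illustration.

```latex
% Second-order Scheffe model for q = 3 components with x_1 + x_2 + x_3 = 1.
\[
  E(y) = \sum_{i=1}^{3} \beta_i x_i
       + \sum_{i<j} \beta_{ij}\, x_i x_j
\]
% With lower bounds L_i on the components, the constrained region is often
% rescaled to a full simplex through L-pseudo-components.
\[
  x_i' = \frac{x_i - L_i}{1 - \sum_{j=1}^{3} L_j}
\]
% Ordinal (cumulative logit) model for a Likert response Y with categories
% k = 1, ..., K - 1, using the same mixture terms in the linear predictor.
\[
  \operatorname{logit} P(Y \le k) = \theta_k - \eta(x), \qquad
  \eta(x) = \sum_{i=1}^{3} \beta_i x_i + \sum_{i<j} \beta_{ij}\, x_i x_j
\]
```

The linear model treats the Likert scores as equally spaced numbers, whereas the cumulative logit model only uses their order, which is why it can reveal more about the response distribution.
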
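Perbandingan Metode Particle Swarm Optimization dan Artificial Bee Colony pada Support Vector Machine [Comparison of Particle Swarm Optimization and Artificial Bee Colony Methods for Support Vector Machines] Hasibuan, Rafika Aufa; Afendi, Farit Mochamad; Wigena, Aji Hamim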
JEPIN (Jurnal Edukasi dan Penelitian Informatika) Vol 11, No 1 (2025): Volume 11 No 1
Publisher : Program Studi Informatika

DOI: 10.26418/jp.v11i1.91235

Abstract

Optimizing classification methods is a crucial aspect of improving model accuracy, especially in the analysis of medical data that are complex and have diverse variable characteristics. This study compares the classification performance of a conventional Support Vector Machine (SVM) with two metaheuristic-based optimization methods, PSO-SVM and ABC-SVM. The evaluation was carried out on four medical datasets, namely Breast Cancer, AIDS Disease, Darwin Disease, and Parkinson Disease, with proportion-based variable selection at 30%, 50%, 70%, and 100% of the total variables in each dataset. The results show that PSO-SVM and ABC-SVM consistently improve classification accuracy compared with the standard SVM. On some datasets, such as Breast Cancer and Parkinson Disease, accuracy increased from 96.22% and 85.53% (SVM) to 100% with PSO-SVM and ABC-SVM. On the AIDS Disease dataset, accuracy increased from 87.36% to 100%. Meanwhile, on the Darwin Disease dataset, which has the highest degree of overlap (OV = 0.99727), the improvement was more limited, from 83.76% (SVM) to 91.65% (ABC-SVM). The best proportion varied across datasets. In general, however, the 70% and 100% proportions gave the most stable accuracy with efficient computation time for PSO-SVM. For ABC-SVM, by contrast, the large gains in accuracy came with much longer execution times, especially on high-dimensional datasets. Further analysis also shows that the optimization methods are effective in handling moderate overlap and class imbalance, but their effectiveness declines under more complex conditions. Thus, PSO-SVM and ABC-SVM can be an efficient approach to improving the classification accuracy of medical data, provided they are matched to the characteristics of the data and the available computational resources.
Performance Analysis of Robust Functional Continuum Regression to Handle Outliers Ismah, Ismah; Erfiani, Erfiani; Wigena, Aji Hamim; Sartono, Bagus
InPrime: Indonesian Journal of Pure and Applied Mathematics Vol. 6 No. 1 (2024)
Publisher : Department of Mathematics, Faculty of Sciences and Technology, UIN Syarif Hidayatullah

DOI: 10.15408/inprime.v6i1.38928

Abstract

Robust functional continuum regression (RFCR) is a development of functional continuum regression that can be applied to functional data and is resistant to outliers. The resistance of RFCR depends on the weighting function applied. This study aims to evaluate the performance of RFCR in handling outliers. We propose several weighting functions for this evaluation, i.e., Huber, Hampel, Ramsay, and Tukey (bisquare), which do not eliminate or give zero weight to observations identified as outliers. This contribution is essential for determining the appropriate RFCR method without eliminating the outlying data. The results show that RFCR with the Huber weighting function performs better than the others, based on goodness of fit measured by the root mean square error of prediction (RMSEP), the correlation between the actual data and the model, and the mean absolute error (MAE).
Keywords: functional data analysis; Huber weighted function; Hampel weighted function; Ramsay weighted function; Tukey (Bisquare) weighted function.
2020MSC: 62J99, 62R10
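
The Huber weighting function named in this abstract has a simple closed form, sketched below. The tuning constant c = 1.345 is the conventional default for standardized residuals and is an assumption here; the abstract does not say which constant the paper uses or how the weights enter the RFCR fit.

```python
def huber_weight(residual: float, c: float = 1.345) -> float:
    """Huber weight for a standardized residual.

    Residuals within [-c, c] keep full weight 1; larger residuals are
    down-weighted in proportion to c / |residual| but never reach zero.
    """
    r = abs(residual)
    return 1.0 if r <= c else c / r


# Hypothetical standardized residuals for illustration.
residuals = [0.2, -1.0, 2.69, -10.0]
weights = [huber_weight(r) for r in residuals]
print(weights)
```

Because the weight only decays and never reaches zero, an observation flagged as an outlier still contributes to the fit, which matches the abstract's emphasis on not eliminating outlying data.
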