Contact Name : Mustakim
Journal Mail Official : ijaidm@uin-suska.ac.id
Location : Kab. Kampar, Riau, Indonesia
Indonesian Journal of Artificial Intelligence and Data Mining
ISSN : 2614-3372     EISSN : 2614-6150
Core Subject : Science
Indonesian Journal of Artificial Intelligence and Data Mining (IJAIDM) is an electronic periodical published by Puzzle Research Data Technology (Predatech), Faculty of Science and Technology, UIN Sultan Syarif Kasim Riau, Indonesia. IJAIDM provides an online medium for publishing scientific articles from research in Artificial Intelligence and Data Mining. IJAIDM is published twice a year, in March and September, and each edition contains seven articles. Articles may be written in English or Indonesian.
Articles : 250 Documents
Application of Categorical Boosting Model in Classifying Diseases of Tomato Leaves
Authors : Rahmah, Fitria; Annisa, Selvi; Anggraini, Dewi
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.38869

Abstract

Tomatoes are a strategic horticultural commodity whose productivity is often hampered by leaf diseases, particularly early blight and late blight. Manual identification through visual inspection is often inaccurate because the symptoms of different diseases look similar. This study aims to improve tomato leaf disease classification with machine learning by addressing a limitation of previous research by Ningsih et al., which focused solely on disease classes and did not include healthy leaf samples, so that the model risked failing to recognize normal plant conditions. The proposed methodology integrates the VGG16 architecture as a feature extractor with the Categorical Boosting (CatBoost) algorithm as a classifier. The dataset, sourced from Kaggle, was cleaned and resized to 224x224 pixels, resulting in 3,285 images. Integrating VGG16 with CatBoost achieves an accuracy of 93.1%, with F1 scores of 90.2% (healthy leaves), 90.3% (early blight), and 98.6% (late blight). Compared to the research by Ningsih et al., this approach not only expands the scope of classification by including the healthy leaf class, but also shows better accuracy in identifying the health conditions of tomato plants.
Personalized Behavioral Analytics for GPS-Validated Attendance Systems Using K-Means Clustering and Individual-Baseline Anomaly Detection
Authors : Abidin, Ashari; Dinata, Riadi Marta; Satrio, Bambang; Petrus, Risma; Lamsir, Seno
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.38881

Abstract

This study develops and evaluates a GPS-based attendance analytics framework integrating three complementary analytical layers for higher education environments. The proposed system combines spatial validation using Haversine-based geofencing, behavioral segmentation through K-Means clustering with multi-metric validation, and personalized anomaly detection employing individual-baseline Z-Score computation. Empirical evaluation utilized 4,300 attendance records from 13 lecturers at FSTT ISTN Jakarta over a 16-month period. K-Means clustering with K=3 achieved a Silhouette Score of 0.634 and a Davies-Bouldin Index of 0.621, identifying three behavioral segments: High Performers (30.8%), Moderate (38.5%), and Improvement Needed (30.8%). The personalized Z-Score method detected 19.9% more anomalies compared to population-based thresholds and reduced detection inequity across lecturer groups. Practically, the framework transforms passive attendance logging into a decision-support tool that enables differentiated monitoring, early behavioral change detection, and fairer evaluation policies. However, the study is limited by a relatively small sample size (13 lecturers) within a single institutional context, which may affect model generalizability. Broader validation across larger and multi-institutional datasets is recommended for future work.
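The two building blocks named in this abstract, Haversine-based geofencing and an individual-baseline Z-Score, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of check-in time in minutes after midnight as the monitored value, and the sample standard deviation are all assumptions made for the example.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(point, centre, radius_m):
    """True if `point` (lat, lon) lies within `radius_m` of `centre` (lat, lon)."""
    return haversine_m(*point, *centre) <= radius_m


def personal_z_score(history, new_value):
    """Z-score of `new_value` against one person's own history.

    `history` is a hypothetical list of past values for a single individual
    (for example check-in times in minutes after midnight). It needs at least
    two values with non-zero spread.
    """
    mean = statistics.fmean(history)
    spread = statistics.stdev(history)  # sample standard deviation
    if spread == 0:
        raise ValueError("history has no variation; z-score is undefined")
    return (new_value - mean) / spread
```

Computing the baseline per person rather than over the whole population means a check-in that is ordinary for the group can still stand out for someone whose own history is very regular, which is the idea behind a personalized threshold.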
Analyzing Opinion Polarization on Joko Widodo's Diploma Using Machine Learning
Authors : Julianti, Julianti; Wajidi, Farid; Musawwir, Musawwir
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.38616

Abstract

The authenticity of former President Joko Widodo's diploma has become a hot topic in the digital space, especially in the comments section of Kompas TV's YouTube channel. The wide diversity of opinions reflects a polarization of public opinion that is worth further analysis. Given the large volume of text data from public comments, manual analysis is ineffective; a technology-based approach is needed to group opinions systematically. Therefore, this study analyzes public opinion polarization using a machine learning approach. Two classification algorithms, Naive Bayes and Random Forest, were used to distinguish between pro and con public comments on the issue. Data were obtained through an automated collection process (web scraping), followed by text pre-processing and TF-IDF (Term Frequency–Inverse Document Frequency) word weighting. The test results showed that the Random Forest algorithm performed best, with an accuracy of 91%, while Naive Bayes achieved only 74%. This shows that the Random Forest method is more effective than the Naive Bayes approach in detecting unstructured text patterns. This study concludes that machine learning can be used effectively to identify trends in public opinion on social media and can serve as a basis for further research using word embedding and deep learning models.
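The TF-IDF weighting step mentioned above can be illustrated with a small sketch. It assumes the plain textbook variant (term frequency times the natural log of N over document frequency) and already-tokenized comments; the abstract does not say which TF-IDF variant or library the authors used, and the function name is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    `docs` is a list of token lists. Weight = (count / doc length) * ln(N / df),
    where df is the number of documents containing the term.
    """
    n_docs = len(docs)
    # Document frequency: count each term once per document.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that appears in every comment gets weight zero under this variant, so only words that separate comments from one another contribute to the feature vector passed to the classifier.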
CT Radiomics and Ensemble Learning for 5-Year Survival Prediction in Colorectal Liver Metastases
Authors : Astuti, Widya; Widodo, Catur Edi; Soesanto, Qidir Maulana Binu
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.39071

Abstract

Colorectal liver metastases (CRLM) significantly impact patient survival with high recurrence rates. Traditional prognostic models often overlook tumor heterogeneity, leading to suboptimal risk stratification. To address this, radiomics was employed to quantify sub-visual tumor phenotypes, while ensemble learning was selected to robustly handle high-dimensional feature complexity and improve generalization capability. This retrospective study analyzed 145 CRLM patients from The Cancer Imaging Archive, extracting 1130 radiomics features from preoperative CT scans alongside clinical variables. Data were split into training (n=101) and testing (n=44) sets, with feature selection reducing the input to 16 key features. Three ensemble models (XGBoost, LightGBM, Random Forest) were optimized using Optuna, incorporating SMOTE and isotonic calibration. On the test set, XGBoost achieved ROC-AUC 0.918, sensitivity 0.739, and specificity 0.952. LightGBM yielded ROC-AUC 0.916, sensitivity 0.782, and specificity 0.904. Random Forest recorded ROC-AUC 0.888, sensitivity 0.826, and specificity 0.667. Key features included "progression or recurrence" and wavelet-based texture metrics reflecting tumor heterogeneity. These findings demonstrate the effectiveness of combining CT radiomics with gradient boosting models to capture complex prognostic patterns. This integration enhances 5-year survival prediction in CRLM, offering a non-invasive tool for personalized risk stratification and improved clinical decision-making compared to the currently utilized traditional prognostic models.
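Sensitivity and specificity, as reported above, are ratios over the confusion-matrix counts. The sketch below shows the arithmetic. The counts are an assumption chosen only because they add up to the 44 test patients and reproduce the XGBoost figures; the abstract does not report the confusion matrix or which outcome was treated as the positive class.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual positives that the model labels positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of actual negatives that the model labels negative."""
    return true_neg / (true_neg + false_pos)


# Assumed counts, NOT taken from the source: 23 positives + 21 negatives = 44.
ASSUMED_COUNTS = {"true_pos": 17, "false_neg": 6, "true_neg": 20, "false_pos": 1}

if __name__ == "__main__":
    c = ASSUMED_COUNTS
    print(round(sensitivity(c["true_pos"], c["false_neg"]), 3))
    print(round(specificity(c["true_neg"], c["false_pos"]), 3))
```

With a test set this small, one patient changing class moves sensitivity or specificity by several percentage points, which is worth keeping in mind when comparing the three models.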
Convolutional Neural Networks-Based Deep Learning for Diabetic Retinopathy Detection
Authors : Nurmalasari, Mieke; Kurniawati, Anastasia Cyntia Dewi; Herwanto, Agus; Kurniawati, Dyah; Muchlis, Husni Abdul; Pertiwi, Tria Saras
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.38631

Abstract

Diabetic retinopathy (DR) is a major complication of diabetes that can cause permanent vision loss, affecting about 35% of people with type 2 diabetes worldwide. Early detection is crucial, yet manual screening is time-consuming and depends on expert assessment, and existing diagnostic models often struggle with class imbalance and limited generalizability across diverse real-world datasets. This study develops an automated DR diagnostic system using deep learning to classify fundus images by severity. The model uses an EfficientNetB3 CNN pretrained on ImageNet, combined with CLAHE preprocessing to enhance image contrast. The preprocessing steps include resizing, CLAHE, normalization, and data augmentation (±20° rotation, horizontal flipping, and ZCA whitening). The dataset is the Gaussian-filtered APTOS 2019 set, consisting of 2,750 images across five DR levels (0–4). The model achieved 95% training accuracy and 75% validation accuracy, with overfitting observed after epoch 14. While training performance was high, evaluation metrics (Precision, Recall, F1-Score, and AUC) indicate the need for early stopping or regularization to improve generalization. Overall, CNN-based deep learning can effectively automate DR detection, though further optimization is required for better performance on unseen data. Clinically, this automated pipeline offers a reliable decision-support tool to prioritize high-risk patients for immediate ophthalmological review.
Classification of Wild Edible Plants Using InceptionV3 with Transfer Learning and Metadata Integration as a Decision Support System
Authors : Fauzi, Ridho Nur; Naibaho, Julius Panda Putra; De Kweldju, Alex
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.39091

Abstract

Deep learning has advanced intelligent systems for plant identification; however, distinguishing edible wild plants remains challenging due to limited datasets and the need for contextual information beyond visual classification. This study develops a Convolutional Neural Network (CNN) framework that integrates metadata as a decision support system to enhance food safety and strengthen community-based food security. A dataset of 16,076 images across 34 classes of edible wild plants was collected and enriched with metadata containing plant descriptions, consumption status, and nutritional values. The dataset was split into 75% training, 20% validation, and 5% testing to ensure reliable evaluation. The proposed solution employs InceptionV3 with transfer learning as the primary model, chosen for its ability to capture complex visual features in limited datasets, while MobileNetV3-Large serves as a lightweight comparative architecture. Results show that InceptionV3 achieved superior performance with a test accuracy of 0.87 and F1-score of 0.88, whereas MobileNetV3-Large obtained only 0.03 accuracy, close to the 1/34 ≈ 0.03 expected from uniform random guessing over 34 classes, indicating poor generalization. This highlights the importance of selecting architectures with sufficient depth for domains characterized by high visual variability. Metadata integration enhanced the system’s role as a decision support tool, providing contextual information such as edibility status and nutritional content. The novelty of this research lies in combining CNN-based classification with metadata integration, transforming the system into a practical framework for safe consumption decisions. Limitations include the dataset containing only edible plants. Future work should incorporate non-edible classes, evaluate performance under real-world conditions, and explore advanced architectures and explainable AI techniques to improve robustness, transparency, and accessibility.
Hybrid Support Vector Regression-Genetic Algorithm Model for Forecasting Stock Prices
Authors : Albab, Muhammad Ulil; Yoga Siswa, Taghfirul Azhima; Hasudungan, Rofilde
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.39057

Abstract

The stock market is highly volatile, which often leads to significant price fluctuations and increases the risk of financial losses for investors. Stock price prediction is therefore an important tool to support investment decision-making, particularly for PT Aneka Tambang Tbk (ANTM.JK). This study aims to predict ANTM stock prices by applying the Support Vector Regression (SVR) method optimized using a Genetic Algorithm (GA). The data consist of 1,202 historical ANTM stock price records from September 11, 2020 to September 11, 2025, obtained from Investing.com and normalized using Min-Max normalization. The dataset is split into training and testing data using an 80:20 ratio. The SVR model uses the Radial Basis Function (RBF) kernel, while the GA is employed to find the optimal combination of SVR parameters, with main GA parameters including a population size of 50, 30 generations, a crossover rate of 0.8, and a mutation rate of 0.1. Model performance is evaluated by comparing SVR without optimization against GA-optimized SVR using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). The experimental results indicate that applying the GA improves the predictive performance of the model. The SVR model without optimization produces RMSE, MAE, and MAPE values of 85.48, 59.02, and 2.62%, respectively. After parameter optimization using GA, the errors fall to an RMSE of 75.97, an MAE of 52.42, and a MAPE of 2.42%.
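The Min-Max normalization and the three error metrics named in this abstract have standard definitions, sketched below in plain Python. The function names are illustrative and the code is not taken from the study; MAPE is returned as a percentage and assumes no actual value is zero.

```python
import math


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: no range to scale
    return [(v - lo) / (hi - lo) for v in values]


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

RMSE and MAE are in the units of the price series, while MAPE is scale-free. Read this way, the reported drop in RMSE from 85.48 to 75.97 is a relative reduction of (85.48 - 75.97) / 85.48, or about 11%.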
Evaluating Single and Hybrid Feature Selection for Rainfall Prediction Using XGBoost
Authors : Widoyono, Bambang; Nadhif, Muhammad Fahmy; Eryadi, Ridha Adjie
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.39110

Abstract

Rainfall prediction is challenging due to the complex and nonlinear nature of meteorological data. Previous studies using XGBoost with feature selection have demonstrated superior performance compared to other models, but evaluations have focused solely on error metrics (RMSE, MSE, MAE). Recent research suggests that predictive models should also be evaluated for generalization, stability, interpretability, and computational efficiency to ensure their reliability. To close this gap, this study uses 8,750 hourly records obtained from Open-Meteo with 81 engineered features to evaluate XGBoost under three scenarios: without feature selection, single feature selection (MI, Boruta, SHAP, mRMR, ReliefF), and hybrid feature selection. Our findings demonstrate that feature selection does not always increase accuracy. It does, however, increase interpretability, decrease overfitting, and improve computational efficiency. SHAP provides the most reliable performance among single methods, achieving a lower RMSE (0.72632) and improved stability. Hybrid feature selection produces the most balanced results, with a performance gap of 0.01325 and a variance of 0.03315, while reducing the feature set to 35 variables. Theoretically, this study shows the value of multidimensional evaluation that goes beyond error metrics. In practical terms, it suggests a feature selection method for rainfall prediction systems that is effective, reliable, and simple to understand.
Development of an Early Warning System for Predicting Student Academic Failure Using PSO-Based Machine Learning
Authors : Rezkianti, Ni Made Novia; Lestari, Sri
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.39201

Abstract

Student academic failure is a critical issue in higher education, as it affects graduation rates and the overall quality of an institution. Early identification of students at risk is essential to enable timely academic interventions. This study aims to develop a predictive model to identify students at risk of academic failure using machine learning techniques. The dataset was obtained from the UCI Machine Learning Repository and includes students’ demographic, socio-economic, and academic attributes. The study applies Particle Swarm Optimization integrated with Mutual Information (PSO-MI) as a feature selection method and compares the performance of K-Nearest Neighbor (KNN) and Neural Network (NN) classification algorithms. The feature selection process identified 12 relevant features related to students' academic performance and administrative information. Model evaluation was conducted using two validation schemes, split validation with an 80:20 ratio and k-fold cross-validation, and performance was assessed using accuracy, precision, recall, and F1-score. The experimental results show that the Neural Network model with PSO-MI-based feature selection consistently outperformed the KNN model under both validation schemes. In the cross-validation experiment, the Neural Network model achieved an accuracy of 0.91, a precision of 0.91, a recall of 0.89, and an F1-score of 0.90, indicating better performance in identifying students at risk of dropout. These findings demonstrate that integrating PSO-based feature selection with Neural Network classification offers a promising approach to predicting academic failure. The proposed framework can support the development of early warning systems to help educational institutions identify at-risk students and implement timely academic interventions.
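Two pieces of the evaluation described above are easy to make concrete: how k-fold cross-validation partitions the data, and how the F1-score follows from precision and recall. The sketch below uses contiguous, unshuffled folds and hypothetical function names; the abstract does not state the value of k or whether the folds were shuffled or stratified.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Every sample lands in exactly one test fold; fold sizes differ by at most one.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Plugging in the reported precision of 0.91 and recall of 0.89 gives an F1-score that rounds to 0.90, which matches the figure in the abstract.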
A Hybrid Deep Feature Based VGG19 and Support Vector Machine Approach for Durian Leaf Classification
Authors : Santoti, Jennifer Velensia; Devella, Siska
Indonesian Journal of Artificial Intelligence and Data Mining Vol 9, No 1 (2026): March 2026
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI : 10.24014/ijaidm.v9i1.38831

Abstract

Durian leaf classification has remained challenging due to high visual similarity among superior durian varieties and the limited robustness of conventional convolutional neural network models that rely on Softmax classifiers. This study aimed to address this limitation by investigating a deep feature-based classification framework that combined VGG19 as a feature extractor with a Support Vector Machine classifier. The experiments were conducted on a dataset of 1,530 durian leaf images representing four varieties: Bawor, Duri Hitam, Musang King, and Super Tembaga. Four experimental scenarios were designed to evaluate classification performance using Support Vector Machine and Softmax classifiers under both imbalanced and balanced data conditions through the application of Synthetic Minority Over-sampling Technique. The research gap addressed in this study lay in the absence of prior investigations that systematically evaluated the integration of VGG19 and Support Vector Machine for durian leaf variety classification under varying data distributions. Experimental results showed that the proposed VGG19–Support Vector Machine framework consistently achieved higher accuracy and more stable performance than Softmax-based models. This study demonstrated that replacing the conventional Softmax classifier with a Support Vector Machine significantly improved classification robustness compared to previous approaches that employed end-to-end convolutional neural network architectures.
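The Synthetic Minority Over-sampling Technique used to balance the classes creates new minority samples by interpolating between an existing sample and one of its nearest minority-class neighbours. The sketch below shows that core step on small feature tuples. It is a simplified illustration with hypothetical names and parameters, not the authors' pipeline, which would more likely rely on an existing library implementation.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic points from a list of minority-class tuples.

    Each new point lies on the segment between a randomly chosen sample and
    one of its k nearest minority neighbours. Needs at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic
```

Because every synthetic point is a convex combination of two real samples, it stays inside the region already covered by the minority class, which is why oversampling of this kind is normally applied to the training split only.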