Contact Name
Jumanto
Contact Email
jumanto@mail.unnes.ac.id
Phone
+628164243462
Journal Mail Official
sji@mail.unnes.ac.id
Editorial Address
Ruang 114 Gedung D2 Lantai 1, Jurusan Ilmu Komputer, Universitas Negeri Semarang, Indonesia
Location
Kota Semarang,
Jawa Tengah
INDONESIA
Scientific Journal of Informatics
ISSN: 2407-7658 | EISSN: 2460-0040 | DOI: https://doi.org/10.15294/sji.vxxix.xxxx
Scientific Journal of Informatics (p-ISSN 2407-7658 | e-ISSN 2460-0040), published by the Department of Computer Science, Universitas Negeri Semarang, is a scientific journal of Information Systems and Information Technology. It publishes scholarly writing on pure and applied research in information systems and information technology, as well as general reviews of developments in related theory, methods, and applied sciences. SJI publishes four issues per calendar year (February, May, August, November).
Articles: 16 Documents
Search results for issue "Vol. 12 No. 2: May 2025": 16 Documents
Mental Health Chatbot Application on Artificial Intelligence (AI) for Student Stress Detection Using Mobile-Based Naïve Bayes Algorithm Mariyana, Ekanata Desi Sagita; Novita, Mega; Nur Latifah Dwi Mutiara Sari
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.24307

Abstract

Purpose: This study aims to design and evaluate a chatbot-based artificial intelligence system to identify stress levels in students using the Naïve Bayes classification method. With increasing mental health concerns among students, early stress detection is considered crucial for timely intervention.

Methods: This study proposes an AI-based chatbot system to detect student stress levels using a comparative approach between the Naïve Bayes and Support Vector Machine (SVM) algorithms. A Kaggle dataset with 15 psychological and academic indicators was preprocessed and balanced using SMOTE. Naïve Bayes showed higher accuracy (90%) than SVM (89%). The trained model was deployed via Flask with Ngrok tunneling and integrated into a Flutter mobile app connected to the Gemini AI API for real-time stress screening. This research offers a practical and scalable solution for early mental health detection in students through intelligent chatbot interaction.

Result: The findings show that the Naïve Bayes model achieves a classification accuracy of 90%, slightly surpassing the SVM model, which records an accuracy of 89%. Evaluation through ROC and AUC metrics supports the reliability of Naïve Bayes in detecting stress levels. The integrated chatbot offers a responsive and engaging platform for preliminary mental health assessments.

Novelty: This research presents a unique contribution by combining AI-driven stress detection with a real-time chatbot interface, offering an accessible and scalable approach to student mental health support. The integration of machine learning models with conversational AI provides an innovative solution for early intervention. Future developments may involve deep learning and more diverse psychological inputs to further improve accuracy and effectiveness.
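The classifier at the core of this pipeline can be sketched from scratch. The sketch below is illustrative, not the paper's implementation: the two synthetic features stand in for the 15 Kaggle indicators, and the SMOTE balancing, Flask deployment, and Flutter/Gemini chatbot layers are omitted.

```python
import numpy as np

class GaussianNB:
    """Minimal Gaussian Naive Bayes: fit per-class feature means/variances,
    then predict the class with the highest log-posterior."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.theta_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes_])
        self.prior_ = np.array([(y == c).mean() for c in self.classes_])
        return self

    def predict(self, X):
        # log P(c) + sum_f log N(x_f | mu_cf, var_cf), independence assumption
        log_post = np.log(self.prior_) + np.stack([
            -0.5 * (np.log(2 * np.pi * v) + (X - m) ** 2 / v).sum(axis=1)
            for m, v in zip(self.theta_, self.var_)
        ], axis=1)
        return self.classes_[np.argmax(log_post, axis=1)]

# Toy stand-in for psychological/academic indicators (hypothetical data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)       # 0 = low stress, 1 = high stress
model = GaussianNB().fit(X, y)
acc = (model.predict(X) == y).mean()
```

In practice one would use scikit-learn's `GaussianNB` and imbalanced-learn's `SMOTE` rather than hand-rolling the classifier; the sketch just makes the per-class density estimation explicit.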
Integration of Random Forest, ADASYN, and SHAP for Diabetes Prediction and Interpretation Aulia, Hozana; Wibowo, Adi; Sutrisno, Sutrisno
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.24314

Abstract

Purpose: Diabetes is a chronic disease with a globally rising prevalence. Early detection of individuals at risk is essential to prevent long-term complications. This study aims to develop a diabetes prediction model that not only achieves high classification accuracy but also provides transparent explanations of the factors influencing its predictions.

Methods: The study utilized the Pima Indians Diabetes Dataset, which contains clinical data from 768 female patients aged over 21. The methodology included data preprocessing (handling of missing values and feature engineering, such as the creation of Age_BMI and Glucose_BMI features), a 70:30 train-test split, class imbalance handling using the ADASYN technique, model development using the Random Forest algorithm with hyperparameter tuning via GridSearchCV, and model interpretability analysis using SHAP.

Result: The proposed model achieved an accuracy of 79.2% and a recall of 85.2% on the test data. SHAP analysis revealed that Glucose, Age_BMI, BMI, and DiabetesPedigreeFunction were the most influential features in predicting diabetes. Furthermore, the SHAP heatmap indicated that individuals aged 30–50 years with obesity were at the highest risk. These findings align with existing medical literature, reinforcing the role of metabolic and age-related factors in diabetes development.

Novelty: This study presents an integrative approach combining class balancing (ADASYN), classification (Random Forest), and model interpretability (SHAP) in a unified framework for diabetes prediction. It emphasizes the importance of transparent model interpretation for healthcare professionals, enabling not only predictive outcomes but also actionable insights into risk factors. The findings support future research opportunities, including the integration of lifestyle variables and external validation using real-world clinical data from diverse populations.
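The distinctive step in this pipeline is ADASYN, which generates more synthetic minority samples in neighbourhoods dominated by the majority class. A numpy-only sketch of that idea follows; it is an assumption-laden illustration on toy data, not the paper's code, and in practice one would use imbalanced-learn's `ADASYN`, scikit-learn's `RandomForestClassifier`, and the `shap` package for the remaining stages.

```python
import numpy as np

def adasyn_sketch(X, y, minority=1, k=5, seed=0):
    """ADASYN-style oversampling: minority points whose neighbourhoods are
    majority-dominated receive proportionally more synthetic samples."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    G = (y != minority).sum() - len(X_min)        # synthetics needed to balance
    # k nearest neighbours of each minority point among ALL samples
    d_all = np.linalg.norm(X_min[:, None] - X[None], axis=2)
    nn_all = np.argsort(d_all, axis=1)[:, 1:k + 1]
    r = (y[nn_all] != minority).mean(axis=1)      # local majority density
    r = r / r.sum() if r.sum() > 0 else np.full(len(X_min), 1 / len(X_min))
    g = np.round(r * G).astype(int)               # per-point synthetic counts
    # minority-only neighbours used for interpolation
    d_min = np.linalg.norm(X_min[:, None] - X_min[None], axis=2)
    nn_min = np.argsort(d_min, axis=1)[:, 1:k + 1]
    synth = []
    for i, gi in enumerate(g):
        for _ in range(gi):
            z = X_min[rng.choice(nn_min[i])]      # random minority neighbour
            synth.append(X_min[i] + rng.random() * (z - X_min[i]))
    if synth:
        X = np.vstack([X, synth])
        y = np.concatenate([y, np.full(len(synth), minority)])
    return X, y

# Hypothetical imbalanced data: 90 non-diabetic vs 10 diabetic samples
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (90, 2)), rng.normal(2, 1, (10, 2))])
y = np.array([0] * 90 + [1] * 10)
X_bal, y_bal = adasyn_sketch(X, y)
```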
Optimizing LSTM-CNN for Lightweight and Accurate DDoS Detection in SDN Environments Kartadie, Rikie; Kusjani, Adi; Kusnanto, Yudhi; Harnaningrum, Lucia Nugraheni
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.24531

Abstract

Purpose: This study optimizes the LSTM-CNN model to detect Distributed Denial of Service (DDoS) attacks in Software-Defined Networking (SDN)-based networks, improving accuracy, computational efficiency, and class imbalance handling.

Methods: We developed an Improved LSTM-CNN by removing the Conv1D layer, reducing the LSTM to 64 units, and using 21 features with a 5-timestep approach. The InSDN dataset (50,000 samples) was preprocessed with one-hot encoding, MinMaxScaler normalization, and sequence formation. Class imbalance was managed using class weights (0: 2.0, 1: 0.5) instead of SMOTE, with performance compared against Baseline LSTM-CNN and Dense-only models optimized with the Sine Cosine Algorithm (SCA).

Result: The Improved LSTM-CNN achieved 0.99 accuracy, a 0.93 F1-score for benign traffic, and 1.00 for malicious traffic, with ~25,000 parameters and 125 ms inference time on Google Colab. It outperformed the Baseline LSTM-CNN (0.08 accuracy) and was more efficient than the Dense-only model (46,000 parameters), with a false positive rate of ~1%.

Novelty: This research presents a lightweight, efficient DDoS detection solution for SDN, leveraging temporal modeling and class weights, suitable for resource-constrained controllers such as OpenDaylight or ONOS. However, its generalization is limited by dataset diversity, necessitating broader validation.
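Two preprocessing details carry most of the method: the 5-timestep sequence formation and the class weights used in place of SMOTE. A minimal numpy sketch of both is shown below; the random data stand in for the 21 normalized InSDN features, and the Keras/TensorFlow model that would consume these sequences (the 64-unit LSTM) is omitted.

```python
import numpy as np

def make_sequences(X, y, timesteps=5):
    """Slide a fixed window over the flow features so the LSTM sees short
    temporal context; each window takes the label of its last flow."""
    seqs = np.stack([X[i:i + timesteps] for i in range(len(X) - timesteps + 1)])
    labels = y[timesteps - 1:]
    return seqs, labels

def sample_weights(y, class_weight={0: 2.0, 1: 0.5}):
    """Per-sample weights for a weighted loss: up-weight the rarer benign
    class instead of oversampling with SMOTE."""
    return np.array([class_weight[int(c)] for c in y])

# Hypothetical stand-in for 21 MinMax-scaled flow features
rng = np.random.default_rng(0)
X = rng.random((1000, 21))
y = rng.integers(0, 2, 1000)            # 0 = benign, 1 = malicious
seq_X, seq_y = make_sequences(X, y, timesteps=5)
w = sample_weights(seq_y)
```

In Keras the same effect is obtained by passing `class_weight={0: 2.0, 1: 0.5}` to `model.fit`; the explicit per-sample array above makes the mechanism visible.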
Evaluation of Ridge Classifier and Logistic Regression for Customer Churn Prediction on Imbalanced Telecommunication Data Rofik, Rofik; Unjung, Jumanto
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.24620

Abstract

Purpose: Customer churn is a crucial issue for companies, especially those in the telecommunications sector, as it directly impacts revenue and new-customer acquisition costs. The purpose of this research is to build a customer churn prediction model by comparing the performance of the Logistic Regression algorithm and the Ridge Classifier, considering the effect of data balancing.

Methods: This study developed a churn classification model by comparing the Logistic Regression and Ridge Classifier algorithms in three scenarios: without data balancing, balancing using SMOTE, and balancing using GAN. The dataset used was Telco Customer Churn from Kaggle. Model evaluation was performed using a confusion matrix with accuracy, precision, recall, and F1-score metrics, with a primary focus on accuracy.

Result: The results show that data balancing using SMOTE and GAN does not improve model accuracy. The highest accuracy was achieved by the Ridge Classifier without data balancing, at 82.47%, followed by Logistic Regression at 82.25%. However, the recall and F1-score metrics improved when using SMOTE. The highest recall was achieved by the Ridge Classifier at 75.34% and Logistic Regression at 75.07% in the SMOTE 50:50 scenario. The highest F1-scores were likewise achieved by the Ridge Classifier, at 64.76%, and Logistic Regression, at 64.68%, in the SMOTE 50:30 scenario. Meanwhile, precision tends to decrease after data balancing.

Novelty: The uniqueness of this study lies in comparing the performance of the Ridge Classifier and Logistic Regression under data balancing scenarios using SMOTE and GAN, which has not been widely discussed in previous studies. The main findings show that the highest accuracy is achieved when the Ridge Classifier uses the original data, without SMOTE or GAN balancing. However, balancing with SMOTE has been proven to significantly improve the recall and F1-score metrics.
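The two compared models differ in how they fit the decision boundary: a Ridge Classifier does L2-penalized least-squares regression onto {-1, +1} targets and thresholds the score at zero, while logistic regression minimizes cross-entropy. The from-scratch sketch below makes that contrast concrete on hypothetical two-feature data; the study itself used scikit-learn's `RidgeClassifier` and `LogisticRegression` on the Telco Customer Churn dataset.

```python
import numpy as np

def ridge_classifier_fit(X, y, alpha=1.0):
    """Ridge Classifier: closed-form ridge regression onto {-1,+1} targets."""
    t = np.where(y == 1, 1.0, -1.0)
    Xb = np.hstack([X, np.ones((len(X), 1))])            # bias column
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ t)

def logreg_fit(X, y, lr=0.1, epochs=500):
    """Logistic regression via batch gradient descent on cross-entropy."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-Xb @ w))                    # sigmoid probabilities
        w -= lr * Xb.T @ (p - y) / len(y)                # gradient step
    return w

def predict(w, X):
    """Both models decide by the sign of the linear score."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(int)

# Hypothetical churn-like data: two informative features, two classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (200, 2)), rng.normal(1, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
acc_ridge = (predict(ridge_classifier_fit(X, y), X) == y).mean()
acc_logit = (predict(logreg_fit(X, y), X) == y).mean()
```

On well-separated data the two fits land close together, mirroring the narrow accuracy gap (82.47% vs 82.25%) reported in the study.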
Freshwater Filling Optimization Based on Price Using XGBoost and Particle Swarm Optimization on Cargo Ship Voyage Yulianto, Ilham; Fauzi, Muhammad Dzulfikar; Safitri, Pima Hani
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.24988

Abstract

Purpose: Efficient freshwater management is critical in cargo ship operations, yet current practices often rely on fixed refilling strategies that ignore price differences across ports and fail to predict actual consumption accurately. These inefficiencies lead to unnecessary operational costs. To address this, the study introduces a combined approach that uses XGBoost to predict freshwater usage and Particle Swarm Optimization (PSO) to minimize refilling costs through optimal port selection.

Methods: Freshwater demand was predicted using an XGBoost regression model trained on real operational data from 2024, which included historical voyage distances and freshwater consumption records from cargo ships. Based on these predictions, Particle Swarm Optimization (PSO) was applied to identify cost-efficient refilling locations along each ship's route, minimizing total water procurement cost while satisfying operational constraints. The proposed framework was validated through simulated voyage scenarios to evaluate its impact on cost efficiency and planning effectiveness.

Result: The integration of XGBoost and PSO effectively optimized freshwater refilling strategies, achieving a relative prediction error of 9.48% in freshwater consumption prediction and cost savings of 9–40% across a sample of three ships through strategic port selection based on consumption patterns and price variability.

Novelty: Unlike prior works focused on fuel or generic logistics optimization, this study combines XGBoost and PSO to optimize freshwater refilling on cargo ship voyages using actual operational data. The results demonstrate practical, scalable improvements in cost efficiency, making a novel contribution to maritime resource planning.
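The optimization half of the pipeline can be sketched with a plain global-best PSO. The scenario below is entirely hypothetical (the port prices, the 100 t demand, and the shortfall penalty are invented stand-ins for the paper's operational constraints), and the XGBoost demand forecast is replaced by a fixed demand figure.

```python
import numpy as np

def pso(cost, dim, n=30, iters=200, lo=0.0, hi=100.0, seed=0):
    """Global-best PSO: inertia + cognitive + social velocity update."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pcost = np.apply_along_axis(cost, 1, x)
    g = pbest[np.argmin(pcost)]
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        c = np.apply_along_axis(cost, 1, x)
        better = c < pcost
        pbest[better], pcost[better] = x[better], c[better]
        g = pbest[np.argmin(pcost)]
    return g, cost(g)

# Hypothetical voyage: 4 ports with different freshwater prices; the ship
# must take on 100 t in total, and any shortfall is heavily penalised.
price = np.array([3.0, 1.2, 2.5, 1.8])          # cost per tonne at each port
demand = 100.0

def refill_cost(x):
    return price @ x + 50.0 * max(0.0, demand - x.sum())

best, best_cost = pso(refill_cost, dim=4)
```

With these prices the optimum is to take the full 100 t at the cheapest port (cost 120), and the swarm converges close to that; real port-call constraints would add bounds and feasibility penalties to `refill_cost`.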
The Empirical Best Linear Unbiased Prediction and The Empirical Best Predictor Unit-Level Approaches in Estimating Per Capita Expenditure at the Subdistrict Level Fauziah, Ghina; Kurnia, Anang; Djuraidah, Anik
Scientific Journal of Informatics Vol. 12 No. 2: May 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i2.25037

Abstract

Purpose: This study aims to estimate and evaluate per capita expenditure at the subdistrict level in Garut Regency by employing unit-level Small Area Estimation (SAE) techniques, specifically the Empirical Best Linear Unbiased Predictor (EBLUP) and Empirical Best Predictor (EBP) methods.

Methods: The data used in this study are socio-economic data, specifically per capita household expenditure in Garut Regency. Socio-economic data are generally positively skewed rather than normally distributed, so a method that brings the data closer to the normal distribution is needed, for example a log-normal transformation. To improve on EBLUP, which may yield inefficient estimators when the normality assumption is violated, this study proposes the Empirical Best Predictor (EBP) method. It handles positively skewed data by applying a log-normal transformation to the sample data so that it more closely conforms to the desired distribution.

Result: The EBP results are more stable than those of EBLUP, since EBLUP is highly sensitive to outliers and, when the normality assumption is violated, produces a large mean squared error and inefficient estimators. Evaluating the estimates with both EBLUP and EBP shows Relative Root Mean Squared Error (RRMSE) values above 25%, especially in the subdistricts of Pamulihan, Sukaresmi, and Kersamanah. This is probably because the household samples taken in these three subdistricts are small compared to the others.

Novelty: In this research, we use EBP to improve on EBLUP, which produces inefficient estimators when the normality assumption is violated.
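The core idea, skewed expenditure handled on the log scale and small-area means shrunk toward a synthetic overall mean, can be sketched in a few lines. This is a simplified illustration on simulated data, not the paper's unit-level model: the "subdistricts" and their sample sizes are invented, covariates are replaced by a single grand mean, and the variance components are estimated crudely by the method of moments.

```python
import numpy as np

# Simulated per capita expenditure for four hypothetical subdistricts with
# very different sample sizes (positively skewed, as in the study).
rng = np.random.default_rng(0)
n_i = np.array([4, 8, 30, 60])
areas = [rng.lognormal(mean=8.0, sigma=0.5, size=n) for n in n_i]

# Work on the log scale so the data are approximately normal.
logs = [np.log(a) for a in areas]
sigma2_e = np.mean([l.var(ddof=1) for l in logs])    # within-area variance
grand = np.mean(np.concatenate(logs))                # synthetic overall mean
sigma2_u = max(np.var([l.mean() for l in logs], ddof=1)
               - sigma2_e * np.mean(1 / n_i), 1e-6)  # between-area variance

est = []
for l, n in zip(logs, n_i):
    gamma = sigma2_u / (sigma2_u + sigma2_e / n)     # shrinkage weight
    mu = gamma * l.mean() + (1 - gamma) * grand      # EBLUP-style predictor
    est.append(np.exp(mu + sigma2_e / 2))            # log-normal back-transform
est = np.array(est)
```

The shrinkage weight `gamma` grows with the area sample size, so small subdistricts (like the three flagged with RRMSE above 25%) borrow more strength from the overall mean, and the `exp(mu + sigma^2/2)` back-transform supplies the log-normal bias correction that distinguishes the EBP-style estimate from naively exponentiating the log-scale mean.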

Page 2 of 2 | Total Records: 16