Found 35 Documents

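Analisis Komparatif Model Regresi Machine Learning untuk Prediksi Prestasi Akademik Siswa dengan Optimasi Hyperparameter [Comparative Analysis of Machine Learning Regression Models for Predicting Student Academic Achievement with Hyperparameter Optimization]
Hose, Fernando; Robet, Robet; Hendri, Hendri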
JURNAL RISET KOMPUTER (JURIKOM) Vol. 12 No. 6 (2025): December 2025
Publisher : Universitas Budi Darma

DOI: 10.30865/jurikom.v12i6.9240

Abstract

Low accuracy in the early identification of at-risk students often hinders timely academic intervention. This study analyzes and compares seven machine learning algorithms for predicting student academic achievement, aiming to provide a foundation for a reliable early warning model. The dataset includes 2,392 students with 15 features covering demographics, learning behavior, and environmental support. Models were trained with GridSearchCV hyperparameter optimization combined with stratified cross-validation to mitigate overfitting. Performance was evaluated using MAE, RMSE, and R². The results show that CatBoost performed best (R² = 0.774; RMSE = 0.581; MAE = 0.306), followed by LightGBM (R² = 0.771) and Gradient Boosting (R² = 0.767), while MLP showed the lowest performance. Feature importance analysis placed GPA as the dominant predictor, followed by absenteeism and weekly study time. These findings affirm the superiority of boosting-based models in capturing complex nonlinear relationships and provide a practical framework for educational institutions to build data-driven early warning systems.
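For readers unfamiliar with the three metrics named in this abstract, the following is a minimal sketch of how MAE, RMSE, and R² are commonly defined. It is not the authors' code; the function name and the sample values are hypothetical and chosen only for illustration.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    # Mean absolute error: average size of the error, ignoring sign.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error: penalizes large errors more strongly.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # R^2: share of the variance in y_true explained by the predictions.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Hypothetical target and predicted values (not taken from the study).
y_true = [3.0, 2.0, 4.0, 1.0]
y_pred = [2.5, 2.0, 3.5, 1.5]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

Lower MAE and RMSE and higher R² indicate a better fit, which is the sense in which the abstract ranks CatBoost first.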
Application of Bagging and Boosting Methods for Heart Disease Classification
Parapak, Yehezkiel E.A; Robet, Robet; Hendrik, Jackri
Journal of Applied Computer Science and Technology Vol. 6 No. 2 (2025): December 2025
Publisher : Indonesian Society of Applied Science

DOI: 10.52158/we9asn06

Abstract

Cardiovascular disease remains a primary contributor to global mortality, underscoring the urgent need for accurate and early diagnostic tools. This study aims to develop a robust classification model for heart disease by conducting a comparative analysis of six ensemble machine learning algorithms, comprising three from the Bagging family (Random Forest, Bagged Decision Tree, Extra Trees) and three from the Boosting family (AdaBoost, Gradient Boosting, XGBoost). The research utilizes the publicly available UCI Cleveland Heart Disease dataset, which exhibits a mild class imbalance. To address this, the Synthetic Minority Over-sampling Technique (SMOTE) was strategically applied to the training data. The performance of each model was rigorously evaluated using accuracy, precision, recall, and F1-score. Experimental results revealed that the Extra Trees algorithm, when combined with SMOTE, achieved the highest overall performance with 90% accuracy, 96% precision, 82% recall, and an 88% F1-score. The primary contribution of this work lies in its comprehensive analysis demonstrating that the randomization strategy of Extra Trees provides a superior and more reliable framework for this classification task compared to other common ensemble techniques, particularly after data balancing. These findings confirm that an integrated approach of ensemble learning and proper data balancing can significantly enhance the development of fair and effective diagnostic tools to support medical professionals.
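The F1-score reported in this abstract can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below assumes the standard binary F1 definition; the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard binary F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for Extra Trees with SMOTE.
extra_trees_f1 = f1_score(0.96, 0.82)
print(round(extra_trees_f1, 2))
```

With the rounded inputs from the abstract, the result rounds to 0.88, consistent with the reported 88% F1-score.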
Comprehensive Comparison of TF-IDF and Word2Vec in Product Sentiment Classification Using Machine Learning Models
Sinaga, Asra Gretya; Robet, Robet; Pribadi, Octara
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.11582

Abstract

Sentiment analysis supports data-driven decisions by turning product reviews into reliable polarity labels. We compare four text representations (TF-IDF, TF-IDF reduced via SVD, Word2Vec trained from scratch, and a hybrid of TF-IDF (SVD-300) and Word2Vec) for sentiment classification of Indonesian Shopee product reviews from Kaggle (~2.5k texts). After normalization (with optional emoji handling and Indonesian stemming), ratings are mapped to binary sentiment (≤2 negative, ≥4 positive; 3 discarded). Each representation is evaluated with Logistic Regression, Support Vector Machines (linear/RBF), Naive Bayes, and Random Forest under stratified 5-fold cross-validation. TF-IDF with Logistic Regression (C=1.0) yields the best results (F1-macro = 0.816 ± 0.026; Accuracy = 0.816 ± 0.026), with LinearSVC as a strong runner-up. Word2Vec (scratch) performs lower, consistent with limited data being insufficient to learn stable embeddings, while the hybrid representation offers only modest gains over Word2Vec and does not surpass TF-IDF. These findings indicate that TF-IDF is the most reliable and consistent representation for small, short-text review datasets, and they underscore the impact of feature design on downstream classification performance.
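The rating-to-label rule stated in this abstract can be sketched as follows. The function name and the sample reviews are hypothetical; only the thresholds (≤2 negative, ≥4 positive, 3 discarded) come from the abstract.

```python
def rating_to_sentiment(rating):
    """Map a 1-5 star rating to a binary label; None means the review is dropped."""
    if rating <= 2:
        return "negative"
    if rating >= 4:
        return "positive"
    return None  # a rating of 3 is treated as neutral and discarded

# Hypothetical (text, rating) pairs, for illustration only.
reviews = [("barang bagus", 5), ("biasa saja", 3), ("kecewa", 1)]

labeled = []
for text, rating in reviews:
    label = rating_to_sentiment(rating)
    if label is not None:
        labeled.append((text, label))

print(labeled)
```

Dropping the middle rating removes ambiguous reviews at the cost of a smaller dataset, which matters for a corpus of only about 2.5k texts.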
Penerapan Algoritma Transformer dalam Aplikasi Parafrase Teks Otomatis [Application of the Transformer Algorithm in an Automatic Text Paraphrasing Application]
Robet, Robet; Kohsasih, Kelvin Leonardi; Darwin, Jenime
TAMIKA: Jurnal Tugas Akhir Manajemen Informatika & Komputerisasi Akuntansi Vol 5 No 1 (2025)
Publisher : Universitas Methodist Indonesia

DOI: 10.46880/tamika.Vol5No1.pp103-109

Abstract

The development of Natural Language Processing (NLP) technology has enabled the creation of automated text manipulation applications, one of which is text paraphrasing. This study aims to implement a Transformer architecture with a focus on Indonesian text for automatic text paraphrasing applications. The model used is a pre-trained Text-to-Text Transfer Transformer (T5) fine-tuned on an Indonesian text corpus, referred to as the Indo-T5 model. During training, the model learns language structure and context in order to generate paraphrases that are not only grammatically correct but also preserve the original meaning. Evaluation was conducted using BLEU and ROUGE metrics to measure the similarity between the generated paraphrases and manual references. The evaluation results show that the model produces coherent, relevant paraphrases with a good level of lexical variation, achieving a BLEU score of 50.1 and a ROUGE-L score of 61.7. Thus, this study demonstrates that Transformer-based models can be effectively applied to the task of text paraphrasing in Indonesian.
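To illustrate what ROUGE-L measures, here is a minimal sketch based on the longest common subsequence (LCS) of tokens. It assumes whitespace tokenization and the balanced F-measure; the scoring tool used in the study may differ in tokenization, stemming, or weighting. The example sentences are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(candidate, reference):
    """ROUGE-L as the F1 of LCS-based precision and recall (assumed variant)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical reference and paraphrase, not taken from the study's data.
reference = "saya pergi ke pasar pagi ini"
candidate = "pagi ini saya pergi ke pasar"
score = rouge_l_f1(candidate, reference)
print(round(score, 3))
```

Because the LCS respects word order, a paraphrase that reorders phrases scores below 1.0 even when it reuses every reference word, which is why ROUGE-L is usually read alongside BLEU rather than alone.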
Aplikasi Deteksi Usia Berbasis Citra Menggunakan Model Deep Learning dengan Arsitektur CNN [Image-Based Age Detection Application Using a Deep Learning Model with CNN Architecture]
Robet, Robet; Chandra, Chandra; Setiawan, Jerico
TAMIKA: Jurnal Tugas Akhir Manajemen Informatika & Komputerisasi Akuntansi Vol 5 No 1 (2025)
Publisher : Universitas Methodist Indonesia

DOI: 10.46880/tamika.Vol5No1.pp97-102

Abstract

This research aims to design and implement an age detection application based on facial images using a deep learning approach with a Convolutional Neural Network (CNN) architecture. The model is built to recognize and extract facial features in order to estimate an individual’s age automatically. Facial image datasets were obtained from public sources and enhanced through augmentation techniques such as rotation, flipping, and lighting adjustment to increase data variability. The training process involved splitting the data into training, validation, and testing sets. The model was evaluated using accuracy, precision, recall, and F1-score metrics. The gender detection system achieved an accuracy of 82.99% with a precision of 80.95% for males and 84.47% for females. Recall scores were 85.15% for males and 80.12% for females. For age detection, precision, recall, and F1-score varied across different age groups. Overall, the model demonstrates exemplary performance in age prediction, though it still faces challenges in distinguishing closely spaced age categories.
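The augmentation steps named in this abstract (rotation, flipping, lighting adjustment) can be sketched on a tiny grayscale image stored as a list of rows. This is an illustration only: the function names are hypothetical, the rotation is fixed at 90 degrees for simplicity, and the study's actual pipeline and parameters are not stated in the abstract.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def rotate_90_clockwise(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_brightness(img, factor):
    """Scale pixel intensities and clamp them to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

# Hypothetical 2x2 grayscale image (pixel values 0-255).
image = [[0, 50],
         [100, 200]]

flipped = flip_horizontal(image)
rotated = rotate_90_clockwise(image)
brighter = adjust_brightness(image, 1.5)
print(flipped, rotated, brighter)
```

Each transform yields a new training sample with the same label, which is how augmentation increases data variability without collecting new images.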