Articles

Found 3 Documents

Performance Evaluation of Random Forest, XGBoost, and LightGBM in the Early Diagnosis of Diabetes Mellitus
Hendra, Hendra Kurniawan; Asmaul Dwi Akbar; Nicholas Svensons; Yandi Jaya Antonio; Karnila, Sri; Safitri, Egi; Nurjoko, Nurjoko
JUPITER (Jurnal Penelitian Ilmu dan Teknologi Komputer) Vol 17 No 2 (2025): Jurnal Penelitian Ilmu dan Teknologi Komputer (JUPITER)
Publisher : Teknik Komputer Politeknik Negeri Sriwijaya

Abstract

Diabetes mellitus is a long-term condition marked by elevated blood sugar levels, which can lead to serious complications such as heart disease, kidney failure, and vision impairment. Early detection plays a vital role in minimizing these risks and enhancing patients' quality of life. This research focuses on assessing the performance of three machine learning algorithms—Random Forest, XGBoost, and LightGBM—in predicting diabetes risk. The dataset utilized originates from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), comprising 768 samples with 9 key features. The research methodology involves multiple stages, including data collection, preprocessing, addressing data imbalance using SMOTE, data splitting for training and testing, algorithm implementation, and model evaluation through accuracy, precision, recall, F1-score, and Area Under the Curve (AUC) metrics. Findings reveal that Random Forest delivers the highest performance with an AUC score of 86%, followed by XGBoost (83%) and LightGBM (82%). With its strong accuracy, this model holds potential as a valuable tool for early diabetes diagnosis, contributing to faster and more precise medical decision-making.
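
The AUC metric reported in this abstract can be read as the probability that a randomly chosen positive (diabetic) sample receives a higher predicted score than a randomly chosen negative one. The following is a minimal Python sketch of that pairwise definition. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' code or data.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not taken from the paper)
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

The double loop is quadratic in the number of samples, which is fine for a dataset of a few hundred rows; a rank-based formulation would scale better for larger data.
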
Clustering Sales Data Using the K-Means Algorithm with RapidMiner
Panjaitan, Tiodora Priska; Asmaul Dwi Akbar; Sabrina Nur Rahmah; Stefani Cinthia Ernadi; Mochammad Akmal Fatoni; Fatkhul Inayah; Uli Vicilia Sitorus
Journal of Data Science Methods and Applications Vol. 1 No. 1 (2025)
Publisher : Program Studi Sains Data - Institut Informatika dan Bisnis Darmajaya

Abstract

This research aims to identify the optimal number of clusters in the dataset using the K-Means algorithm and the Elbow method in RapidMiner. K-Means is used to cluster the data, and the Elbow method is used to determine the optimal number of clusters. The K-Means results yielded an optimal number of clusters. Processing the test data with k = 2 to 5 clusters showed that cluster 2 had more domestic chicken egg sales than cluster 1, with 41 purchases.
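
The study itself was carried out in RapidMiner, so the following Python sketch is only an illustration of the same idea: run K-Means for k = 2 to 5 and record the within-cluster sum of squares (WCSS) that the Elbow method inspects. The helper names, the seed, and the toy points are assumptions made for this example and do not come from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, WCSS)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            mean_point(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wcss


# Hypothetical 2-D toy data (not the sales data from the paper)
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (2, 9), (1, 9)]
wcss_by_k = {k: kmeans(points, k)[1] for k in range(2, 6)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 2))
```

Plotting WCSS against k and looking for the point where the curve flattens gives the "elbow"; because K-Means depends on its starting centroids, running several seeds per k is a common precaution.
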
Predicting Heart Disease (Cardiovascular Diseases) Diagnosis Using Machine Learning Algorithms
Rini Nurlistiani; Mia Sabina; Asmaul Dwi Akbar
Journal of Data Science Methods and Applications Vol. 1 No. 1 (2025)
Publisher : Program Studi Sains Data - Institut Informatika dan Bisnis Darmajaya

Abstract

Heart disease remains a global health concern, being the leading cause of mortality with substantial impacts on the population. This research addresses the challenges in early detection and prediction of heart diseases, considering the complex and diverse nature of Cardiovascular Diseases (CVD). With limitations in diagnostic tools and healthcare resources, the study explores the application of machine learning algorithms for accurate predictions. Building upon previous research, various machine learning algorithms, including Random Forest, Multilayer Perceptron, Gaussian Processes, and M5P, were employed to predict heart disease-related data. The research involved comprehensive data pre-processing, visualization, model fitting, and evaluation stages. The dataset, sourced from the Hungarian Institute of Cardiology, comprised 14 attributes. Results demonstrated the effectiveness of the selected machine learning models, with Random Forest exhibiting outstanding performance, closely followed by Multilayer Perceptron. Gaussian Processes performed relatively well, while M5P provided a complex model structure offering additional insights. The use of 10-fold cross-validation enhanced the stability of model evaluation. Statistical analysis and data visualization contributed to a thorough understanding of model performance and dataset characteristics. In conclusion, this research contributes to developing accurate predictive models for heart disease detection. The findings offer valuable insights into algorithm performance and dataset characteristics, guiding future health science and information technology efforts for improved preventive and diagnostic measures. The methodology employed, including machine learning algorithms and cross-validation, presents a robust approach for future research in cardiovascular health prediction.
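
The 10-fold cross-validation mentioned in this abstract splits the data into ten parts and tests on each part once while training on the other nine. The following is a minimal Python sketch of that index split only, assuming a hypothetical sample count of 25; the function name `k_fold_indices` and the seed are illustrative and not taken from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Hypothetical sample count, chosen only to show uneven fold sizes
folds = list(k_fold_indices(25, k=10))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

A model would be fitted on each `train_idx` and scored on the matching `test_idx`, with the ten scores averaged; that averaging is what makes the estimate more stable than a single train/test split.
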