Journal : Malcom: Indonesian Journal of Machine Learning and Computer Science

Exoplanet Classification Through Machine Learning: A Comparative Analysis of Algorithms Using Kepler Data
Author : Airlangga, Gregorius
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 4 No. 3 (2024): MALCOM July 2024
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v4i3.1303

Abstract

This study compares a suite of machine learning (ML) models for classifying exoplanets in Kepler Space Telescope data as confirmed planets, candidates, or false positives. After preprocessing the dataset for quality, completeness, and relevance, we evaluated Decision Tree, Random Forest, Hist Gradient Boosting, CatBoost, AdaBoost, LightGBM, XGBoost, Extra Trees, Logistic Regression, and XGBoost RF on Accuracy, Precision, Recall, and F1 Score. Most models scored 1.0 on most metrics. The Gaussian NB model scored slightly lower, at 0.99, a minor deviation attributed to its probabilistic nature. While these results indicate high accuracy and reliability on this dataset, they also prompt a critical examination of potential overfitting, the dataset's complexity, and the models' generalizability to unseen data.

Comparative Analysis of Machine Learning Models for Chronic Disease Indicator Classification Using U.S. Chronic Disease Indicators Dataset
Author : Airlangga, Gregorius
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 4 No. 3 (2024): MALCOM July 2024
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v4i3.1403

Abstract

The prevalence of chronic diseases poses significant challenges to public health systems worldwide. This study evaluates the performance of four machine learning models—Gradient Boosting Classifier, Support Vector Machine (SVM), Logistic Regression, and Random Forest—in classifying chronic disease indicators using the U.S. Chronic Disease Indicators (CDI) dataset. The models were assessed based on accuracy, precision, recall, F1 score, classification report, and confusion matrix to determine their effectiveness. The Gradient Boosting Classifier outperformed other models with an accuracy of 64.36%, precision of 63.72%, recall of 64.36%, and F1 score of 63.88%. While SVM and Random Forest demonstrated moderate performance, Logistic Regression served as a baseline for comparison. The study highlights the Gradient Boosting Classifier's superiority in handling the complexities of the CDI dataset, suggesting its potential for improving chronic disease prediction and management. Future research should focus on refining these models, addressing class imbalances, and incorporating domain knowledge to enhance interpretability and applicability in real-world scenarios.
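
The reported recall (64.36%) is identical to the reported accuracy, which is the pattern support-weighted averaging produces in multi-class problems. The abstract does not state the averaging mode, so the following is only a minimal sketch under that assumption. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true count per class
    predicted = Counter(y_pred)    # predicted count per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # class share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels for illustration only (not data from the study).
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

With support weights, each class's recall is multiplied by its share of the true labels, so the weighted sum reduces to total correct predictions divided by total samples, which is accuracy.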

Comparative Analysis of Neural Network Architectures for Predicting Chronic Disease Indicators Using CDC’s Chronic Disease Indicators Dataset
Author : Airlangga, Gregorius
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 4 No. 3 (2024): MALCOM July 2024
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v4i3.1406

Abstract

This research evaluates the performance of three machine learning models, namely a Neural Network (NN), a Convolutional Neural Network (CNN), and a Recurrent Neural Network (RNN) using Long Short-Term Memory (LSTM) units, in predicting chronic disease indicators using the CDC's Chronic Disease Indicators (CDI) dataset. The study employs a comprehensive preprocessing pipeline and 5-fold cross-validation to ensure robustness and generalizability of the results. The CNN model outperformed both the NN and RNN models across all key performance metrics, achieving an accuracy of 0.6303, precision of 0.6445, recall of 0.6303, and F1 score of 0.5950. The superior performance of the CNN is attributed to its ability to capture spatial hierarchies and interactions within the structured dataset. The findings underscore the importance of selecting appropriate machine learning architectures based on the data characteristics. This research provides valuable insights for public health officials and policymakers to enhance chronic disease monitoring, early detection, and intervention strategies. Future work will explore hybrid models and advanced techniques to further improve predictive performance. This study highlights the potential of CNNs in public health informatics and sets a foundation for further research in this domain.
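
The 5-fold cross-validation mentioned in this abstract can be illustrated with a small index-splitting sketch. This is not the study's pipeline: the helper name `k_fold_indices`, the shuffle seed, and the sample count are assumptions made for illustration, and the abstract does not say whether the folds were shuffled or stratified.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Hypothetical sample count, chosen so the folds are not all the same size.
splits = list(k_fold_indices(n_samples=23, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each sample lands in the test fold exactly once, so the averaged score uses every row for evaluation while never scoring a model on rows it was trained on.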

Predicting Student Performance Using Deep Learning Models: A Comparative Study of MLP, CNN, BiLSTM, and LSTM with Attention
Author : Airlangga, Gregorius
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 4 No. 4 (2024): MALCOM October 2024
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v4i4.1668

Abstract

This study aims to predict student performance using deep learning models, including Multilayer Perceptron (MLP), Convolutional Neural Networks (CNN), Bidirectional Long Short-Term Memory (BiLSTM), and Long Short-Term Memory with Attention (LSTM with Attention). The dataset comprises student demographic and educational factors, and the models are evaluated using metrics such as MAE, RMSE, R², MSLE, and MAPE. The results show that the CNN model outperforms other models, achieving the highest accuracy in predicting student test scores. The MLP model also performs well, while the BiLSTM and LSTM with Attention models exhibit lower predictive performance. High MAPE values across models suggest a need for alternative metrics in future research. This study highlights the importance of selecting suitable model architectures for predictive tasks in education, emphasizing the effectiveness of convolutional layers in capturing complex patterns.
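
The five regression metrics named in this abstract can be written out directly. The sketch below is illustrative only: the function name `regression_metrics` and the toy scores are hypothetical, and the abstract does not explain why MAPE was high. One common cause, assumed here for the demonstration, is a target value close to zero, which inflates the percentage error even when the absolute error is unchanged.

```python
import math

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, R^2, MSLE, and MAPE for non-negative, non-zero targets."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    msle = sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / n
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"mae": mae, "rmse": rmse, "r2": r2, "msle": msle, "mape": mape}

# Toy scores (not data from the study); every prediction is off by exactly 2.
typical = regression_metrics([80, 90, 70], [82, 88, 72])
near_zero = regression_metrics([80, 90, 1], [82, 88, 3])
print(typical["mae"], typical["mape"])
print(near_zero["mae"], near_zero["mape"])
```

Both cases have the same MAE, yet the second has a far larger MAPE because one target is 1, which is the kind of behaviour that motivates reporting scale-aware alternatives alongside it.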

Spam Detection in YouTube Comments Using Deep Learning Models: A Comparative Study of MLP, CNN, LSTM, BiLSTM, GRU, and Attention Mechanisms
Author : Airlangga, Gregorius
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 4 No. 4 (2024): MALCOM October 2024
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v4i4.1671

Abstract

This study explores the effectiveness of various deep learning models for detecting spam in YouTube comments. Six models were evaluated: Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Gated Recurrent Unit (GRU), and Attention mechanisms. The dataset consists of 1,956 real comments extracted from popular YouTube videos, representing both spam and legitimate messages. The preprocessing phase involved tokenization and padding of text sequences to prepare them for model input. Results reveal that the LSTM model achieved the highest test accuracy of 95.65%, outperforming other models by capturing sequential dependencies and context within comments. The CNN model also demonstrated high accuracy, underscoring the importance of local pattern recognition in text classification. While BiLSTM and Attention models offered comparable performance, their marginal improvement over LSTM indicates that sequential modeling plays a crucial role in this task. The GRU model, despite being computationally efficient, showed slightly lower accuracy compared to LSTM and BiLSTM. The MLP model, serving as a baseline, exhibited limited performance, emphasizing the need for advanced architectures in spam detection. These findings suggest that combining sequential modeling with local feature extraction could lead to more robust spam detection systems. 
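
The tokenization and padding step described in this abstract can be sketched in plain Python. The abstract does not name the tokenizer, the vocabulary size, the sequence length, or the padding side, so whitespace tokenization, the `<pad>` and `<unk>` tokens, right-side padding, and the helper names `build_vocab` and `encode` are all assumptions made for illustration.

```python
def build_vocab(texts):
    """Map each lower-cased whitespace token to an integer id."""
    vocab = {"<pad>": 0, "<unk>": 1}   # reserved ids (assumed convention)
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab, max_len):
    """Convert text to ids, truncate to max_len, then pad on the right."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in text.lower().split()]
    ids = ids[:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

# Toy comments for illustration only (not taken from the dataset).
comments = ["Check out my channel", "Great song"]
vocab = build_vocab(comments)
short = encode("Great song", vocab, max_len=4)
long_unseen = encode("check out my new channel now", vocab, max_len=4)
print(short, long_unseen)
```

Fixed-length integer sequences like these are what embedding layers in MLP, CNN, and recurrent models typically consume; unseen words map to `<unk>` and longer comments are truncated.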

Comparative Analysis of Machine Learning Models for Intrusion Detection in Internet of Things Networks Using the RT-IoT2022 Dataset
Author : Airlangga, Gregorius
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 4 No. 2 (2024): MALCOM April 2024
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v4i2.1304

Abstract

This research investigates the performance of various machine learning models in developing an Intrusion Detection System (IDS) for the complex and evolving security landscape of Internet of Things (IoT) networks. Using the RT-IoT2022 dataset, which captures a diverse array of IoT devices and attack methodologies, we evaluated four prominent models: Gradient Boosting, Random Forest, Logistic Regression, and Multi-Layer Perceptron (MLP). Our results indicate that both Gradient Boosting and Random Forest achieved perfect scores, with an accuracy, precision, recall, and F1 score of 1.00, suggesting a superior ability to classify and predict security incidents within the dataset. Logistic Regression was consistent, scoring 0.96 across all metrics, which suggests a balance between model complexity and performance. The MLP model scored 0.99 on accuracy, precision, recall, and F1 score, close behind the two ensemble models, highlighting its potential in capturing complex, nonlinear data relationships. These findings underscore the critical role of machine learning in fortifying IoT networks against cyber threats and the need for continuous model evaluation against real-world data. The study provides a pathway for future research to refine these IDS models for operational efficiency and sustainability in the dynamic IoT security domain.