Articles

Comparative Analysis of Machine Learning Models for Classifying Human DNA Sequences: Performance Metrics and Strategic Recommendations
Airlangga, Gregorius
Journal of Computer System and Informatics (JoSYC) Vol 5 No 3 (2024): May 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v5i3.5168

Abstract

This study presents a comprehensive evaluation of seven machine learning models applied to the classification of human DNA sequences, highlighting their performance and potential applications in genomics. We explored Logistic Regression, Support Vector Machines (SVM), Random Forest, Decision Trees, Gradient Boosting, Naive Bayes, and XGBoost, using a 5-fold StratifiedKFold cross-validation method to ensure robustness and reliability in our findings. Naive Bayes demonstrated exceptional performance with near-perfect accuracy, precision, recall, and F1 scores, suggesting its suitability for rapid and efficient genomic classification. Logistic Regression also showed high efficacy, proving effective even in multi-class classifications of complex genetic data. Conversely, Decision Trees and SVM struggled with overfitting and computational efficiency, respectively, indicating the need for careful parameter tuning and optimization in practical applications. The study addresses these challenges and proposes strategies for enhancing model robustness and computational efficiency, such as advanced regularization techniques and hybrid modeling approaches. These insights not only aid in selecting appropriate models for specific genomic tasks but also pave the way for future research into integrating machine learning with genomic science to advance personalized medicine and genetic research. The findings encourage ongoing refinement of these models to unlock further potential in genomic applications.
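As an illustration of the evaluation protocol described above, the sketch below runs 5-fold StratifiedKFold cross-validation over two of the seven models. The k-mer bag-of-words encoding and the synthetic sequences are assumptions for illustration; the abstract does not state how sequences were featurized.

```python
# Sketch: 5-fold StratifiedKFold evaluation of Naive Bayes and Logistic
# Regression on DNA sequences. The k-mer encoding and the random sequences
# are assumptions, not the paper's actual data or features.
import random
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import MultinomialNB

def to_kmers(seq, k=6):
    """Split a DNA string into overlapping k-mer 'words'."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

random.seed(0)
sequences = ["".join(random.choice("ACGT") for _ in range(60)) for _ in range(100)]
labels = [i % 2 for i in range(100)]  # placeholder binary labels

X = CountVectorizer().fit_transform(to_kmers(s) for s in sequences)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

for name, model in [("NaiveBayes", MultinomialNB()),
                    ("LogisticRegression", LogisticRegression(max_iter=1000))]:
    scores = cross_validate(model, X, labels, cv=cv, scoring=["accuracy", "f1_macro"])
    print(name, scores["test_accuracy"].mean(), scores["test_f1_macro"].mean())
```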
Comparative Analysis of Deep Learning Architectures for DNA Sequence Classification: Performance Evaluation and Model Insights
Airlangga, Gregorius
Journal of Computer System and Informatics (JoSYC) Vol 5 No 3 (2024): May 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v5i3.5170

Abstract

The classification of DNA sequences using deep learning models offers promising avenues for advancements in genomics and personalized medicine. This study provides a comprehensive evaluation of several deep learning architectures, including Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), Gated Recurrent Units (GRUs), Bidirectional LSTMs (BiLSTMs), and hybrid models combining CNNs with various recurrent networks, to classify human DNA sequences into functional categories. We employed a dataset of approximately 100,000 labeled sequences, ensuring a balanced representation across seven distinct classes to facilitate a fair comparison of model performance. Each model was assessed based on accuracy, precision, recall, F1 score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). The CNN model demonstrated superior accuracy (74.86%) and the highest AUC (94.64%), indicating its effectiveness in capturing spatial patterns within sequences. LSTM and GRU models showed commendable performance, particularly in balancing precision and recall, suggesting their capability in managing sequential dependencies. However, hybrid models did not perform as expected, showing lower overall metrics, which highlighted challenges in model integration and complexity management. The findings suggest that while CNNs excel in feature extraction, sequence-based models like LSTMs and GRUs provide valuable capabilities in capturing long-range dependencies, essential for comprehensive genomic analysis. The study underscores the need for optimized hybrid models and further research into model robustness and generalizability.
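A minimal Keras sketch of the kind of 1D CNN the abstract credits with the best accuracy and AUC, assuming one-hot-encoded sequences and a 7-way softmax head matching the seven reported classes. The sequence length, layer sizes, and encoding are illustrative assumptions; the abstract gives no architecture details beyond the model family.

```python
# Sketch: a 1D CNN over one-hot-encoded DNA with a 7-class softmax output.
# All sizes below are assumptions.
import numpy as np
from tensorflow.keras import layers, models

SEQ_LEN, N_CLASSES = 300, 7  # assumed sequence length; 7 classes per the abstract

def one_hot(seq):
    """Encode a DNA string as a (SEQ_LEN, 4) one-hot matrix over A, C, G, T."""
    table = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((SEQ_LEN, 4), dtype=np.float32)
    for i, base in enumerate(seq[:SEQ_LEN]):
        if base in table:
            x[i, table[base]] = 1.0
    return x

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, 4)),
    layers.Conv1D(64, 8, activation="relu"),
    layers.MaxPooling1D(4),
    layers.Conv1D(128, 8, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
probs = model.predict(np.array([one_hot("ATGC" * 75)]), verbose=0)  # smoke test
```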
A Hybrid Ensemble Approach for Enhanced Fraud Detection: Leveraging Stacking Classifiers to Improve Accuracy in Financial Transaction
Airlangga, Gregorius
Journal of Computer System and Informatics (JoSYC) Vol 5 No 4 (2024): August 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v5i4.5840

Abstract

Fraud detection in financial transactions presents a significant challenge due to the evolving tactics of fraudsters and the inherent imbalance in datasets, where fraudulent activities are rare compared to legitimate transactions. This study proposes a Hybrid Model utilizing a stacking ensemble technique that combines multiple machine learning algorithms, including Random Forest, Gradient Boosting, SVM, LightGBM, and XGBoost, to enhance the accuracy of fraud detection systems. The Hybrid Model is evaluated against traditional machine learning models using a comprehensive cross-validation approach, with results indicating a near-perfect accuracy of 99.99%, outperforming all individual models. The study also examines the trade-offs associated with the Hybrid Model, including increased computational demands and reduced interpretability, highlighting the need for careful consideration when deploying such models in real-world scenarios. Despite these challenges, the Hybrid Model's ability to significantly reduce both false positives and false negatives makes it a powerful tool for financial institutions aiming to mitigate the risks associated with fraudulent activities. In conclusion, the findings demonstrate the effectiveness of hybrid ensemble methods in fraud detection, providing a robust solution that balances the complexities of real-world applications with the need for high accuracy. The research underscores the potential of advanced machine learning techniques in enhancing the security and reliability of financial transactions, offering valuable insights for the development of future fraud detection systems.
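A minimal scikit-learn sketch of the stacking ensemble described above, built from the five named base learners. The logistic-regression meta-learner and all hyperparameters are assumptions, since the abstract does not specify them.

```python
# Sketch: a stacking ensemble with the five base learners named in the
# abstract. Meta-learner and parameters are assumptions.
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
        ("gb", GradientBoostingClassifier(random_state=42)),
        ("svm", SVC(probability=True)),  # probabilities feed the meta-learner
        ("lgbm", LGBMClassifier(random_state=42)),
        ("xgb", XGBClassifier(eval_metric="logloss", random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold predictions train the meta-learner
)
# Usage: stack.fit(X_train, y_train); stack.predict(X_test)
```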
Anemia Classification Using Hybrid Machine Learning Models: A Comparative Study of Ensemble Techniques on CBC Data
Airlangga, Gregorius
Journal of Computer System and Informatics (JoSYC) Vol 5 No 4 (2024): August 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v5i4.5848

Abstract

Anemia is a prevalent and potentially serious medical condition characterized by a deficiency in the number or quality of red blood cells. Accurate classification of anemia types is crucial for ensuring appropriate treatment, as different types of anemia require distinct therapeutic approaches. However, the classification of anemia presents specific challenges due to the complexity of the condition, the variability in CBC data, and the need to differentiate between multiple anemia types that may present with overlapping symptoms. In this study, we explore the application of hybrid machine learning models to classify anemia types using Complete Blood Count (CBC) data. We evaluated the performance of various models, including DecisionTree, RandomForest, XGBoost, LightGBM, CatBoost, and ensemble methods such as Stacking and Voting. The ensemble models, particularly Stacking and Voting, demonstrated superior performance with balanced accuracy reaching 0.9976 and F1 scores of 0.9964, significantly outperforming individual classifiers. These results underscore the efficacy of ensemble techniques in handling the complex and imbalanced datasets commonly encountered in medical diagnostics. Despite their high accuracy, we identified challenges related to model interpretability, computational demands, and data quality. The complexity and resource requirements of these models may limit their practical application in resource-constrained environments. This study provides a comprehensive analysis of the trade-offs between model complexity, accuracy, and interpretability, offering valuable insights for the deployment of machine learning models in clinical settings. Our findings highlight the potential of hybrid models to improve anemia diagnosis, suggesting their integration into healthcare systems could enhance diagnostic accuracy and patient outcomes. Future work will focus on expanding the dataset, refining model interpretability, and addressing ethical considerations in the use of AI in healthcare.
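A minimal sketch of a soft-voting ensemble over the classifiers named above (the abstract also reports a Stacking variant). The 'soft' voting mode and all hyperparameters are assumptions.

```python
# Sketch: a soft-voting ensemble over the classifiers named in the abstract.
# The voting mode and parameters are assumptions.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier

vote = VotingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
        ("xgb", XGBClassifier(eval_metric="mlogloss", random_state=42)),
        ("lgbm", LGBMClassifier(random_state=42)),
        ("cat", CatBoostClassifier(verbose=0, random_state=42)),
    ],
    voting="soft",  # average predicted class probabilities across models
)
# Usage: vote.fit(X_cbc_train, y_train); vote.predict(X_cbc_test)
```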
Comparative Analysis of Path Planning Algorithms for Multi-UAV Systems in Dynamic and Cluttered Environments: A Focus on Efficiency, Smoothness, and Collision Avoidance
Sukwadi, Ronald; Airlangga, Gregorius; Basuki, Widodo Widjaja; Kristian, Yoel; Rahmananta, Radyan; Sugianto, Lai Ferry; Nugroho, Oskar Ika Adi
International Journal of Robotics and Control Systems Vol 4, No 4 (2024)
Publisher : Association for Scientific Computing Electronics and Engineering (ASCEE)

DOI: 10.31763/ijrcs.v4i4.1555

Abstract

This study evaluates the performance of various path planning algorithms for multi-UAV systems in dynamic and cluttered environments, focusing on critical metrics such as path length, path smoothness, collision avoidance, and computational efficiency. We examined several algorithms, including A*, Genetic Algorithm, Modified A*, and Particle Swarm Optimization (PSO), using comprehensive simulations that reflect realistic operational conditions. Key evaluation metrics were quantified using standardized methods, ensuring the reproducibility and clarity of the findings. The A* Path Planner demonstrated efficiency by producing the shortest and smoothest paths, albeit with minor collision avoidance limitations. The Genetic Algorithm emerged as the most robust, balancing path length, smoothness, and collision avoidance, with zero violations and high feasibility. Modified A* also performed well but exhibited slightly less smooth paths. In contrast, algorithms like Cuckoo Search and Artificial Immune System faced significant performance challenges, especially in adapting to dynamic environments. Our findings highlight the superior performance of the Genetic Algorithm and Modified A* under these specific conditions. We also discuss the potential for hybrid approaches that combine the strengths of these algorithms for even better performance. This study's insights are critical for practitioners looking to optimize multi-UAV systems in challenging scenarios.
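Two of the evaluation metrics named above, path length and path smoothness, can be quantified as in the sketch below. Defining smoothness as the sum of absolute heading changes is an assumption; the paper's exact formula is not given in the abstract.

```python
# Sketch: path length and a turning-angle smoothness measure for a 2D
# waypoint path. The smoothness definition is an assumption.
import numpy as np

def path_length(path):
    """Total Euclidean length of an (N, 2) polyline."""
    return float(np.linalg.norm(np.diff(path, axis=0), axis=1).sum())

def path_smoothness(path):
    """Sum of absolute heading changes in radians; lower is smoother."""
    diffs = np.diff(path, axis=0)
    headings = np.arctan2(diffs[:, 1], diffs[:, 0])
    turns = np.diff(headings)
    turns = (turns + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi]
    return float(np.abs(turns).sum())

waypoints = np.array([[0, 0], [1, 0], [2, 1], [3, 1]], dtype=float)
print(path_length(waypoints), path_smoothness(waypoints))
```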
Analysis and Comparison of Machine Learning Techniques for DDoS Attack Classification in Network Environments
Airlangga, Gregorius
Jurnal Informatika Ekonomi Bisnis Vol. 6, No. 1 (March 2024)
Publisher : SAFE-Network

DOI: 10.37034/infeb.v6i1.795

Abstract

This research presents a comparative analysis of machine learning techniques for classifying Distributed Denial of Service (DDoS) attacks within network traffic. We evaluated the performance of three algorithms: Logistic Regression, Decision Tree, and Random Forest, including their scaled-feature counterparts. The study utilized a robust methodology incorporating advanced data preprocessing, feature engineering, and Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. The models were rigorously tested using a cross-validation framework, assessing their accuracy, precision, recall, and F1 score. Results indicated that the Random Forest algorithm outperformed the others, demonstrating superior predictive accuracy and consistency, albeit with higher computational costs. Logistic Regression, when feature-scaled, showed significant improvement in performance, highlighting the importance of data normalization in models sensitive to feature scaling. Decision Trees provided a quick and interpretable model, though slightly less accurate than the Random Forest. The research findings highlight the trade-offs between predictive performance and computational efficiency in selecting machine learning models for cybersecurity applications. The study contributes to the cybersecurity domain by elucidating the efficacy of ensemble techniques in DDoS attack classification and underscores the potential for model improvement through scaling and data balancing.
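A minimal sketch of the pipeline outlined above: feature scaling, SMOTE applied only to the training folds, and a Random Forest evaluated under cross-validation. Parameter values are assumptions.

```python
# Sketch: scaling + SMOTE + Random Forest inside an imbalanced-learn
# pipeline, so SMOTE resamples only the training folds during CV.
# Parameter values are assumptions.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("smote", SMOTE(random_state=42)),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
])
# Usage (X, y = traffic features and binary attack labels):
# scores = cross_validate(pipe, X, y, cv=5,
#                         scoring=["accuracy", "precision", "recall", "f1"])
```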
Optimizing Machine Learning Models for Urinary Tract Infection Diagnostics: A Comparative Study of Logistic Regression and Random Forest
Airlangga, Gregorius
Jurnal Informatika Ekonomi Bisnis Vol. 6, No. 1 (March 2024)
Publisher : SAFE-Network

DOI: 10.37034/infeb.v6i1.854

Abstract

Urinary Tract Infections (UTIs) present a significant healthcare challenge due to their prevalence and diagnostic complexity. Timely and accurate diagnosis is critical for effective treatment, yet traditional methods like microbial cultures and urinalysis are often slow and inconsistent. This study introduces machine learning (ML) as a transformative solution for UTI diagnostics, particularly focusing on logistic regression and random forest models renowned for their interpretability and robustness. We conducted a meticulous hyperparameter tuning process using a rich dataset from a clinic in Northern Mindanao, Philippines, incorporating demographic, clinical, and urinalysis data. Our research outlines a detailed methodology for applying and refining these ML models to predict UTI outcomes accurately. Through comprehensive hyperparameter optimization, we enhanced the predictive performance, demonstrating a significant improvement over standard diagnostic practice. The findings reveal a clear superiority of the random forest model, achieving a top testing accuracy of 0.9814, compared to the best-performing logistic regression model's accuracy of 0.7626. This comparative analysis not only validates the efficacy of ML in medical diagnostics but also emphasizes the potential clinical impact of these models in real-world settings. The study contributes to the burgeoning literature on ML applications in healthcare by providing a blueprint for optimizing ML models for clinical use, particularly in diagnosing UTIs. It underscores the promise of ML in augmenting diagnostic precision, thereby potentially reducing the global healthcare burden associated with UTIs.
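A minimal sketch of the hyperparameter search reported above for the two models. The grids shown are illustrative assumptions, not the paper's actual search spaces.

```python
# Sketch: grid search over logistic regression and random forest.
# The grids are illustrative assumptions.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

searches = {
    "logreg": GridSearchCV(LogisticRegression(max_iter=2000),
                           {"C": [0.01, 0.1, 1, 10]},
                           cv=5, scoring="accuracy"),
    "rf": GridSearchCV(RandomForestClassifier(random_state=42),
                       {"n_estimators": [100, 300], "max_depth": [None, 10, 20]},
                       cv=5, scoring="accuracy"),
}
# Usage: for name, s in searches.items(): s.fit(X_train, y_train)
# then compare s.best_score_ against held-out test accuracy per model.
```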
Efficacy of Machine Learning Techniques in Diagnosing Urinary Tract Infections: A Study Utilizing a Philippine Clinical Dataset
Airlangga, Gregorius
Jurnal Informatika Ekonomi Bisnis Vol. 6, No. 1 (March 2024)
Publisher : SAFE-Network

DOI: 10.37034/infeb.v6i1.855

Abstract

This research delves into the potential of machine learning models, namely Support Vector Machine (SVM), XGBoost, and LightGBM, to enhance the diagnosis of Urinary Tract Infections (UTIs) based on a comprehensive dataset collected from a local clinic in Northern Mindanao, Philippines, spanning from April 2020 to January 2023. The study integrates clinical variables such as age, gender, and various urine test results including color, transparency, and the presence of substances like glucose, protein, and cells, to determine the most accurate diagnostic model. The dataset presented unique preprocessing challenges, such as converting infant ages into decimal numbers. The SVM with a linear kernel showed remarkable test accuracy of 98.25%, indicating its robustness in handling linear separability in the data. Meanwhile, XGBoost and LightGBM, both with optimal hyperparameter configurations, achieved comparable accuracies of 97.95%. These results underscore the significance of machine learning in medical diagnostics, particularly in settings where swift and reliable decision-making is crucial. Our findings suggest that while ensemble methods like XGBoost and LightGBM are powerful tools for complex datasets, a well-tuned SVM can provide superior accuracy, thus advocating for a data-centric approach in model selection.
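A minimal sketch of two details the abstract highlights: converting infant ages to decimal years, and the linear-kernel SVM that achieved the top accuracy. The raw age-string formats are assumptions about the dataset.

```python
# Sketch: converting infant ages to decimal years plus a linear-kernel SVM.
# The age-string formats are assumptions about the raw clinical data.
from sklearn.svm import SVC

def age_to_years(raw):
    """Map strings like '2 months' or '15 days' to decimal years."""
    value, unit = raw.strip().lower().split()
    value = float(value)
    if unit.startswith("month"):
        return value / 12.0
    if unit.startswith("day"):
        return value / 365.0
    return value  # already expressed in years

print(round(age_to_years("2 months"), 3))  # 0.167

clf = SVC(kernel="linear")  # the configuration the abstract reports as best
# Usage: clf.fit(X_train, y_train); clf.score(X_test, y_test)
```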
Anomaly Detection in Blockchain Transactions: A Machine Learning Approach within the Open Metaverse
Airlangga, Gregorius
Jurnal Informatika Ekonomi Bisnis Vol. 6, No. 2 (June 2024)
Publisher : SAFE-Network

DOI: 10.37034/infeb.v6i2.864

Abstract

This study investigates the application of machine learning models for anomaly detection and fraud analysis in blockchain transactions within the Open Metaverse, amid the growing complexity of digital transactions in virtual spaces. Utilizing a dataset of 78,600 transactions that reflect a broad spectrum of user behaviors and transaction types, we evaluated the efficacy of several predictive models, including RandomForest, LinearRegression, SVR, DecisionTree, KNeighbors, GradientBoosting, AdaBoost, Bagging, XGB, and LightGBM, based on their Mean Cross-Validation Mean Squared Error (Mean CV MSE). Our analysis revealed that ensemble methods, particularly RandomForest and Bagging, demonstrated superior performance with Mean CV MSEs of -0.00445 and -0.00415, respectively, thereby highlighting their robustness in the complex transaction dataset. In contrast, LinearRegression and SVR were among the least effective, with Mean CV MSEs of -224.67 and -468.57, indicating a potential misalignment with the dataset's characteristics. This research underlines the importance of selecting appropriate machine learning strategies in the context of blockchain transactions within the Open Metaverse, showcasing the need for advanced, adaptable approaches. The findings contribute significantly to the financial technology field, particularly in enhancing security and integrity within virtual economic systems, and advocate for a nuanced approach to anomaly detection and fraud analysis in blockchain environments.
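A minimal sketch of the scoring protocol implied above. The negative Mean CV MSE values reported are consistent with scikit-learn's neg_mean_squared_error convention, under which scores closer to zero are better; that reading, and the model settings below, are assumptions.

```python
# Sketch: 5-fold CV scored with scikit-learn's neg_mean_squared_error,
# which yields negative "Mean CV MSE" values like those in the abstract.
# Model settings are assumptions.
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

models = {
    "RandomForest": RandomForestRegressor(random_state=42),
    "Bagging": BaggingRegressor(random_state=42),
    "LinearRegression": LinearRegression(),
}
# Usage (X, y = transaction features and the modeled target):
# for name, m in models.items():
#     mean_cv_mse = cross_val_score(m, X, y, cv=5,
#                                   scoring="neg_mean_squared_error").mean()
#     print(name, mean_cv_mse)  # closer to zero is better
```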
Deep Learning for Anomaly Detection and Fraud Analysis in Blockchain Transactions of the Open Metaverse
Airlangga, Gregorius
Jurnal Informatika Ekonomi Bisnis Vol. 6, No. 2 (June 2024)
Publisher : SAFE-Network

DOI: 10.37034/infeb.v6i2.865

Abstract

This study investigates the application of deep learning models for anomaly detection and fraud analysis within blockchain transactions of the Open Metaverse. Given the burgeoning complexity and scale of virtual environments, ensuring the integrity and security of blockchain transactions is paramount. We employed three deep learning architectures: Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), to analyze and predict transactional anomalies. Using a dataset comprising 78,600 records of metaverse transactions, each model was rigorously evaluated through a 5-fold cross-validation approach, focusing on the Mean Squared Error (MSE) as the primary performance metric. The MLP model demonstrated superior predictive accuracy with the lowest average CV MSE, suggesting its effectiveness in capturing the intricate patterns of blockchain transactions. The study's findings highlight the nuanced capabilities of each model in addressing the specific challenges of fraud analysis and anomaly detection in the metaverse's blockchain environment. By providing a comparative analysis of these deep learning approaches, this research contributes to the strategic development of security measures in the Open Metaverse, promoting a secure and trustworthy digital economy.
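A minimal sketch of the best-performing model class named above, an MLP, evaluated with 5-fold cross-validation on MSE. Layer sizes, feature dimensionality, and training settings are assumptions; the data below is a placeholder.

```python
# Sketch: a Keras MLP regressor scored with 5-fold cross-validated MSE.
# Architecture, feature count, and the placeholder data are assumptions.
import numpy as np
from sklearn.model_selection import KFold
from tensorflow.keras import layers, models

def build_mlp(n_features):
    net = models.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(1),  # regression head evaluated with MSE
    ])
    net.compile(optimizer="adam", loss="mse")
    return net

rng = np.random.default_rng(0)
X = rng.random((500, 10), dtype=np.float32)  # placeholder transaction features
y = rng.random(500, dtype=np.float32)

fold_mse = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=42).split(X):
    net = build_mlp(X.shape[1])
    net.fit(X[train_idx], y[train_idx], epochs=5, verbose=0)
    fold_mse.append(net.evaluate(X[test_idx], y[test_idx], verbose=0))
print("Mean CV MSE:", float(np.mean(fold_mse)))
```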