Advancing breast cancer prediction: machine learning, data balancing, and ant colony optimization
Aouragh, Abd Allah; Bahaj, Mohamed; Toufik, Fouad
Bulletin of Electrical Engineering and Informatics Vol 13, No 6: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i6.8298

Abstract

Breast cancer is a significant threat to women's health worldwide. The World Health Organization (WHO) reports around 2.3 million new cases each year, making this disease the leading cause of cancer-related deaths among women. In light of this alarming situation, developing innovative tools for early detection and optimal treatment is imperative. This study contributes to that effort by presenting a comparative assessment of multiple machine learning algorithms, combined with data preprocessing, data balancing, and feature selection techniques. On the Coimbra dataset, composed of 116 records with 10 medical characteristics, the models showed promising performance across all classification metrics, reaching an accuracy of 89.74% and an area under the receiver operating characteristic curve (AUC-ROC) of 89.68%. These findings highlight the potential of our approaches to improve breast cancer detection and treatment systems, providing health practitioners with more efficient resources.
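The two metrics reported in this abstract can be illustrated with a short sketch. This is not the authors' code: the labels and scores below are invented toy values, and the function names are hypothetical. AUC-ROC is computed here by its pairwise definition, the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc_roc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"AUC-ROC  = {auc_roc(y_true, scores):.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC-ROC depends only on how the scores rank the examples, which is one reason studies on small or imbalanced datasets tend to report both.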
Enhancing hypertension prediction: a hybrid machine learning optimization approach
Aouragh, Abd Allah; Bahaj, Mohamed; Toufik, Fouad
Indonesian Journal of Electrical Engineering and Computer Science Vol 37, No 1: January 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v37.i1.pp347-355

Abstract

Early identification of hypertension is crucial to prevent its serious complications, which can have devastating health effects by reducing quality of life and significantly increasing premature mortality. This study aims to evaluate the effectiveness of machine learning techniques in predicting the presence of hypertension from an unbalanced dataset consisting of 4,363 records and 35 features. To balance the dataset, we employed the synthetic minority over-sampling technique (SMOTE). In addition, to select the most relevant features, we used ant colony optimization. Next, we applied various algorithms, including logistic regression (LR), K-nearest neighbors (KNN), support vector machine (SVM), extra trees (ET), and AdaBoost (AB). We also evaluated hyperparameter optimization using two methods: Bayesian optimization (BO) and particle swarm optimization (PSO). The results reveal that the combination of AB with BO demonstrated superior performance, with an accuracy of 97.60%, a recall of 98.93%, and a precision of 98.59%. This research emphasizes the potential of machine learning techniques for anticipating hypertension and highlights the importance of optimization techniques in improving predictive models’ performance.
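The core idea of SMOTE named in this abstract can be sketched in a few lines: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. The sketch below is a simplified illustration under stated assumptions, not the authors' pipeline; the function name, the parameter defaults, and the toy points are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats), at least two of them.
    k: number of nearest minority neighbours considered for each base point.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class points, invented for illustration only.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point lies between two real minority samples, oversampling of this kind is normally applied to the training split only, so that no interpolated copies of test data leak into evaluation.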
Balancing and metaheuristic techniques for improving machine learning models in brain stroke prediction
Aouragh, Abd Allah; Bahaj, Mohamed; Toufik, Fouad
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 1: February 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i1.pp473-481

Abstract

A brain stroke, referred to medically simply as a stroke, is a critical condition triggered by the disruption of blood flow to a region of the brain. Early detection of stroke is crucial to prevent fatal complications. In this study, we worked with an unbalanced stroke dataset of 4,981 entries, which we balanced using the K-means synthetic minority over-sampling technique (KMeansSMOTE). We then employed five machine learning algorithms: decision tree, random forest, support vector machine, K-nearest neighbors, and gradient boosting. We compared the hyperparameter optimization of these algorithms using four metaheuristic techniques: gray wolf optimization, particle swarm optimization, genetic algorithm, and artificial bee colony. The models' effectiveness was evaluated using multiple metrics: accuracy, recall, precision, F1-score, and area under the receiver operating characteristic curve. Our findings indicate that the random forest optimized by the genetic algorithm achieved the best performance, with an accuracy of 97.39% and an F1-score of 97.35%. This study highlights the effectiveness of balancing and metaheuristic techniques in optimizing machine learning models for stroke forecasting.
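Hyperparameter search with a genetic algorithm, the best-performing optimizer in this abstract, follows a select, recombine, mutate loop. The sketch below shows that loop under loud assumptions: `toy_score` is a made-up stand-in for a cross-validated model score, and the two hyperparameters and their bounds are hypothetical. It is not the configuration used in the study.

```python
import random

# Hypothetical search ranges for (n_trees, max_depth).
BOUNDS = [(10, 300), (1, 30)]


def toy_score(params):
    """Stand-in for a cross-validated score; peaks at n_trees=120, max_depth=8."""
    n_trees, depth = params
    return 1.0 - ((n_trees - 120) / 200) ** 2 - ((depth - 8) / 20) ** 2


def random_individual(rng):
    return [rng.randint(lo, hi) for lo, hi in BOUNDS]


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent at random.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(ind, rng, rate=0.3):
    # Each gene is resampled within its bounds with probability `rate`.
    return [rng.randint(lo, hi) if rng.random() < rate else g
            for g, (lo, hi) in zip(ind, BOUNDS)]


def genetic_search(score, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        elite = population[: pop_size // 4]  # best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = elite + children
    return max(population, key=score)


best = genetic_search(toy_score)
print("best hyperparameters:", best, "score:", round(toy_score(best), 4))
```

In a real setting the score function would train and cross-validate a model for each candidate, so population size and generation count directly determine the compute cost of the search.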
Integrating BERT fine-tuning and genetic algorithm for superior depression detection in social media
Aouragh, Abd Allah; Bahaj, Mohamed; Toufik, Fouad
International Journal of Electrical and Computer Engineering (IJECE) Vol 16, No 3: June 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v16i3.pp1474-1484

Abstract

Early detection of depression is crucial for minimizing its adverse effects on mental and physical health. Recent advancements in natural language processing facilitate the large-scale analysis of social media texts to identify depressive tendencies. Our study introduces a novel approach that integrates a genetic algorithm for hyperparameter tuning, optimizing classification performance beyond conventional methods. We provide a comprehensive comparison of vectorization techniques, including term frequency-inverse document frequency (TF-IDF), Word2Vec, and a fine-tuned bidirectional encoder representations from transformers (BERT) model specifically adapted to our dataset. Using a dataset of 7,731 entries, we implemented standard pre-processing steps such as stop word removal and lemmatization before vectorizing the text. Five machine learning algorithms (decision tree, logistic regression, random forest, gradient boosting, and support vector machine) were evaluated, with hyperparameter tuning performed using a genetic algorithm. The highest accuracy (95.99%) and F1-score (95.91%) were achieved with the combination of fine-tuned BERT, support vector machine, and genetic algorithm optimization. This study demonstrates the advantages of integrating BERT fine-tuning with genetic optimization, outperforming traditional TF-IDF and Word2Vec approaches in depression detection.
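TF-IDF, the baseline vectorization named in this abstract, weights a term by how often it occurs in a document and how rare it is across the collection. The sketch below uses the textbook form tf × log(N / df); common libraries apply smoothed variants, so their numbers will differ. The three short texts are invented for illustration and are not drawn from the study's dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy, already-tokenised texts invented for illustration only.
docs = [
    "i feel tired and sad".split(),
    "i feel great today".split(),
    "sad and tired again".split(),
]
vectors = tf_idf(docs)

# Terms unique to one document receive the highest weight within it.
for term, weight in sorted(vectors[1].items(), key=lambda kv: -kv[1]):
    print(f"{term:6s} {weight:.4f}")
```

Unlike the contextual embeddings produced by a fine-tuned BERT model, these weights ignore word order and context, which is consistent with the gap the abstract reports between the two approaches.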