Data imbalance is a serious challenge in developing machine learning models for sleep disorder classification. When models are trained on an uneven class distribution, classification performance on minority classes such as insomnia and sleep apnea is often low. As a result, overall accuracy may appear high while sensitivity to the clinically important cases remains weak. This research therefore aims to design and develop a robust sleep disorder classification model based on the AdaBoost algorithm, with performance improved through the integration of two main approaches: data balancing with SMOTE and hyperparameter optimization with Optuna. The contribution of this research is to show that combining the two approaches can significantly improve model performance, not only in terms of global accuracy but also on previously overlooked minority classes. The dataset used is the Sleep Health and Lifestyle Dataset, which consists of 374 synthetic records divided into three categories: insomnia, sleep apnea, and none. The method comprises data preprocessing, an 80:20 train-test split, application of SMOTE to balance the class distribution, hyperparameter tuning with Optuna, and model training with the AdaBoost algorithm. Evaluation was performed using the classification metrics accuracy, precision, recall, and F1-score. The combination of SMOTE and Optuna yielded the best results: an accuracy of 90.6%, with F1-scores of 0.83871 for insomnia and 0.81250 for sleep apnea. This performance was consistently superior to the scenarios without SMOTE or without tuning, which confirms the importance of combined strategies for obtaining fair and accurate classification on medical data. Future research is recommended to use real-world datasets and to evaluate this approach with other models such as XGBoost or LightGBM.
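The balancing step can be illustrated with a minimal, standard-library-only sketch of SMOTE-style oversampling. This is not the authors' implementation: the abstract does not say which software was used, and in practice a maintained library implementation would normally be preferred. The function name `smote_oversample`, the toy feature values, and the parameters `k` and `seed` are all assumptions made for illustration. The sketch is applied to the training split only, following the order of steps given above, so that no synthetic record ends up in the test set.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Grow every smaller class to the majority count by interpolating
    between a sample and one of its k nearest same-class neighbours."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if n == target or len(members) < 2:
            continue  # majority class, or too few points to interpolate
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point (itself excluded)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # position on the segment from base to neighbour
            synthetic = [b + gap * (nb - b) for b, nb in zip(base, neighbour)]
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy training split (two numeric features per record).
# The values are made up and are not taken from the dataset in the paper.
X_train = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1],
    [2.0, 2.1], [2.2, 1.9], [2.1, 2.3],
    [4.0, 0.1], [4.2, 0.3],
]
y_train = ["none"] * 6 + ["insomnia"] * 3 + ["sleep apnea"] * 2

X_bal, y_bal = smote_oversample(X_train, y_train, k=5, seed=42)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

After the call, each of the three classes has as many training records as the majority class, and every synthetic record lies on a line segment between two original records of the same class. The balanced training set would then be passed to hyperparameter tuning and AdaBoost training, with the untouched test split reserved for computing accuracy, precision, recall, and F1-score.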
Copyrights © 2025