This study aims to improve the accuracy of a Support Vector Machine (SVM) model in classifying fitness status (fit/unfit) from physiological and lifestyle parameters, using the Fitness Classification Dataset, a synthetic dataset designed to represent fitness indicators such as BMI, height, weight, heart rate, blood pressure, nutritional quality, sleep duration, and activity index. The dataset has an imbalanced class distribution and mixes numerical and categorical features, so it requires comprehensive preprocessing. Two optimization techniques are applied: RandomizedSearchCV for efficient hyperparameter tuning and SMOTE for handling class imbalance. In the experiments, the baseline SVM model achieves an accuracy of 75.75%, while the combination of SVM + RandomizedSearchCV + SMOTE raises accuracy to 80%, a gain of 4.25 percentage points. The AUC also increases, from 0.835 for the baseline to 0.850 for the optimized model. These findings indicate that integrating RandomizedSearchCV and SMOTE improves the model's ability to capture non-linear patterns and increases its sensitivity to the minority class. Overall, the results indicate that the optimized SVM pipeline provides more stable and accurate performance in fitness status classification and can serve as a reference for developing predictive models in other health domains.
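To make the class-balancing step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the study's implementation: the function name `smote_oversample`, its parameters, and the toy feature values are hypothetical, and a study of this kind would more likely rely on an existing library implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority sample and one of its k nearest minority
    neighbors (the core idea of SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbor = minority[rng.choice(neighbors)]
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical toy data: [BMI, heart rate] rows for the minority class
minority = [[22.0, 60.0], [23.5, 62.0], [21.0, 58.0], [24.0, 65.0]]
new_points = smote_oversample(minority, n_new=6, k=2)
```

Because each synthetic point lies between two real minority samples, it stays inside the region the minority class already occupies. In a cross-validated search, oversampling is normally applied to the training folds only, so that synthetic points do not leak into the validation data.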
Copyright © 2025