Hair loss is a multifactorial condition influenced by genetics, hormonal imbalance, lifestyle choices, and environmental factors. This study investigates the potential of machine learning (ML) to predict hair loss using a diverse dataset comprising categorical and numerical indicators related to these contributing factors. We applied an extensive data preprocessing pipeline, including handling of missing values, frequency encoding, and engineered interaction features, to improve model input quality. Five ML algorithms (Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, and XGBoost), along with an ensemble voting classifier, were trained and evaluated on a balanced dataset. Although performance metrics such as accuracy and F1-score remained modest, with the highest values around 50%, the analysis highlighted the prominent role of age, stress, and nutritional deficiency in hair loss. Despite the limited predictive capability of the current feature set, this study presents a reproducible framework for ML-driven health diagnostics and identifies key directions for future work. Enhancing data granularity and incorporating richer clinical inputs could substantially improve prediction accuracy in subsequent studies.
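As an illustration of two of the preprocessing steps named above, frequency encoding and an engineered interaction feature, the following minimal sketch may help. It is not the study's actual pipeline: the toy records, the column names (`age`, `stress`), and the choice to multiply age by the encoded stress frequency are all assumptions made only for this example.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


# Hypothetical toy records; these columns are illustrative and are not
# taken from the study's dataset.
records = [
    {"age": 25, "stress": "low"},
    {"age": 40, "stress": "high"},
    {"age": 33, "stress": "high"},
    {"age": 51, "stress": "moderate"},
]

# Frequency-encode the categorical "stress" column.
stress_freq = frequency_encode([r["stress"] for r in records])

for record, freq in zip(records, stress_freq):
    record["stress_freq"] = freq
    # Assumed interaction feature: product of age and encoded stress frequency.
    record["age_x_stress_freq"] = record["age"] * freq
```

Frequency encoding keeps a categorical column as a single numeric feature rather than expanding it into one column per category, and the interaction term gives a linear model such as Logistic Regression access to a joint effect it could not otherwise represent.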
Copyrights © 2025