This study applies logistic regression to individual health insurance data to classify premiums as high or low using ten medical and demographic predictors. The continuous premium variable is converted into a binary outcome so that the factors significantly associated with premium pricing can be identified. Logistic regression was selected because it models binary outcomes directly and estimates the probability that a customer belongs to the high-premium category. The likelihood ratio test and the Wald test were used to assess overall model fit and the significance of the individual predictors, identifying Age and Weight as significant predictors of premium classification. The final model shows strong predictive performance, with an area under the curve (AUC) of 0.97 and an accuracy of 95%. These results indicate that logistic regression can improve risk classification and support data-driven policy adjustments in insurance underwriting.
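The workflow summarized above (binarizing the premium, fitting a logistic regression, testing with the Wald and likelihood ratio statistics, and scoring with accuracy and AUC) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the data are synthetic, only two of the ten predictors (age and weight) are simulated, the high/low cutoff is placed at roughly the sample median, and the model is fitted with a hand-written Newton-Raphson routine rather than whatever software the authors used.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def invert(m):
    # Gauss-Jordan inversion with partial pivoting (fine for tiny matrices).
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [row[n:] for row in a]

def predict(X, beta):
    return [sigmoid(sum(b * v for b, v in zip(beta, xi))) for xi in X]

def log_lik(y, probs):
    eps = 1e-12  # guards against log(0)
    return sum(yi * math.log(max(p, eps)) + (1 - yi) * math.log(max(1 - p, eps))
               for yi, p in zip(y, probs))

def score_and_info(X, y, beta):
    # Gradient of the log-likelihood and the information matrix X'WX.
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi, p in zip(X, y, predict(X, beta)):
        for i in range(k):
            grad[i] += (yi - p) * xi[i]
            for j in range(k):
                info[i][j] += p * (1 - p) * xi[i] * xi[j]
    return grad, info

def fit_logistic(X, y, iters=25):
    # Newton-Raphson updates: beta <- beta + (X'WX)^-1 X'(y - p).
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad, info = score_and_info(X, y, beta)
        cov = invert(info)
        beta = [b + sum(c * g for c, g in zip(row, grad)) for b, row in zip(beta, cov)]
    cov = invert(score_and_info(X, y, beta)[1])  # covariance at the final estimate
    return beta, cov

def standardize(v):
    m = sum(v) / len(v)
    s = math.sqrt(sum((x - m) ** 2 for x in v) / len(v))
    return [(x - m) / s for x in v]

# Synthetic data (NOT the study's data): premium rises with age and weight.
random.seed(0)
n = 400
age = [random.uniform(18, 66) for _ in range(n)]
weight = [random.gauss(75, 12) for _ in range(n)]
premium = [15000 + 300 * a + 150 * w + random.gauss(0, 3000) for a, w in zip(age, weight)]

# Assumed cutoff: premiums above (roughly) the sample median are "high" (1).
cutoff = sorted(premium)[n // 2]
y = [1 if p > cutoff else 0 for p in premium]
X = [[1.0, a, w] for a, w in zip(standardize(age), standardize(weight))]

beta, cov = fit_logistic(X, y)
probs = predict(X, beta)

# Wald test per coefficient: z = beta / SE, two-sided normal p-value.
se = [math.sqrt(cov[i][i]) for i in range(len(beta))]
wald_p = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(beta, se)]

# Likelihood ratio test against the intercept-only model (2 degrees of freedom,
# for which the chi-square survival function is exactly exp(-x / 2)).
p_bar = sum(y) / n
lr_stat = 2 * (log_lik(y, probs) - log_lik(y, [p_bar] * n))
lr_p = math.exp(-lr_stat / 2)

# In-sample accuracy at a 0.5 threshold, and AUC as the share of correctly
# ordered (high, low) pairs, counting ties as one half.
accuracy = sum((p >= 0.5) == (yi == 1) for p, yi in zip(probs, y)) / n
pos = [p for p, yi in zip(probs, y) if yi == 1]
neg = [p for p, yi in zip(probs, y) if yi == 0]
auc = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg) / (len(pos) * len(neg))

print("coefficients (intercept, age, weight):", [round(b, 3) for b in beta])
print("Wald p-values:", wald_p)
print("LR statistic:", round(lr_stat, 2), "p-value:", lr_p)
print("accuracy:", round(accuracy, 3), "AUC:", round(auc, 3))
```

Because the sketch scores the model on the same synthetic observations used to fit it, its accuracy and AUC are in-sample figures and are not comparable to the values reported in the study. With all ten predictors, the same routine would apply with additional columns in `X`, and the likelihood ratio test would use a chi-square distribution with ten degrees of freedom instead of the closed form used here.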
Copyrights © 2026