The insurance sector operates by transferring risk from policyholders to insurers, who charge premiums as compensation for the risk they assume. Premium determination in motor vehicle insurance has traditionally relied on the Generalized Linear Model (GLM), which requires the response variable to follow a distribution from the exponential family and can struggle to capture non-linear relationships and complex interactions among rating factors. To address these limitations, this study compares the GLM with the Gradient Boosting Machine (GBM) in modeling claim frequency and claim severity for motor vehicle insurance premiums. The analysis uses an insurance dataset obtained from a public data repository, and both models are evaluated using K-fold cross-validation. Model performance is assessed using the Root Mean Square Error (RMSE), a commonly used measure of predictive accuracy that summarizes the typical magnitude of prediction errors. The GBM consistently produces lower RMSE values than the GLM for both claim frequency and claim severity, indicating superior predictive performance. Despite its higher accuracy, however, the GBM lacks the interpretability inherent in the GLM framework, which remains crucial for transparency and regulatory considerations in insurance premium determination. These findings suggest that GBM is effective for improving prediction accuracy while GLM remains valuable for interpretability, and that using the two approaches together may give the best results in actuarial pricing applications.
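
For reference, the RMSE criterion under K-fold cross-validation can be written out as below. This is a sketch under stated assumptions, not the study's own notation: the symbols $K$, $I_k$, $y_i$, and $\hat{y}_i^{(-k)}$ are introduced here for illustration, and averaging the per-fold values is one common convention that the abstract does not confirm.

```latex
% Assumed notation (not from the source):
%   I_k              = index set of the k-th held-out fold, k = 1, ..., K
%   y_i              = observed response (claim frequency or claim severity)
%   \hat{y}_i^{(-k)} = prediction from the model fitted without fold k
\[
  \mathrm{RMSE}_k
    = \sqrt{\frac{1}{|I_k|} \sum_{i \in I_k}
        \left( y_i - \hat{y}_i^{(-k)} \right)^{2}},
  \qquad
  \mathrm{RMSE}_{\mathrm{CV}}
    = \frac{1}{K} \sum_{k=1}^{K} \mathrm{RMSE}_k .
\]
```

Because the errors are squared before averaging, large individual errors weigh more heavily in this measure than in a mean absolute error, and a lower value indicates better out-of-sample predictive accuracy.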
Copyrights © 2026