Comparison of Nonlinear Polynomial, Ridge, and Lasso Regression Models for Predicting Health Insurance Costs Based on the CRISP-DM Framework
Siti Rachmania Putri; Supriadi, Fidi; Setiadi, David
TeIKa Vol 15 No 2 (2025): Jurnal
Publisher : Fakultas Teknologi Informasi - Universitas Advent Indonesia

DOI: 10.36342/3kxrvj44

Abstract

The escalating cost of healthcare calls for accurate methods of predicting medical insurance premiums. This research compares the performance of three nonlinear regression models, namely Polynomial, Ridge, and Lasso, in estimating individual health insurance costs. The research process follows the CRISP-DM framework, which includes the stages of business understanding, data processing, modeling, and evaluation. The dataset used is the Medical Cost Personal Dataset from Kaggle, containing 1,338 individual records with seven demographic and behavioral features. Six outliers in the BMI and charges features were removed using the interquartile range (IQR) method, and categorical features were encoded with One Hot Encoding. Numerical features were expanded with second-degree Polynomial Features to capture nonlinear relationships, and the data were then split into 80% training and 20% testing sets. Models were evaluated using Mean Squared Error (MSE) and R-squared (R²). Ridge Regression performed best, with an R² of 0.857 and an MSE of 2.35×10⁷; it was also more stable and handled multicollinearity more effectively than the other two models. Nevertheless, the typical prediction error of approximately USD 4,800 (about the square root of the reported MSE) suggests that accuracy should be improved through parameter tuning or data augmentation before the model is deployed in a real business environment.
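
The outlier filtering and evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code: the sample BMI values, charges, and predictions are hypothetical, the 1.5×IQR fence multiplier and the "inclusive" quartile method are assumed conventions that the abstract does not specify, and model fitting itself is omitted. The last line only checks that the square root of the reported MSE is close to the USD 4,800 figure.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional multiplier, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    lo, hi = iqr_bounds(values, k)
    return [v for v in values if lo <= v <= hi]


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical BMI values; 60.0 is an injected outlier.
bmi = [22.0, 24.5, 27.9, 30.1, 31.3, 33.0, 35.2, 60.0]
clean_bmi = drop_outliers(bmi)

# Hypothetical charges and model predictions (not from the study).
y_true = [1000.0, 2000.0, 3000.0, 4000.0]
y_pred = [1100.0, 1900.0, 3200.0, 3900.0]
print(mse(y_true, y_pred), r_squared(y_true, y_pred))

# Square root of the MSE reported in the abstract (2.35e7).
rmse_reported = math.sqrt(2.35e7)
print(round(rmse_reported))
```

Taking the square root puts the error back in the units of the target (USD), which is why it is easier to interpret than the raw MSE when judging whether a model is ready for business use.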