Garuda - Garba Rujukan Digital

Teika

Vol 15 No 2 (2025): Jurnal

Siti Rachmania Putri (Unknown)
Supriadi, Fidi (Unknown)
Setiadi, David (Unknown)

Publish Date
20 Jan 2026

The escalating cost of healthcare calls for accurate methods of predicting medical insurance premiums. This research compares the performance of three nonlinear regression models, namely Polynomial, Ridge, and Lasso, in estimating individual health insurance costs. The research process follows the CRISP-DM framework, which includes the stages of business understanding, data processing, modeling, and evaluation. The dataset used is the Medical Cost Personal Dataset from Kaggle, containing 1,338 individual records with seven demographic and behavioral features. Six outliers in the BMI and charges features were removed using the interquartile range (IQR) method, and categorical features were encoded with one-hot encoding. Numerical features were expanded with second-degree polynomial features to capture nonlinear relationships, and the data was then split into 80% training and 20% testing. Evaluation used the Mean Squared Error (MSE) and R-squared (R²) metrics. The results indicate that Ridge Regression performed best, with an R² of 0.857 and an MSE of 2.35×10⁷. This model is more stable and handles multicollinearity more effectively than the other two models. Nevertheless, the average prediction error of approximately USD 4,800 (roughly the square root of the reported MSE) suggests that accuracy should be improved through parameter tuning or data augmentation before the model is deployed in a real business environment.
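The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code: the abstract does not state which tooling, quantile convention, or IQR multiplier was used, so the `k=1.5` fence, the "inclusive" quantile method, the helper names, and the toy BMI values are all assumptions made for illustration. The last line only checks that the reported MSE of 2.35×10⁷ is consistent with the quoted error of about USD 4,800 when read as a root mean squared error.

```python
import math
import statistics
from itertools import combinations_with_replacement


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 and the 'inclusive' quantile method are assumptions; the
    abstract does not say which convention was used.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, column, k=1.5):
    """Keep only rows whose value in `column` lies within the IQR fences."""
    lo, hi = iqr_bounds([r[column] for r in rows], k)
    return [r for r in rows if lo <= r[column] <= hi]


def polynomial_features(x, degree=2):
    """Expand a numeric vector into a bias term plus all monomials up to `degree`."""
    out = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            out.append(math.prod(x[i] for i in combo))
    return out


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values, not taken from the Kaggle dataset.
rows = [{"bmi": b} for b in (22.0, 24.5, 27.0, 28.5, 30.0, 31.5, 33.0, 60.0)]
kept = remove_outliers(rows, "bmi")  # drops the 60.0 row

# Square root of the MSE reported in the abstract, in the units of `charges`.
rmse_from_reported_mse = math.sqrt(2.35e7)
```

Because the MSE is expressed in squared currency units, taking its square root gives a figure on the same scale as the charges themselves, which is the natural way to compare it with the error quoted in the abstract.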

Journal Info

Teika

Abbrev

teika

Publisher

Universitas Advent Indonesia

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management; Language, Linguistic, Communication & Media

Description

TeIKa (Teknologi Informasi dan Komunikasi) Journal invites scholars, researchers, and students to contribute the results of their studies and research in areas related to Information and Communication Technology, covering Information Systems, Computer Networks, Computer Security, ...

Article Info

Abstract

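Perbandingan Model Regresi Nonlinear Polynomial, Ridge, dan Lasso untuk Prediksi Biaya Asuransi Kesehatan Berdasarkan Kerangka CRISP-DM (Comparison of Polynomial, Ridge, and Lasso Nonlinear Regression Models for Predicting Health Insurance Costs Based on the CRISP-DM Framework)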
