Variance : Journal of Statistics and Its Applications
Vol 7 No 2 (2025): VARIANCE: Journal of Statistics and Its Applications

EVALUATING NEARMISS AND SMOTE FOR VEHICLE INSURANCE FRAUD CLAIM CLASSIFICATION WITH A RANDOM FOREST CLASSIFIER

Yusuf, Feby Indriana
Handamari, Endang Wahyu



Article Info

Publish Date
30 Nov 2025

Abstract

This study evaluates the detection of fraudulent car insurance claims in imbalanced data by comparing two resampling techniques, NearMiss (undersampling) and SMOTE (oversampling), each combined with a Random Forest classifier. The public dataset, consisting of 1,000 observations and 40 features, was preprocessed with missing-value handling, label encoding, and min–max normalization, then split into 70% training data and 30% test data. Three scenarios were evaluated: the original (imbalanced) data, NearMiss, and SMOTE, using accuracy, precision, sensitivity (recall), specificity, and F1-score as evaluation metrics. The results show that NearMiss provides the most balanced performance for fraud detection, with a sensitivity of 0.865, an F1-score of 0.667, and an accuracy of 0.787. On the original imbalanced data, the model achieved a sensitivity of only 0.297 and an accuracy of 0.767. SMOTE achieved the highest precision (0.567), with an accuracy of 0.783, but its sensitivity was lower than that of NearMiss. These findings show that the choice of resampling technique should be aligned with operational objectives: NearMiss is more appropriate when the priority is to capture as many fraud cases as possible, while SMOTE is more suitable when controlling false positives is the priority.
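The five evaluation metrics named in the abstract all follow from a binary confusion matrix with fraud as the positive class. The sketch below is illustrative only: the function name `classification_metrics` and the counts passed to it are hypothetical, chosen to sum to a 300-row test set, and are not the confusion matrix reported by the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, sensitivity, specificity, and F1
    from binary confusion-matrix counts (fraud = positive class)."""

    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    precision = ratio(tp, tp + fp)        # share of flagged claims that are fraud
    sensitivity = ratio(tp, tp + fn)      # share of fraud claims that are caught
    specificity = ratio(tn, tn + fp)      # share of legitimate claims left alone
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration; NOT taken from the study.
metrics = classification_metrics(tp=60, fp=40, tn=180, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and sensitivity, a model can reach high sensitivity and still have a moderate F1 when its precision is low. That is the trade-off the abstract describes between capturing fraud cases and controlling false positives.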

Copyrights © 2025






Journal Info

Abbrev

variance

Publisher

Subject

Mathematics

Description

This journal is published by the Statistics Study Program, Faculty of Mathematics and Natural Sciences, Universitas Pattimura, Ambon. It is published twice a year, in June and ...