Class imbalance degrades the performance of traditional machine learning models such as Support Vector Machines (SVMs), which tend to misclassify minority-class instances. To address this issue, this study introduces a hybrid approach, Posterior Probability and Correlation-SVM (PC-SVM), which combines posterior probability estimation with correlation analysis. The aim is to improve SVM classification of imbalanced datasets by weighting attributes according to their correlation with the target class and by using posterior probabilities to refine decision boundaries. The methodology comprises three steps: preprocessing the datasets to ensure data quality, applying correlation analysis to compute attribute weights, and using these weights to transform the input features into posterior probability estimates, which then serve as inputs to the SVM. Experiments were conducted on two datasets, Yeast and Churn, which exhibit different degrees of class imbalance. The results show that the PC-SVM model achieves 100% accuracy, precision, recall, and F1-score across all classes, outperforming the standard SVM. The approach mitigates the bias toward majority classes by improving sensitivity to minority instances, and the results indicate that PC-SVM is robust and reliable for imbalanced data classification. In conclusion, integrating posterior probabilities with correlation-based attribute weighting enhances the performance of SVMs on imbalanced datasets. Future research should extend the approach to multiclass problems and optimize its computational efficiency.
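The abstract does not give the exact formulas, so the following is only a minimal sketch of how correlation-based attribute weights and a posterior probability transform could fit together. It assumes that the weights are normalised absolute Pearson correlations with the label and that the posterior is a weighted Gaussian naive Bayes estimate. Both are illustrative assumptions, not the authors' stated method, and all function names and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def correlation_weights(X, y):
    """Assumed weighting: |corr(attribute, label)|, normalised to sum to 1."""
    raw = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [1.0 / len(raw)] * len(raw)


def fit_gaussians(X, y):
    """Per-class prior and per-attribute (mean, std) estimates."""
    params = {}
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        stats = []
        for j in range(len(X[0])):
            col = [row[j] for row in rows]
            # Floor the std so a constant column cannot cause division by zero.
            stats.append((mean(col), max(pstdev(col), 1e-6)))
        params[c] = (len(rows) / len(X), stats)
    return params


def log_gaussian(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def posterior_features(row, params, weights):
    """Assumed transform: weighted naive Bayes posterior, one value per class."""
    log_scores = {}
    for c, (prior, stats) in params.items():
        log_scores[c] = math.log(prior) + sum(
            w * log_gaussian(x, mu, sigma)
            for w, x, (mu, sigma) in zip(weights, row, stats)
        )
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(exp_scores.values())
    return [exp_scores[c] / total for c in sorted(exp_scores)]


# Hypothetical toy data: class 1 is the minority (2 of 6 rows).
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [1.1, 6.0], [3.0, 5.0], [3.2, 4.0]]
y = [0, 0, 0, 0, 1, 1]

weights = correlation_weights(X, y)
params = fit_gaussians(X, y)
Z = [posterior_features(row, params, weights) for row in X]
# Z would then be passed to an SVM in place of X, for example
# scikit-learn's sklearn.svm.SVC (omitted to keep the sketch dependency-free).
```

In this toy example the first attribute separates the classes and the second does not, so nearly all of the weight falls on the first attribute. The transformed features `Z` are per-class probabilities that sum to one for each row.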
Copyright © 2025