This research studies how virtual sample generation, using linear interpolation combined with Gaussian noise augmentation, affects the performance of models that predict the corrosion inhibition efficiency of pyridazine. Three models are used: Random Forest Regressor, Gradient Boosting Regressor, and Bagging Regressor. On the original data, the coefficients of determination (R²) for the three models are -0.06, 0.05, and 0.12, and the root mean square error (RMSE) values are 34.80, 32.90, and 31.65, respectively. After virtual sample generation, the R² values increase to 0.99, 0.96, and 0.99, while the RMSE values decrease to 1.59, 2.88, and 1.25. The results show that linear interpolation can enrich the dataset without altering the pattern of the data distribution, and that it substantially improves model accuracy. This improvement demonstrates that virtual sample generation can overcome the limitations of the original data and yield a more accurate and reliable predictive model. The research contributes to the development of data augmentation methods for similar cases in material efficiency prediction, especially in materials technology and corrosion control.
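
The augmentation step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function name `generate_virtual_samples`, the random pairing of samples, the uniform interpolation weight, and the `noise_scale` parameter are all assumptions made for illustration. Each virtual sample is placed on the line segment between two randomly chosen real samples, and Gaussian noise proportional to each feature's standard deviation is then added to the features.

```python
import numpy as np

def generate_virtual_samples(X, y, n_virtual, noise_scale=0.01, seed=0):
    """Hypothetical helper: interpolate between random sample pairs, then add noise.

    noise_scale is an assumed parameter: the noise standard deviation expressed
    as a fraction of each feature's standard deviation in the original data.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(X)

    # Pick two random real samples (i, j) for every virtual sample.
    i = rng.integers(0, n, size=n_virtual)
    j = rng.integers(0, n, size=n_virtual)

    # Interpolation weight t in [0, 1]; t=0 reproduces sample i, t=1 sample j.
    t = rng.uniform(0.0, 1.0, size=n_virtual)

    # Linear interpolation of features and target with the same weight.
    X_new = X[i] + t[:, None] * (X[j] - X[i])
    y_new = y[i] + t * (y[j] - y[i])

    # Gaussian noise on the features only, scaled per feature.
    X_new += rng.normal(0.0, noise_scale * X.std(axis=0), size=X_new.shape)
    return X_new, y_new

# Toy data for illustration only (not the pyridazine dataset).
X_toy = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
y_toy = np.array([10.0, 20.0, 30.0])
X_virtual, y_virtual = generate_virtual_samples(X_toy, y_toy, n_virtual=100)

# Combine real and virtual samples into the enlarged training set.
X_aug = np.vstack([X_toy, X_virtual])
y_aug = np.concatenate([y_toy, y_virtual])
```

Because interpolated points lie between existing samples, the virtual data stay inside the range of the original data, which is consistent with the claim that the distribution pattern is preserved. The augmented arrays could then be passed to regressors such as scikit-learn's `RandomForestRegressor`, `GradientBoostingRegressor`, or `BaggingRegressor`.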
Copyrights © 2025