House price prediction is a central problem in the property sector because prices depend on many interrelated factors, such as building characteristics and environmental conditions. Conventional approaches often struggle to predict prices accurately, which can lead to poor decisions. This study therefore develops and compares house price prediction models built with three machine learning algorithms: Linear Regression, Random Forest, and Gradient Boosting. The dataset is the Home Value Insights Dataset from Kaggle, which contains 1,000 houses described by eight main attributes. The research stages comprise data preprocessing, splitting the data into training and test sets, model training, hyperparameter optimization with GridSearchCV, and performance evaluation using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²) under 10-fold cross-validation. The results show that Linear Regression performs best, with an R² of 0.8539 and lower prediction errors than Random Forest and Gradient Boosting. Although the ensemble models are competitive, their added complexity does not yield a significant gain in accuracy, so Linear Regression is considered the simplest, most efficient, and most interpretable approach for house price prediction on datasets whose relationships tend to be linear.
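The evaluation described above, RMSE, MAE, and R² averaged over 10 folds, can be illustrated with a minimal standard-library sketch. It does not reproduce the study: the data are synthetic (one hypothetical "area" feature with Gaussian noise) instead of the Kaggle dataset, only a single-feature least-squares line is fitted, and no GridSearchCV tuning is performed. All function names (`rmse`, `mae`, `r2`, `k_fold_indices`, `fit_simple_linear`, `cross_validate`) are illustrative assumptions, not the authors' code.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def fit_simple_linear(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares; return a predictor."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def cross_validate(fit, xs, ys, k=10):
    """Average RMSE, MAE, and R² over k held-out folds."""
    scores = {"rmse": 0.0, "mae": 0.0, "r2": 0.0}
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        y_true = [ys[i] for i in test_idx]
        y_pred = [predict(xs[i]) for i in test_idx]
        scores["rmse"] += rmse(y_true, y_pred) / k
        scores["mae"] += mae(y_true, y_pred) / k
        scores["r2"] += r2(y_true, y_pred) / k
    return scores


# Synthetic stand-in data (assumption): price grows linearly with area, plus noise.
rng = random.Random(42)
areas = [rng.uniform(500, 4000) for _ in range(1000)]
prices = [50_000 + 120 * a + rng.gauss(0, 20_000) for a in areas]

scores = cross_validate(fit_simple_linear, areas, prices, k=10)
print(scores)
```

In practice the same procedure would more likely be run with a library such as scikit-learn, where the three models and the grid search are available directly; the sketch only makes explicit what the reported metrics measure on each held-out fold.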
Copyrights © 2025