Everyone wants a place to live, preferably one close to work and shopping centers, with easy access to transportation and a low crime rate. Pricing a property must therefore account for such external factors and not only for the house itself, which makes setting a price difficult for many people. The aim of this research is to predict real-estate prices while taking these factors into account. The predictions are useful both for sellers who have difficulty setting a price and for prospective buyers who need to plan their finances for a house in the desired neighborhood. The dataset used in this research was obtained from Kaggle and consists of 506 samples with 14 attributes. Several machine learning algorithms were used to predict real-estate prices: Extra Trees (ET), Support Vector Regression (SVR), Random Forest (RF), eXtreme Gradient Boosting (XGB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), and CatBoost. Principal Component Analysis (PCA) is applied to reduce the feature set after the preprocessing phase and before model building. The best-performing model is CatBoost tuned with GridSearchCV; because this model has been cross-validated, the risk of overfitting on new data is reduced. The SVR model with a polynomial kernel, using 10 principal components (PCs) and GridSearchCV, achieves an R2 score of 0.87, which is close to the score of CatBoost with GridSearchCV.
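The workflow described above (preprocessing, PCA, then a model tuned with GridSearchCV) can be sketched as follows for the SVR case. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, a synthetic dataset with the same shape as the one described (506 samples, 13 predictors and one target), standard scaling as the preprocessing step, and a hypothetical hyperparameter grid. The scores it prints are not the results reported in this research.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Assumption: synthetic stand-in for the Kaggle data (506 samples, 13 predictors).
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Preprocessing (assumed: standard scaling) -> PCA with 10 components -> poly-kernel SVR.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),
    ("svr", SVR(kernel="poly")),
])

# Hypothetical search grid; the grid used in the research is not stated.
param_grid = {
    "svr__C": [1, 10, 100],
    "svr__degree": [1, 2, 3],
}

# 5-fold cross-validated grid search, scored by R2.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

# R2 of the refitted best pipeline on the held-out split.
test_r2 = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("cross-validated R2:", round(search.best_score_, 3))
print("held-out R2:", round(test_r2, 3))
```

Placing the scaler and PCA inside the pipeline means both are refitted on each training fold, so the cross-validated score is not contaminated by information from the validation fold.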
Copyrights © 2024