This research compares two machine learning models, Logistic Regression and Random Forest, for predicting employee turnover. It uses the IBM HR Analytics Employee Attrition & Performance dataset from Kaggle and implements nested ensemble models in Google Colab. After pre-processing steps (feature merging, generation, and engineering; data cleaning; encoding; and normalisation), the data are divided into training and testing sets. The models are then trained and evaluated on accuracy. Averaged across the three departments, the Random Forest model achieved higher accuracy (97.7%) than Logistic Regression (94.6%). This study therefore indicates that Random Forest is the more suitable model for predicting employee turnover on the given dataset.
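The evaluation step described above, accuracy computed per department and then averaged, can be sketched as follows. This is a minimal illustration and not the study's code: the function names are hypothetical, the labels are made-up toy values, and the unweighted mean is an assumption, since the abstract does not say whether the average is weighted by department size.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_department_accuracy(results):
    """Return per-department accuracies and their unweighted mean.

    `results` maps a department name to a (y_true, y_pred) pair.
    Assumption: every department counts equally, regardless of size.
    """
    per_dept = {
        dept: accuracy(y_true, y_pred)
        for dept, (y_true, y_pred) in results.items()
    }
    return per_dept, mean(list(per_dept.values()))


# Made-up toy labels (1 = left, 0 = stayed); NOT results from the study.
toy_results = {
    "Sales": ([1, 0, 0, 1], [1, 0, 0, 0]),
    "Research & Development": ([0, 0, 1, 0], [0, 0, 1, 0]),
    "Human Resources": ([1, 0], [0, 0]),
}

per_dept, overall = mean_department_accuracy(toy_results)
for dept, acc in per_dept.items():
    print(f"{dept}: {acc:.2f}")
print(f"mean accuracy: {overall:.2f}")
```

An unweighted mean treats a small department the same as a large one; weighting each accuracy by its number of test rows would instead reproduce the accuracy over the pooled test set.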
Copyright © 2024