Student success can be defined by the length of study required to graduate from college. Machine learning can be used to predict student success from factors thought to influence it, but achieving optimal model performance requires attention to sample size. This study aims to determine the effect of sample size on the stability of model performance in predicting student success. The research is quantitative, using records of 19,061 students from a university in Yogyakarta enrolled between 2014 and 2019. The target variable is the study period in months; the predictor variables are the college entrance pathway, GPA from semesters 1 through 6, and family socioeconomic condition as measured by the father's and mother's income. The study uses an XGBoost model with tuned hyperparameters and a bootstrap approach. Bootstrap samples were drawn from the original data at twenty sample sizes, from 250 to 5,000 in steps of 250, with ten replications at each size. Model performance was evaluated with the Root Mean Square Error (RMSE). The best XGBoost model, obtained with a 90% training and 10% testing data split, achieved the smallest RMSE of 8.318 using the hyperparameters n_estimators = 75, max_depth = 8, min_child_weight = 5, eta = 0.07, gamma = 0.2, subsample = 0.8, and colsample_bylevel = 1. This tuned model demonstrates its most stable performance at a sample size of 1,750 students, as evidenced by consistent RMSE values across the ten bootstrap replications, indicating that this sample size provides the best balance between prediction accuracy and stability when estimating study duration.
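The bootstrap stability check described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the synthetic `population` of (GPA, months-to-graduate) pairs and the simple linear `predict` function are stand-ins for the real student records and the tuned XGBoost regressor, which are not reproduced here. What the sketch does show is the procedure itself: for each sample size, draw resamples with replacement, compute RMSE on each, and inspect the spread of RMSE values across replications as a measure of stability.

```python
import random
import math

random.seed(0)

def make_record():
    # Synthetic student record: GPA and study period in months.
    # Noise sd of 8 loosely mirrors the reported RMSE scale of ~8.3.
    gpa = random.uniform(2.0, 4.0)
    months = 60 - 6 * gpa + random.gauss(0, 8)
    return gpa, months

# Stand-in population; the actual study uses 19,061 student records.
population = [make_record() for _ in range(5000)]

def rmse(pairs, predict):
    """Root Mean Square Error of `predict` over (x, y) pairs."""
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs))

def predict(gpa):
    # Toy linear predictor standing in for the tuned XGBoost model.
    return 60 - 6 * gpa

# Bootstrap: at each sample size, draw 10 resamples with replacement
# and record the spread of RMSE values; a smaller spread across the
# replications indicates more stable model performance.
for n in (250, 1750, 5000):
    scores = [rmse(random.choices(population, k=n), predict) for _ in range(10)]
    spread = max(scores) - min(scores)
    print(f"n={n:5d}  mean RMSE={sum(scores) / len(scores):.3f}  spread={spread:.3f}")
```

In the study, the same loop would run over all twenty sample sizes (250 through 5,000) with the fitted XGBoost model in place of `predict`, and the size with consistently low RMSE variation, reported as 1,750, would be selected.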