This study examines the choice between train and bus travel on the Malang–Blitar route using binary logistic regression combined with ensemble bagging. Data from 100 respondents were split 80% for training and 20% for testing, with k-fold cross-validation. The predictor variables were the differences in travel cost and travel time, safety, comfort, and ease of access. Bagging was selected over other ensemble methods because it reduces variance and overfitting on small datasets. Standard logistic regression achieved 85% accuracy on the test data, while ensemble bagging with 200 replications improved accuracy to 90.83% (confidence interval: 90.379%–91.187%). McNemar’s test confirmed that the improvement was statistically significant (p < 0.01). Under equivalent conditions, 20.6% of respondents preferred trains while 79.4% chose buses. Ease of access emerged as the primary decision factor, outweighing cost and time considerations. The optimal number of replications was 200; exceeding 300 replications decreased model performance. This research contributes an optimized ensemble methodology for transportation mode prediction in developing countries and demonstrates that accessibility infrastructure influences passenger preferences more strongly than traditional economic factors.
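As an illustration of how two classifiers can be compared with McNemar’s test on the same test respondents, the sketch below computes the exact (binomial) version of the test from paired predictions. It is a minimal sketch only: the function name `mcnemar_exact`, the variable names, and the toy labels are hypothetical and are not taken from the study, and the abstract does not state which variant of the test (exact or chi-square) the authors used.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test for two classifiers on the same samples.

    Returns (b, c, p_value), where
      b = samples classifier A gets right and classifier B gets wrong,
      c = samples classifier A gets wrong and classifier B gets right.
    Only these discordant pairs carry information about a difference.
    """
    b = c = 0
    for truth, a, bb in zip(y_true, pred_a, pred_b):
        if a == truth and bb != truth:
            b += 1
        elif a != truth and bb == truth:
            c += 1

    n = b + c
    if n == 0:
        # The two classifiers never disagree in correctness.
        return b, c, 1.0

    # Under H0 the discordant pairs follow Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy data (1 = train, 0 = bus); NOT the study's responses.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
pred_single = [0, 1, 0, 0, 1, 0, 0, 0, 1, 1]  # stand-in for plain logistic regression
pred_bagged = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # stand-in for the bagged ensemble

b, c, p_value = mcnemar_exact(y_true, pred_single, pred_bagged)
print(f"b={b}, c={c}, p={p_value:.5f}")
```

The test ignores cases where both models are right or both are wrong, which is why it suits a small test set in which the two models share most of their predictions.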
                        
                        
                        
                        
                            
Copyright © 2025