Abstract. Binary logistic regression is widely used for modeling mode choice, but its predictive accuracy often degrades on high-dimensional or class-imbalanced data. This study applies binary logistic regression with LASSO regularization to identify the significant factors influencing the choice between motorcycles and Trans Metro buses in the central business district (CBD) of Pekanbaru. Data from 100 respondents were collected through revealed-preference and stated-preference surveys, and the class imbalance (71% motorcycle, 29% bus) was addressed using the Synthetic Minority Over-sampling Technique (SMOTE). Model performance was evaluated by accuracy, AUC, precision, recall, and F1-score under Repeated Random Subsampling Validation (RRSV). Compared with the non-SMOTE model, the LASSO model with SMOTE raised recall from 0.125 to 0.25 and F1-score from 0.143 to 0.267, with an accuracy of 0.621 and an AUC of 0.613, indicating an improved ability to detect the minority class. Statistically significant predictors include occupation, monthly income, and ownership of an alternative vehicle. These results indicate that combining LASSO and SMOTE helps in handling imbalanced data and provides quantitative evidence to support urban transport policy planning.
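To illustrate the oversampling step named in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. It is not the study's implementation. The function name `smote`, the choice of `k=5`, the two numeric toy features, and the Gaussian toy data are all assumptions made for illustration; only the 71/29 class split is taken from the abstract. The sketch also assumes numeric features, whereas survey predictors such as occupation would first need a suitable encoding.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper for illustration, not the study's code.)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (assumed): 71 majority and 29 minority points with two features,
# mirroring only the class proportions reported in the abstract.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(71)]
minority = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(29)]

# Generate enough synthetic minority points to balance the two classes.
new_points = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 71 71
```

In a resampling scheme such as RRSV, oversampling would normally be applied to the training split of each repetition only, so that synthetic points never appear in the held-out data used to compute recall and F1-score.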
Copyrights © 2025