Estimating software development effort is crucial in project planning and management, especially in resource-constrained environments. This study evaluated four regression models: Random Forest, Support Vector Machine (SVM), Lasso Regression, and Ridge Regression, chosen to represent different approaches: ensemble learning, margin-based learning, and L1- and L2-regularized linear regression. Experiments were conducted on the SEERA (Software Effort Estimation with Real Attributes) dataset, consisting of 99 entries, using a Python pipeline comprising preprocessing, feature selection, Z-score normalization, an 80:20 train-test split, and 5-fold cross-validation. Models were evaluated using MAE, RMSE, and R². Random Forest achieved the best results under both the 80:20 split (R² = 0.740, MAE = 3981.53) and 5-fold cross-validation (R² = 0.715, MAE = 3152.03), while SVM performed the worst, with a negative R², meaning it predicted less accurately than a constant prediction of the mean effort. Lasso and Ridge were competitive only on the 80:20 split and degraded substantially under K-Fold, indicating lower stability. This research contributes an in-depth evaluation on a single dataset and demonstrates that a transparent, K-Fold-based Python pipeline can be replicated to improve estimation accuracy. Future work could apply advanced ensemble methods (e.g., XGBoost) and evaluate them on larger datasets to test whether the results generalize.
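
The three evaluation metrics and the fold layout can be made concrete with a minimal, dependency-free sketch. It is not the study's pipeline: the function names (`mae`, `rmse`, `r2`, `k_fold_indices`) and the toy effort values are assumptions introduced here for illustration, and the folds are contiguous and unshuffled, which may differ from how the study split its data. The example shows how a model that ranks projects in the wrong order produces a negative R², and how 99 entries divide into five folds.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large residuals more than MAE."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Negative whenever the model's squared error exceeds that of
    always predicting the mean of y_true.
    """
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous, unshuffled folds.

    The first n_samples % k folds receive one extra sample.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Toy effort values, illustrative only (NOT taken from SEERA).
y_true = [100.0, 200.0, 300.0, 400.0]
good = [110.0, 190.0, 310.0, 390.0]   # small errors around the truth
bad = [400.0, 300.0, 200.0, 100.0]    # reversed ordering of the projects

print("good:", mae(y_true, good), rmse(y_true, good), r2(y_true, good))
print("bad: ", mae(y_true, bad), rmse(y_true, bad), r2(y_true, bad))

# Test-fold sizes for a 99-entry dataset under 5-fold cross-validation.
fold_sizes = [len(test) for _, test in k_fold_indices(99, k=5)]
print("fold sizes:", fold_sizes)
```

With only about 20 entries per test fold, each fold's score rests on few projects, which is one plausible reason per-fold results can vary more than a single 80:20 split suggests.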
                            
Copyright © 2025