Shaadan, Norshahida
Unknown Affiliation

Published: 2 documents


Visualization Tools for Backward Elimination Technique in Multiple Regression Time Series Modelling of CO2 Emissions in Malaysia
Mansor, Mahayaudin M.; Ibrahim, Nurain; Zakaria, Roslinazairimah; Suhaila, Jamaludin; Miswan, Nor Hamizah; Shaadan, Norshahida
JOIV : International Journal on Informatics Visualization Vol 9, No 4 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.4.3012

Abstract

Understanding multiple regression time series modelling is crucial because the procedures involve intricate statistical methods. This study incorporates a flowchart that clearly illustrates the steps for modelling a response variable affected by several explanatory variables via the backward elimination technique. The first objective of this study is to utilise ten graphical tools, comprising charts and tables, for visual assessment to support formal evaluations in model diagnostics using R programming. The aim is to provide comprehensive insights and improve the overall understanding of the modelling procedures. The visualisation tools cover criteria for multicollinearity and goodness-of-fit, as well as the underlying assumptions of normality, homoscedasticity, zero serial correlation, and volatility in the residuals. The second objective is to implement the modelling procedures to obtain a well-specified model in a real-world context, demonstrating their practical value and implications. In this instance, the selected response variable is carbon dioxide (CO2) emissions, which contribute significantly to global warming. In Malaysia, CO2 emissions increased continuously from 1990 to 2022, with an alarming average annual growth rate of 4.9%. The visual diagnostics helped guide the elimination of some explanatory variables from the initial model and the refinement of subsequent models, resulting in a well-specified final model that is parsimonious and explains 98.6% of the variability in CO2 emissions. The final model suggests that high fossil fuel use and GDP per capita are contributing factors to increased CO2 emissions in Malaysia. The study recommends government action and investment in renewable energy to reduce CO2 emissions by 45% by 2030 and achieve net-zero emissions by 2050.
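
The backward elimination idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the study used R and guided elimination with visual diagnostics, whereas the sketch below is plain Python and assumes the Akaike information criterion (AIC) as the dropping rule. The function names, the synthetic data, and the variable names `x1`, `x2`, and `noise` are all hypothetical, and the sketch ignores the serial correlation checks a time series model would need.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_aic(columns, y):
    """Fit y on an intercept plus the given columns and return the AIC."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design))
           for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    rss = sum(
        (y[i] - sum(b * v for b, v in zip(beta, design[i]))) ** 2
        for i in range(n)
    )
    # Gaussian AIC up to an additive constant.
    return n * math.log(rss / n) + 2 * k


def backward_eliminate(data, y):
    """Start from the full model; drop one predictor at a time while AIC falls."""
    kept = list(data)
    best = ols_aic([data[name] for name in kept], y)
    while len(kept) > 1:
        trials = {
            name: ols_aic([data[other] for other in kept if other != name], y)
            for name in kept
        }
        candidate = min(trials, key=trials.get)
        if trials[candidate] >= best:
            break  # no single removal improves the criterion
        kept.remove(candidate)
        best = trials[candidate]
    return kept


# Synthetic illustration only: y depends on x1 and x2, not on noise.
random.seed(0)
n_obs = 60
x1 = [random.gauss(0, 1) for _ in range(n_obs)]
x2 = [random.gauss(0, 1) for _ in range(n_obs)]
noise = [random.gauss(0, 1) for _ in range(n_obs)]
y = [3.0 * a + 2.0 * b + random.gauss(0, 0.5) for a, b in zip(x1, x2)]

kept = backward_eliminate({"x1": x1, "x2": x2, "noise": noise}, y)
print(kept)
```

The strong predictors `x1` and `x2` always survive because removing either one raises the residual sum of squares sharply. Whether `noise` is dropped depends on the sample, which is one reason a study might pair an automatic criterion with diagnostic plots.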
Comparison of ensemble hybrid sampling with bagging and boosting machine learning approach for imbalanced data
Malek, Nur Hanisah Abdul; Yaacob, Wan Fairos Wan; Wah, Yap Bee; Md Nasir, Syerina Azlin; Shaadan, Norshahida; Indratno, Sapto Wahyu
Indonesian Journal of Electrical Engineering and Computer Science Vol 29, No 1: January 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v29.i1.pp598-608

Abstract

Training on an imbalanced dataset can cause classifiers to overfit the majority class and increase the possibility of information loss for the minority class. Moreover, accuracy may not give a clear picture of the classifier’s performance. This paper utilized decision tree (DT), support vector machine (SVM), artificial neural network (ANN), K-nearest neighbors (KNN) and Naïve Bayes (NB) classifiers, alongside ensemble models such as random forest (RF) and gradient boosting (GB), which use bagging and boosting methods, together with three sampling approaches and seven performance metrics, to investigate the effect of class imbalance on water quality data. Based on the results, the best model was gradient boosting without resampling for almost all metrics except balanced accuracy, sensitivity and area under the curve (AUC), followed by the random forest model without resampling in terms of specificity, precision and AUC. However, in terms of balanced accuracy and sensitivity, the highest performance was achieved by random forest with a random under-sampled dataset. Focusing on each performance metric separately, the results showed that for specificity and precision, it is better not to resample the data for any of the ensemble classifiers. Nevertheless, the results for balanced accuracy and sensitivity showed improvement for both ensemble classifiers when using any of the resampled datasets.
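
Two ideas from this abstract, random under-sampling and the gap between accuracy and balanced accuracy, can be shown with a small sketch. It is a plain-Python illustration under stated assumptions, not the paper's pipeline: the helper names `random_undersample` and `binary_metrics` are hypothetical, the 10-to-90 class split is made-up toy data, and the "classifier" simply predicts the majority class every time.

```python
import random


def random_undersample(features, labels, seed=0):
    """Randomly drop rows from larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    target = min(len(rows) for rows in by_class.values())
    chosen = []
    for rows in by_class.values():
        chosen.extend(rng.sample(rows, target))
    chosen.sort()  # keep the original row order among the survivors
    return [features[i] for i in chosen], [labels[i] for i in chosen]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity, precision and balanced accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    def ratio(num, den):
        # Report 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": ratio(tp, tp + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Toy data: 10 minority (label 1) rows and 90 majority (label 0) rows.
labels = [1] * 10 + [0] * 90
features = [[float(i)] for i in range(100)]

# A degenerate model that always predicts the majority class.
always_majority = [0] * 100
metrics = binary_metrics(labels, always_majority)
print(metrics)

balanced_x, balanced_y = random_undersample(features, labels)
print(len(balanced_y), balanced_y.count(1), balanced_y.count(0))
```

On this toy split the majority-only predictor scores high on accuracy while its sensitivity is zero, so balanced accuracy exposes the failure that accuracy hides. Under-sampling equalizes the class counts at the cost of discarding majority rows, which is consistent with the trade-off the abstract reports between sensitivity and specificity.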