Digital transformation has improved efficiency, transparency, and accessibility, but it has also been accompanied by a notable increase in cyber incidents, including malware attacks. According to the 2022 annual report of the Honeynet Project by the National Cyber and Encryption Agency, Indonesia experienced over 370 million cyber attacks, 800,000 of which were malware attacks. The increasing complexity of Portable Executable files further complicates accurate classification by machine learning models. This research aims to develop an effective malware detection approach using three machine learning classifiers (Random Forest, XGBoost, and AdaBoost) on a raw feature dataset and an integrated feature dataset. Dimensionality reduction techniques such as Principal Component Analysis and Linear Discriminant Analysis were used to improve classification efficiency. The results show that Random Forest and XGBoost consistently outperformed AdaBoost, particularly in classifying ransomware, achieving recall values ranging from 0.72 to 0.85 and F1-scores from 0.74 to 0.81. For the trojan class, both Random Forest and XGBoost achieved recall values ranging from 0.96 to 0.97, with corresponding F1-scores between 0.95 and 0.97. Both classifiers maintained high precision, recall, and F1-scores across all malware classes, even with reduced feature sets.
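The per-class precision, recall, and F1-scores reported above follow the standard definitions. The snippet below is a minimal sketch of how such scores can be computed from true and predicted labels. The function name `per_class_metrics` and the toy labels are illustrative assumptions and are not taken from the study's data or code.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correctly predicted sample of this class
        else:
            fp[pred] += 1    # sample wrongly assigned to the predicted class
            fn[truth] += 1   # sample of the true class that was missed

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]   # samples predicted as this class
        actual = tp[label] + fn[label]      # samples truly in this class
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Toy labels invented for illustration only (not results from the paper).
y_true = ["ransomware"] * 4 + ["trojan"] * 6
y_pred = ["ransomware"] * 3 + ["trojan"] * 7

metrics = per_class_metrics(y_true, y_pred)
for label, (precision, recall, f1) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case one ransomware sample is misclassified as a trojan, which lowers ransomware recall while leaving its precision untouched. This is why recall and F1 are reported per class rather than as a single accuracy figure.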
Copyrights © 2024