The evolution of malware (malicious software) is a growing concern, as it now targets not only computers but also other devices such as smartphones. Malware is no longer only monomorphic; it has evolved into polymorphic, metamorphic, and oligomorphic forms. As a result, conventional antivirus software is becoming less effective, because malware can propagate while varying its fingerprint and behavioral patterns. An intelligent, machine learning-based antivirus is therefore needed, one capable of detecting malware by behavior rather than by fingerprint. This research implements a machine learning model for malware detection that combines an ensemble algorithm with feature selection to achieve optimal performance. The ensemble algorithm is Random Forest, which is evaluated and compared against k-Nearest Neighbor and Decision Tree as state-of-the-art methods. To improve processing speed, Information Gain is applied as the feature selection method, retaining 22 features. The best results were obtained with Random Forest combined with Information Gain feature selection, reaching 99.0% for both accuracy and F1-score. Reducing the number of features increased processing speed almost fivefold.
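The abstract does not state how Information Gain is computed, so the following is only a minimal sketch of the usual definition, IG(Y; X) = H(Y) - H(Y | X), under the assumption that features are discrete. The function names (`entropy`, `information_gain`, `select_top_k`) and the toy data are hypothetical and are not taken from the study; in the setting described above, `k` would be 22.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    # Group the labels by the value the feature takes on each sample.
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # H(Y | X): entropy of each group, weighted by the group's share of samples.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(rows, labels, k):
    """Return (indices of the k highest-gain features, gain of every feature)."""
    n_features = len(rows[0])
    gains = [
        information_gain([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    return ranked[:k], gains


# Hypothetical toy data: 3 binary features per sample; label 1 = malware, 0 = benign.
# Feature 0 matches the label exactly, feature 1 is uninformative,
# and feature 2 is only partially informative.
rows = [
    (1, 0, 1),
    (1, 1, 1),
    (0, 0, 0),
    (0, 1, 1),
]
labels = [1, 1, 0, 0]

selected, gains = select_top_k(rows, labels, k=2)
print(selected, [round(g, 4) for g in gains])
```

Only the selected columns would then be passed to the classifier, which is where a speed-up from a smaller feature set would come from. Continuous features would need to be discretized first or scored with a different estimator.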
Copyright © 2024