Journal of Applied Data Sciences
Vol 6, No 1: JANUARY 2025

Machine Learning Techniques for Distinguishing Android Malware Variants

Irwansyah, Irwansyah
Kurniawan, Tri Basuki
Dewi, Deshinta Arrova
Zakaria, Mohd Zaki
Azmi, Nurhafifi Binti



Article Info

Publish Date
27 Dec 2024

Abstract

The rapid advancement of portable devices has reshaped usage trends and consumer preferences for electronic devices. Android, the most common mobile operating system, uses a privilege-separated protection system with a complex access control mechanism. Android apps must request permission to access confidential personal data and device resources. However, studies have shown that malicious applications can obtain these permissions by misleading users and then target systems and applications. In this study, we propose a machine-learning approach to classifying Android malware variants by mining requested permissions, real permissions, suspicious calls, and API calls extracted from Android malware applications. Features are chosen with a feature selection method called KBest, which reduces the number of features and improves performance. Two types of Naïve Bayes classifiers used in text classification, the multinomial model and the multivariate Bernoulli model, are applied to malware family classification and compared. Both are evaluated using a confusion matrix on 4022 Android malware applications belonging to 10 families. Experimental findings show that the multinomial model performs reliably across three test experiments, with an average accuracy of 95%.
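The pipeline described in the abstract (keep the best-scoring features, then compare multinomial and Bernoulli Naïve Bayes through a confusion matrix) can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation. The chi-squared scoring function, the Laplace smoothing, the feature names, the two family labels, and the toy counts are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def select_k_best(X, y, k):
    """Return sorted indices of the k features with the highest chi-squared score.
    Chi-squared scoring is an assumption; the abstract only names "KBest"."""
    classes = sorted(set(y))
    class_prob = {c: y.count(c) / len(y) for c in classes}
    scores = []
    for j in range(len(X[0])):
        total = sum(x[j] for x in X)
        score = 0.0
        for c in classes:
            observed = sum(x[j] for x, label in zip(X, y) if label == c)
            expected = class_prob[c] * total
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def train_nb(X, y, kind, alpha=1.0):
    """Fit a Naive Bayes model with Laplace smoothing (alpha is an assumed default)."""
    n_features = len(X[0])
    model = {"kind": kind, "prior": {}, "logp": {}, "log1mp": {}}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        model["prior"][c] = math.log(len(rows) / len(X))
        if kind == "multinomial":
            # Uses how often each feature occurs within the class.
            totals = [sum(r[j] for r in rows) for j in range(n_features)]
            denom = sum(totals) + alpha * n_features
            model["logp"][c] = [math.log((t + alpha) / denom) for t in totals]
        else:
            # Bernoulli: uses only whether each feature is present or absent.
            present = [sum(1 for r in rows if r[j] > 0) for j in range(n_features)]
            p = [(n + alpha) / (len(rows) + 2 * alpha) for n in present]
            model["logp"][c] = [math.log(v) for v in p]
            model["log1mp"][c] = [math.log(1 - v) for v in p]
    return model

def predict(model, x):
    """Return the class with the highest log-posterior for one feature vector."""
    best, best_score = None, -math.inf
    for c, score in model["prior"].items():
        for j, v in enumerate(x):
            if model["kind"] == "multinomial":
                score += v * model["logp"][c][j]
            else:
                score += model["logp"][c][j] if v > 0 else model["log1mp"][c][j]
        if score > best_score:
            best, best_score = c, score
    return best

def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Hypothetical toy data: per-app counts of four illustrative features.
feature_names = ["SEND_SMS", "READ_CONTACTS", "INTERNET", "getDeviceId"]
X = [[3, 0, 1, 0], [2, 0, 1, 1], [4, 1, 0, 0],
     [0, 2, 1, 3], [0, 3, 1, 2], [1, 2, 0, 4]]
y = ["FamilyA", "FamilyA", "FamilyA", "FamilyB", "FamilyB", "FamilyB"]
classes = sorted(set(y))

selected = select_k_best(X, y, k=3)
X_reduced = [[x[j] for j in selected] for x in X]

# Evaluated on the training data only to keep the sketch short.
multinomial = train_nb(X_reduced, y, "multinomial")
bernoulli = train_nb(X_reduced, y, "bernoulli")
cm_multinomial = confusion_matrix(y, [predict(multinomial, x) for x in X_reduced], classes)
cm_bernoulli = confusion_matrix(y, [predict(bernoulli, x) for x in X_reduced], classes)
print([feature_names[j] for j in selected], cm_multinomial, cm_bernoulli)
```

The two models differ only in what they read from each feature: the multinomial model weighs how many times a permission or call appears, while the Bernoulli model only records presence or absence. A real evaluation would score held-out applications rather than the training set.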

Copyrights © 2025






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management
