This study applies static analysis to Android applications to detect malware using opcode and application permission features. The dataset comprises 1,000 applications: 500 benign and 500 malicious. Opcode features were extracted from each application's classes.dex file and represented as numerical vectors using the Term Frequency–Inverse Document Frequency (TF-IDF) method; 147 unique opcodes were identified. Permission features were extracted from the AndroidManifest.xml file, yielding 65 features. The two feature sets were combined into a single dataset and used as input to two classifiers: Random Forest and, for comparison, Support Vector Machine (SVM). Model performance was evaluated using accuracy, precision, recall, and F1-score. On the test data, Random Forest achieved the highest accuracy at 99%, followed by SVM at 98%. These results indicate that combining opcode and permission features with a Random Forest classifier is effective at distinguishing benign from malicious applications through static analysis. The TF-IDF-based classification system developed in this study can therefore serve as an initial approach to Android malware detection using static analysis.
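The feature construction and evaluation metrics described above can be illustrated with a minimal sketch. It is not the implementation used in the study: the toy applications, the helper names (tfidf_vectors, permission_vector, classification_metrics), and the specific TF-IDF variant (relative term frequency multiplied by the natural log of N/df, without smoothing) are all assumptions made for illustration, and the classifier training step is omitted.

```python
import math
from collections import Counter

def tfidf_vectors(docs, vocabulary):
    """Return one TF-IDF row per document (list of opcode tokens).

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many apps each opcode occurs at least once.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    rows = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty apps
        row = []
        for op in vocabulary:
            tf = counts[op] / total
            idf = math.log(n_docs / df[op]) if df[op] else 0.0
            row.append(tf * idf)
        rows.append(row)
    return rows

def permission_vector(requested, known_permissions):
    """Binary vector: 1.0 if the app requests the permission, else 0.0."""
    return [1.0 if p in requested else 0.0 for p in known_permissions]

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical toy data standing in for opcodes parsed from classes.dex
# and permissions parsed from AndroidManifest.xml (label 1 = malware).
apps = [
    {"opcodes": ["invoke-virtual", "move-result", "return-void"],
     "permissions": {"android.permission.INTERNET"},
     "label": 0},
    {"opcodes": ["const-string", "invoke-static",
                 "invoke-virtual", "invoke-virtual"],
     "permissions": {"android.permission.INTERNET",
                     "android.permission.SEND_SMS"},
     "label": 1},
]

vocabulary = sorted({op for app in apps for op in app["opcodes"]})
known_permissions = sorted({p for app in apps for p in app["permissions"]})

opcode_rows = tfidf_vectors([app["opcodes"] for app in apps], vocabulary)
# Combine both feature types by concatenating them per application.
features = [
    row + permission_vector(app["permissions"], known_permissions)
    for row, app in zip(opcode_rows, apps)
]
labels = [app["label"] for app in apps]

# Metrics on a made-up set of predictions, only to show the calculation.
example_metrics = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

In this sketch each row of features holds the TF-IDF opcode weights followed by the binary permission flags, so with the counts reported in the study a row would have 147 + 65 = 212 columns. Such rows, together with labels, are the kind of input a Random Forest or SVM implementation would be trained on.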
Copyrights © 2025