Garuda - Garba Rujukan Digital

Article Per Year (5 Year)

p-Index From 2021 - 2026

0.23

P-Index

This Author published in this journals

All Journal Building of Informatics, Technology and Science

Usmany, Haidar Nafiis

Unknown Affiliation

Author-ID : 9792659

Computer Science & IT

Published : 1 Documents Claim Missing Document

Claim Missing Document

Articles

Optimasi Deteksi Malware Android pada Dataset Drebin Menggunakan Ensemble Learning Usmany, Haidar Nafiis; Ghozi, Wildanil
Building of Informatics, Technology and Science (BITS) Vol 7 No 4 (2026): March 2026
Publisher : Forum Kerjasama Pendidikan Tinggi

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.47065/bits.v7i4.9443

The increasing number and complexity of Android malware require detection systems that are accurate, efficient, and capable of handling high-dimensional data. Machine learning–based approaches have become one of the widely adopted solutions in cybersecurity research. However, the performance of classification models is often affected by feature redundancy and suboptimal hyperparameter configurations. This study aims to evaluate the effectiveness of combining Random Forest–based feature selection with modern boosting classification algorithms for Android malware detection. The dataset used in this study is the Drebin 215 dataset, which was selected because it is one of the most widely used benchmark datasets for Android malware detection based on static analysis, enabling more objective comparison with previous studies. Feature selection was performed using the Random Forest feature importance method to reduce data dimensionality prior to the classification stage. The classification models employed include XGBoost, Light Gradient Boosting Machine (LightGBM), and CatBoost. The experiments were conducted under two scenarios: without hyperparameter optimization (non-tuning) and with hyperparameter optimization using the Grid Search method. Model performance was evaluated using accuracy, precision, recall, F1-score, and ROC-AUC metrics, as well as computational time analysis. The experimental results show that all models achieved very strong classification performance on the Drebin benchmark dataset, with accuracy values exceeding 0.98. Among the evaluated models, LightGBM achieved the best performance, with an accuracy of 0.9900 and an F1-score of 0.9865. This performance advantage is likely influenced by the efficiency of its histogram-based learning mechanism and leaf-wise tree growth strategy, which enables faster and more effective learning on high-dimensional data. Nevertheless, the high performance observed on this benchmark dataset still requires further evaluation on more diverse datasets or dynamic environments to ensure the generalization capability of the model in real-world scenarios. The findings of this study indicate that the combination of Random Forest–based feature selection and boosting algorithms can serve as an effective approach for improving the efficiency and performance of Android malware detection systems.

Co-Authors Wildanil Ghozi

Title

Found 1 Documents
Search

Abstract

Title Search

Found 1 Documents Search

Abstract

Title

Found 1 Documents
Search