
Found 3 Documents

Malware Detection Using K-Nearest Neighbor Algorithm and Feature Selection
Supriyanto, Catur; Rafrastara, Fauzi Adi; Amiral, Afinzaki; Amalia, Syafira Rosa; Al Fahreza, Muhammad Daffa; Abdollah, Mohd. Faizal
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 8, No 1 (2024): Januari 2024
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v8i1.6970

Abstract

Malware is one of the biggest threats in today’s digital era. Malware detection is crucial because it protects devices and systems from the dangers posed by malware, such as data loss or damage, data theft, account break-ins, and intruders gaining full access to the system. Because malware has evolved from its traditional form (monomorphic) to modern forms (polymorphic, metamorphic, and oligomorphic), detection systems need to be machine learning-based rather than signature-based. This research addresses malware detection by classifying each file as malware or goodware using a machine learning classification algorithm, k-Nearest Neighbor (kNN). To improve the performance of kNN, the number of features was reduced using the Information Gain and Principal Component Analysis (PCA) feature selection methods, and the performance of kNN with each method was compared. With PCA, where the features were reduced to 32 principal components (PCs), kNN maintained its classification performance with an accuracy of 95.6% and an F1-Score of 95.6%. Using the same number of features as the basis, the Information Gain method ranked the features by Information Gain score and kept the 32 best. With Information Gain, the classification performance of kNN increased to 96.9% for both accuracy and F1-Score.
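
The Information Gain ranking described in this abstract can be illustrated with a minimal sketch. The toy rows, the binary feature encoding, and the function names `entropy`, `information_gain`, and `top_features` are assumptions made for illustration; the abstract does not state the dataset, the discretization, or the tooling the authors used, and it keeps 32 features rather than the single feature kept here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def top_features(rows, labels, n_keep):
    """Rank feature indices by information gain and keep the n_keep best."""
    n_features = len(rows[0])
    scores = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:n_keep], scores

# Hypothetical toy data: three binary features per file.
rows = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0)]
labels = ["malware", "malware", "goodware", "goodware"]

selected, scores = top_features(rows, labels, n_keep=1)
print(selected, scores)
```

In this toy set only the first feature separates the two classes, so it receives the highest score and is the one kept.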

Performance Comparison of k-Nearest Neighbor Algorithm with Various k Values and Distance Metrics for Malware Detection
Rafrastara, Fauzi Adi; Supriyanto, Catur; Amiral, Afinzaki; Amalia, Syafira Rosa; Al Fahreza, Muhammad Daffa; Ahmed, Foez
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 8, No 1 (2024): Januari 2024
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v8i1.6971

Abstract

Malware can evolve and spread very quickly. With these capabilities, malware is a threat to anyone who uses a computer, both offline and online. Research on malware detection therefore remains a hot topic, driven by the need to protect devices and systems from the dangers posed by malware, such as data loss or damage, data theft, account hacking, and intrusion by hackers who can take control of the entire system. Malware has evolved from traditional (monomorphic) to modern forms (polymorphic, metamorphic, and oligomorphic). Conventional antivirus systems cannot detect modern malware effectively, because it changes its fingerprint each time it replicates and propagates. Given this evolution, machine learning-based malware detection systems are needed to replace signature-based ones. Machine learning-based antivirus or malware detection systems detect malware by performing dynamic analysis, rather than the static analysis used by traditional ones. This research discusses malware detection using a machine learning classification algorithm, k-Nearest Neighbor (kNN). To improve the performance of kNN, the number of features is reduced using the Information Gain feature selection method. The performance of kNN with Information Gain is then measured using Accuracy and F1-Score. To obtain the best score, three distance metrics (Euclidean, Manhattan, and Chebyshev) are compared across four values of k (3, 5, 7, and 9). The best result comes from kNN with the Manhattan distance, k = 3, and Information Gain feature selection (reduction to 32 features), which achieves 97.0% for both Accuracy and F1-Score.
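
The comparison of distance metrics and k values described in this abstract can be sketched as a small grid over kNN settings. The toy points, labels, and helper names (`knn_predict`, `accuracy`, `METRICS`) are hypothetical and chosen only to show the three distance definitions and the grid; they do not reproduce the paper's data or results.

```python
from collections import Counter

def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

METRICS = {"euclidean": euclidean, "manhattan": manhattan, "chebyshev": chebyshev}

def knn_predict(train_x, train_y, query, k, distance):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k, distance):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, q, k, distance) == y
               for q, y in zip(test_x, test_y))
    return hits / len(test_y)

# Hypothetical toy data: two well-separated clusters of five points each.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1), (0.2, 0.2),
           (1.0, 1.0), (0.9, 0.8), (0.8, 0.9), (0.9, 0.9), (0.8, 0.8)]
train_y = ["goodware"] * 5 + ["malware"] * 5
test_x = [(0.15, 0.1), (0.85, 0.95)]
test_y = ["goodware", "malware"]

# Evaluate every (metric, k) pair, using the k values named in the abstract.
results = {
    (name, k): accuracy(train_x, train_y, test_x, test_y, k, fn)
    for name, fn in METRICS.items()
    for k in (3, 5, 7, 9)
}
for setting, score in results.items():
    print(setting, score)
```

Odd k values avoid tied votes in a two-class problem, which is a common reason for choosing a set such as 3, 5, 7, and 9.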

The Effect of LAB Color Space with NASNetMobile Fine-tuning on Model Performance for Crowd Detection
Rafid, Muhammad; Luthfiarta, Ardytha; Naufal, Muhammad; Al Fahreza, Muhammad Daffa; Indrawan, Michael
Advance Sustainable Science, Engineering and Technology Vol 6, No 1 (2024): November-January
Publisher : Universitas PGRI Semarang

DOI: 10.26877/asset.v6i1.17821

Abstract

During the COVID-19 pandemic, computer vision played a crucial role in crowd detection, supporting crowd restriction policies to mitigate the spread of the virus. This research analyzes the impact of the RGB and LAB color spaces on the performance of NASNetMobile for crowd detection. The fine-tuning process is evaluated by freezing different proportions of layers in the NASNetMobile base model. Results show that the model with the LAB color space outperforms the model with the RGB color space, with an average accuracy of 94.68% compared to 94.15%. Across all test iterations, the highest performance for NASNetMobile occurred when freezing 10% of the layers from the back, for both the LAB and RGB models, with the LAB color space achieving an accuracy of 95.4% and the RGB color space an accuracy of 95.1%.
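
To make the RGB versus LAB distinction concrete, the following is a minimal sketch of a per-pixel conversion from 8-bit sRGB to CIELAB. It assumes the standard sRGB transfer function and a D65 reference white; the abstract does not say which conversion routine or library the authors used, and the function name `srgb_to_lab` is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (L*, a*, b*), assuming a D65 white point."""
    def linearize(channel):
        # Undo the sRGB gamma encoding to get linear light in [0, 1].
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB to CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # Normalize by the D65 reference white.
    xn, yn, zn = 0.95047, 1.0, 1.08883
    delta = 6 / 29

    def f(t):
        # Cube root above the threshold, linear segment below it.
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16
    a_star = 500 * (fx - fy)
    b_star = 200 * (fy - fz)
    return lightness, a_star, b_star

white = srgb_to_lab(255, 255, 255)
black = srgb_to_lab(0, 0, 0)
red = srgb_to_lab(255, 0, 0)
print(white, black, red)
```

LAB separates lightness (L*) from the two color-opponent channels (a*, b*), whereas RGB mixes brightness into all three channels, which is one plausible reason the two input encodings can lead to different model accuracy.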