Detecting malicious JavaScript remains a persistent challenge in cybersecurity, particularly as obfuscation techniques become more sophisticated. This study presents a dual-model detection framework that analyzes obfuscation separately from malicious behavior in order to improve precision. The first model detects obfuscated scripts using 20 features, including entropy, string ratios, and syntax-based measures. The second model classifies malicious code using 92 features, which incorporate the outputs of the first model together with semantically meaningful strings reconstructed by a novel technique called atomic search. Both models use the random forest algorithm and are trained on balanced datasets of labeled JavaScript samples. In the experiments, the obfuscation model achieves 99.1% accuracy and the malicious-code detection model achieves 99.52% accuracy. By addressing obfuscation explicitly and incorporating semantic reconstruction, the proposed approach provides a scalable and effective solution for detecting hidden threats in modern web environments.
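As an illustration of the kind of lexical features the first model relies on, the following minimal sketch computes a character-level Shannon entropy and a string-literal ratio for a JavaScript source string. The function names, the regular expressions, and the exact feature definitions are assumptions made for this example; the abstract does not specify how the 20 features are computed, and atomic search is not reproduced here.

```python
import math
import re
from collections import Counter

# Simplified patterns for quoted JavaScript string literals (assumption:
# template literals and regex literals are ignored in this sketch).
_DOUBLE_QUOTED = r'"(?:\\.|[^"\\])*"'
_SINGLE_QUOTED = r"'(?:\\.|[^'\\])*'"
STRING_LITERAL = re.compile(_DOUBLE_QUOTED + "|" + _SINGLE_QUOTED)


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def string_ratio(source: str) -> float:
    """Fraction of the source occupied by quoted string literals."""
    if not source:
        return 0.0
    in_strings = sum(len(m.group(0)) for m in STRING_LITERAL.finditer(source))
    return in_strings / len(source)


def extract_features(source: str) -> dict:
    """Hypothetical two-feature vector; the study itself uses 20 features."""
    return {
        "entropy": shannon_entropy(source),
        "string_ratio": string_ratio(source),
    }
```

Heavily encoded or packed scripts tend to score differently on such measures than hand-written code, which is why features of this kind are natural inputs to a classifier such as a random forest; the thresholds and feature weights would be learned from labeled data rather than fixed by hand.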
Copyrights © 2025