Habiby, Mohamad Erdda
Unknown Affiliation

PHP Code Vulnerability Detection Using TF-IDF, AST Parsing, and Random Forest (original title: Deteksi Kerentanan Kode PHP Menggunakan TF-IDF, AST Parsing, dan Random Forest)
Habiby, Mohamad Erdda
INTERNAL (Information System Journal) Vol. 8 No. 1 (2025)
Publisher : Masoem University

DOI: 10.32627/internal.v8i1.1411

Abstract

Vulnerabilities in PHP code remain one of the primary threats to web application security, especially when no automated analysis mechanism is in place during development. This study presents a hybrid system that detects vulnerabilities in PHP code in real time by combining Term Frequency-Inverse Document Frequency (TF-IDF) features and Abstract Syntax Tree (AST) parsing with the Random Forest classification algorithm. The process begins with preprocessing, cleaning, and tokenization of the code, followed by feature extraction using TF-IDF. The next stage applies AST-based parsing to identify potentially dangerous syntax, such as calls to eval() or include() whose parameters are derived from user input. The two feature sets are combined and used to train a Random Forest model on a labeled dataset that distinguishes vulnerable code from secure code. Reliability testing was conducted on several PHP files from existing applications. The results show that the system can identify vulnerabilities such as File Inclusion and SQL Injection with confidence scores ranging from 52% to 61%, regardless of the structure or complexity of the code analyzed. Although this confidence level is not yet optimal, the system proved effective at providing early warnings of potential threats. In the future, the system can be developed further as a simulation tool for static code analysis to support more secure coding practices in software development.
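
The feature-combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the toy samples, the function names, and the smoothed TF-IDF variant are assumptions made for illustration, and a simple regular expression stands in for real AST parsing, which would need a PHP parser. The sketch tokenizes each snippet, computes TF-IDF weights, and appends two structural flags marking eval() or include() calls whose argument mentions a user-input superglobal.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; the study's labeled dataset is not reproduced here.
SAMPLES = [
    '<?php include($_GET["page"]); ?>',
    '<?php eval($_POST["cmd"]); ?>',
    '<?php include("header.php"); echo "hello"; ?>',
]

# Identifiers and variables such as $_GET become tokens.
TOKEN_RE = re.compile(r"\$?[A-Za-z_][A-Za-z0-9_]*")

# Assumption: a regex heuristic stands in for an AST check. It flags eval/include
# statements whose argument mentions a request superglobal before the next ';'.
RISKY_CALL_RE = re.compile(
    r"\b(eval|include)\s*\(?[^;]*\$_(GET|POST|REQUEST|COOKIE)\b"
)


def tokenize(code):
    """Split PHP source into lowercase identifier tokens."""
    return [t.lower() for t in TOKEN_RE.findall(code)]


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using tf = count/length and a smoothed idf."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def structural_flags(code):
    """Return [eval_with_user_input, include_with_user_input] as 0.0/1.0."""
    calls = [m.group(1) for m in RISKY_CALL_RE.finditer(code)]
    return [float("eval" in calls), float("include" in calls)]


def combined_features(docs):
    """Concatenate TF-IDF weights with the structural flags for each document."""
    vocab, tfidf = tfidf_vectors(docs)
    return vocab, [vec + structural_flags(doc) for vec, doc in zip(tfidf, docs)]


if __name__ == "__main__":
    vocab, features = combined_features(SAMPLES)
    for code, row in zip(SAMPLES, features):
        print(code, "-> structural flags:", row[-2:])
```

In a full pipeline, the combined vectors and their vulnerable/secure labels would then be passed to a Random Forest classifier, for example scikit-learn's `RandomForestClassifier`. That step is omitted here because it depends on a labeled dataset.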