INTERNAL (Information System Journal)
Vol. 8 No. 1 (2025)

PHP Code Vulnerability Detection Using TF-IDF, AST Parsing, and Random Forest

Habiby, Mohamad Erdda (Unknown)



Article Info

Publish Date
30 Jun 2025

Abstract

Vulnerabilities in PHP code remain one of the primary threats to web application security, especially when development lacks automated analysis mechanisms. This study presents a hybrid system that detects vulnerabilities in PHP code in real time by combining Term Frequency-Inverse Document Frequency (TF-IDF) features and Abstract Syntax Tree (AST) parsing with the Random Forest classification algorithm. The process begins with preprocessing, cleaning, and tokenization, followed by feature extraction using TF-IDF. The next stage uses AST-based parsing to identify potentially dangerous constructs, such as eval() or include() calls whose arguments derive from user input. The two feature sets are combined and used to train a Random Forest model on a labeled dataset that distinguishes vulnerable from secure code. Reliability testing was conducted on several PHP files from existing applications. The results show that the system identifies vulnerabilities such as File Inclusion and SQL Injection with confidence scores ranging from 52% to 61%, regardless of the structure or complexity of the analyzed code. Although these confidence levels are not yet optimal, the system has demonstrated effectiveness in providing early warnings of potential threats. In the future, the system can be developed further as a simulation tool for static code analysis to support secure coding practices in software development.
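
The feature-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy corpus, the function names, and the smoothed IDF formula are all assumptions made for illustration. The regular expression is only a crude stand-in for real AST parsing: it flags eval()/include()-style calls whose argument mentions a request superglobal. The classifier is left out so that the block needs only the Python standard library.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: (PHP snippet, label), 1 = vulnerable, 0 = secure.
SAMPLES = [
    ('<?php include($_GET["page"]); ?>', 1),
    ('<?php eval($_POST["cmd"]); ?>', 1),
    ('<?php include("header.php"); echo "hi"; ?>', 0),
    ('<?php $stmt = $pdo->prepare("SELECT * FROM t WHERE id = ?"); ?>', 0),
]

# Identifiers and PHP variables become lexical tokens.
TOKEN_RE = re.compile(r"\$?[A-Za-z_][A-Za-z0-9_]*")

# Crude structural rule (a stand-in for an AST visitor, not a parser):
# a risky construct followed, within the same statement, by a superglobal.
RISKY_CALL_RE = re.compile(
    r"\b(eval|include|include_once|require|require_once)\s*\(?[^;]*"
    r"\$_(GET|POST|REQUEST|COOKIE)\b"
)


def tokenize(code):
    """Lowercase the snippet and split it into identifier-like tokens."""
    return TOKEN_RE.findall(code.lower())


def fit_idf(token_lists):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n_docs = len(token_lists)
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    return {
        tok: math.log((1 + n_docs) / (1 + df)) + 1.0
        for tok, df in doc_freq.items()
    }


def tfidf_vector(tokens, idf, vocab):
    """Relative term frequency times IDF, one entry per vocabulary term."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[term] / total * idf[term] for term in vocab]


def structural_flags(code):
    """One binary feature: risky call fed by client-controlled input."""
    return [1.0 if RISKY_CALL_RE.search(code) else 0.0]


def build_features(samples):
    """Concatenate TF-IDF features with structural flags for each sample."""
    token_lists = [tokenize(code) for code, _ in samples]
    idf = fit_idf(token_lists)
    vocab = sorted(idf)
    X = [
        tfidf_vector(tokens, idf, vocab) + structural_flags(code)
        for tokens, (code, _) in zip(token_lists, samples)
    ]
    y = [label for _, label in samples]
    return X, y, vocab


X, y, vocab = build_features(SAMPLES)
```

Each row of X holds the TF-IDF weights followed by the structural flag, and y holds the labels. In a fuller pipeline these could be passed to a Random Forest implementation, for example scikit-learn's RandomForestClassifier. The abstract does not say which library or parser the authors used.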

Copyrights © 2025






Journal Info

Abbrev

internal

Publisher

Subject

Computer Science & IT, Education, Other

Description

INTERNAL (Information System Journal) is a scientific journal published by the Information Systems Study Program, Masoem University. This journal is a forum for publication of scientific papers in the form of writings by academics, researchers and practitioners on pure and applied research in the ...