There are three primary approaches to DDoS detection: anomaly-based, pattern-based, and heuristic-based, where the heuristic-based approach combines anomaly- and pattern-based techniques. Existing DDoS detection systems, however, struggle with HTTP payload-level analysis, mainly because of high false positive rates and the insufficient granularity of current datasets. To address this, the study introduces a novel heuristic approach based on a hybrid N-Gram model with two components: CSDPayload+N-Gram and CSPayload+N-Gram. CSDPayload is the distance between a given payload and normal traffic payloads, measured by Chi-Square Distance (CSD); CSPayload is the similarity between them, measured by Cosine Similarity (CS). Together these metrics form a new feature set, which is evaluated on three datasets: CIC2019, MIB2016, and H2N-Payload. The methodology begins by extracting packets and converting TCP/IP traffic, specifically HTTP traffic, into hexadecimal payloads. N-Gram analysis (1-Gram through 6-Gram) is then applied to these payloads. For each N-Gram, frequency counts are computed, followed by the CSD, the CS, and Pearson’s Chi-Square test, which are used to classify payloads as either benign or malicious. Feature selection is then performed using weight correlation, and the selected features are fed into three machine learning classifiers: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Neural Network. Experimental results show high detection accuracy, particularly in the 4-Gram feature category, where average accuracy reaches 99.65% for the Neural Network, 95.14% for KNN, and 99.73% for SVM.
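The feature-extraction steps described above (hexadecimal payload, N-Gram frequency counts, then CSD and CS against normal traffic) can be illustrated with a minimal Python sketch. It is not the study's implementation and rests on several assumptions: N-Grams are taken over whole bytes (two hexadecimal characters each), the normal-traffic profile is the pooled relative frequency of N-Grams across a set of benign payloads, and the Chi-Square Distance uses the common symmetric form 0.5 * sum((p - q)^2 / (p + q)). All function names are hypothetical.

```python
from collections import Counter
import math


def hex_ngrams(payload: bytes, n: int) -> Counter:
    """Count byte-level n-grams, each rendered as a hex string.

    Assumption: one gram unit is one byte (two hex characters).
    """
    hex_str = payload.hex()
    width = 2 * n
    return Counter(
        hex_str[i:i + width]
        for i in range(0, len(hex_str) - width + 1, 2)
    )


def relative_freq(counts: Counter) -> dict:
    """Convert raw n-gram counts to relative frequencies."""
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def normal_profile(payloads, n: int) -> dict:
    """Pool n-gram counts over benign payloads (assumed profile definition)."""
    pooled = Counter()
    for payload in payloads:
        pooled.update(hex_ngrams(payload, n))
    return relative_freq(pooled)


def chi_square_distance(p: dict, q: dict) -> float:
    """Symmetric chi-square distance (assumed form), 0 for identical inputs."""
    total = 0.0
    for key in set(p) | set(q):
        a, b = p.get(key, 0.0), q.get(key, 0.0)
        total += (a - b) ** 2 / (a + b)  # a + b > 0 for every key in the union
    return 0.5 * total


def cosine_similarity(p: dict, q: dict) -> float:
    """Cosine similarity of two sparse frequency vectors."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def payload_features(payload: bytes, profile: dict, n: int):
    """Return (CSD, CS) of one payload against the normal profile."""
    freq = relative_freq(hex_ngrams(payload, n))
    return chi_square_distance(freq, profile), cosine_similarity(freq, profile)


if __name__ == "__main__":
    # Illustrative payloads only; they are not taken from the study's datasets.
    benign = [b"GET /index.html HTTP/1.1\r\n", b"GET /about.html HTTP/1.1\r\n"]
    profile = normal_profile(benign, n=4)
    csd, cs = payload_features(b"GET /index.html HTTP/1.1\r\n", profile, n=4)
    print(f"CSD={csd:.4f} CS={cs:.4f}")
```

In this sketch, a payload close to the normal profile yields a CSD near 0 and a CS near 1, while a payload sharing no N-Grams with the profile yields a CSD of 1 and a CS of 0. The resulting pair per N-Gram size would then be one candidate input to feature selection and the classifiers.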
Copyright © 2025