This study investigates the detection of DNS over HTTPS (DoH) spoofing attacks using the CIRA-CIC-DoHBrw-2020 dataset, which contains over 100,000 labeled DNS records classified as either normal or malicious. Detection relies on features such as packet timing, packet size, and TLS parameters. Features are selected systematically by applying the Elbow and Kneedle methods to F-score values produced by the model's built-in evaluation, so that the cutoff for the top-ranked features is chosen by a quantitative criterion rather than by hand. The model is trained on the five most significant features, with a training time of 0.5727 seconds, an inference time of 0.0157 seconds, and an inference latency of 0.0035 milliseconds per sample. It achieves an accuracy of 0.9995, an F1-score of 0.9995, and an AUC-ROC of 1.0000. On a test set of 14,974 samples, the classification report shows precision, recall, and F1-score of 1.00 for both the normal and malicious classes. The Elbow plot confirms the number of features selected, the SHAP beeswarm plot shows how each selected feature contributes to the model’s predictions and thereby supports interpretability, and the confusion matrix shows that nearly all samples were classified correctly. The results indicate that the proposed methodology detects DoH spoofing effectively with a compact feature set, offering a promising approach to securing DNS over HTTPS communications.
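As an illustration of the feature-selection step, the following minimal sketch locates the knee of a descending importance curve and keeps the features ranked before it. It is a simplified Kneedle-style rule (the point lying farthest below the chord of the normalized curve), not the full Kneedle algorithm with smoothing and thresholding, and it is not the implementation used in this study. The function names, the feature names, and the scores are hypothetical and serve only to show the mechanics.

```python
def knee_index(scores):
    """Return the 0-based knee position in the scores sorted descending.

    Simplified Kneedle-style rule (an assumption for illustration):
    normalize rank and score to [0, 1] and pick the point that lies
    farthest below the chord joining the first and last points.
    """
    ys = sorted(scores, reverse=True)
    n = len(ys)
    if n < 3:
        return n  # too few points to define a knee: keep everything
    y_max, y_min = ys[0], ys[-1]
    if y_max == y_min:
        return n  # flat curve: no knee, keep everything
    best_i, best_gap = 0, float("-inf")
    for i, y in enumerate(ys):
        x_norm = i / (n - 1)
        y_norm = (y - y_min) / (y_max - y_min)
        # The chord runs from (0, 1) to (1, 0), i.e. y = 1 - x.
        gap = (1.0 - x_norm) - y_norm
        if gap > best_gap:  # strict comparison keeps the earliest maximum
            best_i, best_gap = i, gap
    return best_i


def select_top_features(importance):
    """Keep the features ranked before the knee of the importance curve."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    k = knee_index([score for _, score in ranked])
    return [name for name, _ in ranked[:k]]


# Hypothetical importance scores; these are NOT values from the study.
demo_scores = {
    "feat_a": 100, "feat_b": 90, "feat_c": 85, "feat_d": 80, "feat_e": 75,
    "feat_f": 10, "feat_g": 8, "feat_h": 6, "feat_i": 5, "feat_j": 4,
}
selected = select_top_features(demo_scores)
```

With these made-up scores the curve drops sharply after the fifth feature, so the rule keeps the first five. Cutting strictly before the knee is a design choice of this sketch; an implementation could equally include the knee point itself.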
Copyrights © 2025