bit-Tech
Vol. 8 No. 2 (2025): bit-Tech

Black Hat SEO Detection Using Ensemble Learning and Multi-Dimensional Web Content Analysis

Akhmad Zaqi Riyadi (Universitas Teknologi Yogyakarta)
Sri Wulandari (Universitas Teknologi Yogyakarta)



Article Info

Publish Date
10 Dec 2025

Abstract

The integrity of search engines is significantly threatened by manipulative Black Hat SEO (BSEO) tactics, particularly the hidden injection of illicit content such as online gambling. This issue is particularly urgent in Indonesia, where attackers frequently compromise government domains (.go.id). By September 2023, over 9,000 such sites had been infiltrated using stealthy defacement and semantic confusion, highlighting a gap in existing detection systems that rely on single-dimensional features or ignore real-world class imbalance. To address this, we propose an ensemble-learning-based detection system that combines Random Forest (RF) and Support Vector Machine (SVM) classifiers, supported by multi-dimensional feature engineering from URLs, meta-tags, hidden CSS/HTML elements, and high-risk keywords (e.g., “slot”, “judi”). Our manually annotated dataset comprises 582 .go.id URLs with a natural 4:1 class imbalance, which is mitigated via Random Oversampling during training. Evaluation on a balanced test set (146 samples) shows 93.8% ensemble accuracy, 99.6% AUC-ROC, and, most critically, 100% recall for the Black Hat class, i.e., no false negatives on the test set. The system also incorporates an internal “override logic” that flags evasion tactics such as cloaking or hidden keyword injection, enhancing interpretability. Unlike deep learning alternatives, which require large datasets and substantial computational resources, our approach balances performance, efficiency, and transparency, making it suitable for deployment by national cybersecurity agencies. This work advances both academic research and practical defense capabilities against sophisticated BSEO threats targeting public-sector websites.
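To make the multi-dimensional feature engineering and the Random Oversampling step more concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation. The function names (extract_features, random_oversample), the feature names, the hidden-style patterns, and the keyword list beyond “slot” and “judi” are all assumptions made for illustration.

```python
import random
import re
from urllib.parse import urlparse

# The abstract names only "slot" and "judi"; a real list would be longer.
HIGH_RISK_KEYWORDS = ("slot", "judi")

# Inline styles often used to hide injected content (an assumption,
# not the feature set used in the paper).
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0",
    re.IGNORECASE,
)
META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)


def count_keywords(text):
    """Count case-insensitive occurrences of the high-risk keywords."""
    lowered = text.lower()
    return sum(lowered.count(keyword) for keyword in HIGH_RISK_KEYWORDS)


def extract_features(url, html):
    """Build one feature dictionary from a URL and its raw HTML source."""
    parsed = urlparse(url)
    meta_text = " ".join(META_TAG.findall(html))
    return {
        "url_length": len(url),
        "url_path_depth": len([p for p in parsed.path.split("/") if p]),
        "url_keyword_hits": count_keywords(url),
        "meta_keyword_hits": count_keywords(meta_text),
        "hidden_style_count": len(HIDDEN_STYLE.findall(html)),
        # Keyword hits across the full HTML source, meta tags included.
        "html_keyword_hits": count_keywords(html),
    }


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until classes are equal."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels
```

In a pipeline of this kind, oversampling would be applied to the training split only, as the abstract states. The resulting feature dictionaries could then be vectorized and passed to RF and SVM classifiers, for example scikit-learn's RandomForestClassifier and SVC. The abstract does not specify which library was used or how the two models' outputs are combined.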

Copyrights © 2025






Journal Info

Abbrev

bt

Subject

Computer Science & IT

Description

The bit-Tech journal was developed to accommodate the scientific work of lecturers and students, including both scientific papers and research in the form of literature studies. It is hoped that this journal will increase the knowledge and exchange of scientific ...