SEO-Based Blog Content Pipeline Automation: Integrating Web Scraping and Generative AI for Digital Marketing Efficiency
Aris Wahyu Murdiyanto; David Sulistiyantoro; Mukasi Wahyu Kurniawati
APPLIED SCIENCE AND TECHNOLOGY RESEARCH JOURNAL Vol. 5 No. 1 (2026): Applied Science and Technology Research Journal
Publisher: Lembaga Penelitian dan Pengabdian Masyarakat (LPPM) Universitas PGRI Yogyakarta

DOI: 10.31316/astro.v4i2.9454

Abstract

Consistently producing Search Engine Optimization (SEO) content remains a major challenge in digital marketing because manual workflows are inefficient. This study designs, develops, and evaluates the technical feasibility of an end-to-end hybrid content automation pipeline. The proposed system integrates deterministic web scraping (Selenium and BeautifulSoup) for data acquisition, generative AI (OpenAI GPT) for text synthesis and On-Page SEO optimization, the Replicate API for visual asset generation, and the WordPress REST API for autonomous publication. Using a Proof of Concept (PoC) method at Technology Readiness Level (TRL) 3, the system was tested in two scenarios representing different levels of Document Object Model (DOM) structural complexity. On websites with standard HTML structures, the system operated autonomously and cut production time per article by 98.8%, from an estimated 195 minutes to 2.25 minutes. The generated content met the On-Page SEO indicators. However, the evaluation also revealed a technical vulnerability on dynamic websites that use Client-Side Rendering (CSR), where static scraper scripts failed to extract the text payload. The study concludes that integrating generative AI into the production pipeline offers substantial SEO scalability, but that a more adaptive data extraction mechanism is needed for the system to work reliably across websites.
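
The two quantitative points in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard-library html.parser as a stand-in for BeautifulSoup, and the function names, the 50-character threshold, and both sample pages are hypothetical. The first part shows why a static parser gets no text payload from a CSR page whose content is injected by JavaScript. The second part reproduces the reported time reduction from the 195-minute and 2.25-minute figures.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping script, style, and noscript contents."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text found in the raw (unrendered) HTML."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_client_rendered(html, min_chars=50):
    """Heuristic (assumed threshold): very little static text suggests CSR."""
    return len(extract_text(html)) < min_chars


def time_reduction_percent(manual_minutes, automated_minutes):
    """Relative reduction in production time, as a percentage."""
    return (manual_minutes - automated_minutes) / manual_minutes * 100


# Hypothetical sample pages, invented for illustration only.
STATIC_PAGE = (
    "<html><body><article><h1>Ten SEO Tips</h1>"
    "<p>Write descriptive titles and keep paragraphs short for readers.</p>"
    "</article></body></html>"
)
CSR_PAGE = (
    '<html><body><div id="root"></div>'
    '<script src="/bundle.js"></script></body></html>'
)

static_is_csr = looks_client_rendered(STATIC_PAGE)
csr_is_csr = looks_client_rendered(CSR_PAGE)
reduction = time_reduction_percent(195, 2.25)
```

A check of this kind could serve as the trigger for switching from a static parser to a browser-driven fetch on CSR pages, which is one possible reading of the "more adaptive data extraction mechanism" the abstract calls for.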