Social media teams increasingly rely on early signals to prioritize content, yet forecasting engagement and identifying viral posts remain difficult under temporal drift and heavy-tailed interaction counts. This study evaluated Graph Neural Network (GNN) approaches for predicting post engagement and virality from content-based and contextual features available before posting. It used the Social Media Engagement Report dataset, which contains 100,000 posts across Twitter, LinkedIn, Facebook, and Instagram spanning March 2021–March 2024. Post-release variables (impressions, reach, engagement rate) were excluded to prevent leakage. A homogeneous post–post graph was constructed using k-nearest-neighbor similarity in an embedding space together with exact-match links on low-cardinality contextual attributes. GraphSAGE and GAT were compared against Ridge/Logistic Regression, Random Forest, and XGBoost baselines under a chronological train/validation/test split. Regression was evaluated with MAE, RMSE, and R², and virality classification with ROC-AUC, PR-AUC, and precision among the top 1% of ranked posts (Precision@1%). GraphSAGE yielded the strongest virality screening, achieving ROC-AUC = 0.66, PR-AUC = 0.54–0.56, and Precision@1% up to 0.75, substantially above the non-graph baselines. For regression, GAT produced the lowest errors, but its R² was negative, meaning that it fit the test targets worse than a constant prediction at their mean and so explained little of the variance. Overall, similarity-graph GNNs are most effective for early virality identification, whereas exact count prediction remains challenging in a strictly pre-posting, time-aware setting.
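For readers unfamiliar with the screening metric, the following minimal Python sketch shows one way to compute precision among the top 1% of ranked posts. It is an illustration, not the study's implementation: the function name `precision_at_top_fraction`, the rounding of the cutoff (rounded up, with at least one post kept), and the tie handling (ties keep input order) are all assumptions, since the abstract does not specify them.

```python
import math


def precision_at_top_fraction(scores, labels, fraction=0.01):
    """Share of truly viral posts among the top `fraction` of posts by score.

    `scores` are model outputs (higher = more likely viral) and `labels`
    are 0/1 ground-truth virality flags. Both names are hypothetical.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one post")

    # Assumption: round the cutoff up and always keep at least one post.
    # round() guards against float noise such as 0.07 * 100 = 7.000000000000001.
    k = max(1, math.ceil(round(fraction * len(scores), 9)))

    # Rank post indices by descending score. sorted() is stable, so posts
    # with equal scores keep their input order (an assumed tie rule).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    top = ranked[:k]
    return sum(labels[i] for i in top) / k


# Toy usage with made-up data: 200 posts, so the top 1% is 2 posts.
scores = [i / 200 for i in range(200)]
labels = [0] * 200
labels[199] = 1  # the highest-scored post is viral
labels[150] = 1  # a viral post that falls outside the top 1%
result = precision_at_top_fraction(scores, labels)
print(result)
```

On a test set of 1,000 posts, for instance, this definition would inspect the 10 highest-scored posts, and a value of 0.75 would mean that roughly three in four of them were truly viral.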
Copyrights © 2026