Predicting Social Media Post Engagement and Virality Using Graph Neural Network Approaches and Content-Based Features
Fathimah Az Zahrah; Riska Dhenabayu; Muhammad Fajar Wahyudi Rahman; Renny Sari Dewi; Zamabhungane Hadebe Aminah
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control Vol. 11, No. 3, August 2026 (Article in Progress)
Publisher : Universitas Muhammadiyah Malang

DOI: 10.22219/kinetik.v11i3.2686

Abstract

Social media teams increasingly rely on early signals to prioritize content, yet forecasting engagement and identifying viral posts remain difficult under temporal drift and heavy-tailed interaction counts. This study evaluated Graph Neural Network (GNN) approaches for predicting post engagement and virality from pre-posting content-based and contextual features. It used the Social Media Engagement Report dataset, which contains 100,000 posts across Twitter, LinkedIn, Facebook, and Instagram spanning March 2021–March 2024. Post-release variables (impressions, reach, engagement rate) were excluded to prevent leakage. A homogeneous post–post graph was constructed using k-nearest-neighbor similarity in an embedding space and exact-match links on low-cardinality context. The baselines (Ridge/Logistic Regression, Random Forest, and XGBoost) were compared against GraphSAGE and GAT under a chronological train/validation/test split. Regression was evaluated with MAE, RMSE, and R², while virality classification was evaluated with ROC-AUC, PR-AUC, and precision among the top 1% of ranked posts (Precision@1%). GraphSAGE yielded the strongest virality screening, achieving ROC-AUC = 0.66, PR-AUC = 0.54–0.56, and Precision@1% up to 0.75, substantially above non-graph baselines. For regression, GAT produced the lowest errors despite a negative R², indicating limited explained variance. Overall, similarity-graph GNNs are most effective for early virality identification, whereas exact count prediction remains challenging in a strictly pre-posting, time-aware setting.
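The abstract reports precision among the top 1% of ranked posts but does not show how it was computed. A minimal sketch of that metric follows, assuming higher scores mean "more likely viral" and labels are 0/1. The function name, the rounding of the cutoff, and the toy data are illustrative assumptions, not the authors' code.

```python
def precision_at_top_fraction(scores, labels, fraction=0.01):
    """Share of truly viral posts among the top `fraction` of posts by score.

    Assumption: the cutoff k is floor(n * fraction), with at least one post kept.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one post is required")
    k = max(1, int(len(scores) * fraction))
    # Rank post indices from highest to lowest predicted score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:k]
    return sum(labels[i] for i in top) / k


# Toy example (hypothetical values): 4 posts, keep the top half.
toy_scores = [0.9, 0.1, 0.8, 0.3]
toy_labels = [1, 0, 0, 1]
toy_precision = precision_at_top_fraction(toy_scores, toy_labels, fraction=0.5)
```

In the toy example the two highest-scored posts are the first and third, of which only the first is viral. A ranking metric of this kind rewards a model for placing viral posts at the head of the list even when its overall ROC-AUC is modest.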
Analysis of Cryptocurrency Investment Patterns Using Machine Learning
Farrel Amri Naufal Sandio; Renny Sari Dewi
INOVTEK Polbeng - Seri Informatika Vol. 11 No. 2 (2026): May
Publisher : P3M Politeknik Negeri Bengkalis

DOI: 10.35314/7ny29y07

Abstract

The rapid growth of cryptocurrency, particularly Bitcoin, has introduced high-return investment opportunities accompanied by extreme price volatility, posing challenges for accurate forecasting. Previous studies have applied various machine learning models for Bitcoin price prediction; however, limited attention has been given to how different training data horizons affect model performance and generalisation. This study addresses this gap by comparing three machine learning algorithms: Linear Regression (LR), XGBoost, and Long Short-Term Memory (LSTM). The analysis examines different training periods, with a primary focus on a 3-year training scenario. Historical Bitcoin data (1-minute intervals) from Kaggle was aggregated into daily observations and processed using strict chronological splitting (80:20) without data leakage. Feature engineering was applied using lag-based variables, moving averages, and volatility indicators, whilst LSTM utilised sequence windowing with 30–60 time steps. Empirical results from the 3-year training scenario show that LR and XGBoost achieve strong predictive performance (R² = 0.9757 and 0.9667), whilst LSTM performs moderately (R² = 0.72) with higher prediction errors. Additional exploratory experiments on shorter training horizons (e.g., 6 months) indicate a decline in performance across models, reflected in unstable generalisation and negative R² values on test data, suggesting overfitting. However, directional accuracy remains above 55% in the primary scenario. These findings suggest that model performance is sensitive to the length and stability of historical data. Whilst simpler models such as linear regression and tree-based methods demonstrate consistent performance in the evaluated setting, conclusions regarding model superiority should be interpreted within the scope of the experiment.
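The abstract mentions lag-based features and a strict chronological 80:20 split. A minimal sketch of both steps follows, assuming a plain list of daily closing prices ordered oldest to newest. The function names, the choice of three lags, and the toy price series are illustrative assumptions, not the authors' pipeline.

```python
def make_lag_rows(prices, n_lags=3):
    """Build (lag_features, target) pairs where each target is predicted
    only from the n_lags prices that precede it."""
    rows = []
    for t in range(n_lags, len(prices)):
        rows.append((prices[t - n_lags:t], prices[t]))
    return rows


def chronological_split(rows, train_fraction=0.8):
    """Split without shuffling, so every training row precedes every test row."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Toy daily closes (hypothetical values), oldest first.
toy_prices = [100.0 + day for day in range(13)]
toy_rows = make_lag_rows(toy_prices, n_lags=3)
toy_train, toy_test = chronological_split(toy_rows, train_fraction=0.8)
```

Because the rows are never shuffled, no test-period price can appear as a training target, which is the leakage the chronological split is meant to prevent.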