The Indonesian Journal of Computer Science
Vol. 14 No. 3 (2025): The Indonesian Journal of Computer Science

Causal-Aware Classification of Social Media Hate Speech: Enhancing Robustness and Fairness with BERT

Rasul, Pshko (Unknown)



Article Info

Publish Date
29 Jun 2025

Abstract

Social media platforms face increasing challenges in moderating hate speech effectively. While deep learning models such as BERT have advanced detection performance, they often rely on spurious correlations and may exhibit bias against marginalized communities. This paper proposes a causal-aware classification framework that integrates causal inference techniques with BERT fine-tuning to improve robustness and fairness in hate speech detection. Using the HateXplain dataset, which includes labeled social media posts and annotator rationales, we construct a causal graph that identifies potential confounders. Our model incorporates backdoor adjustment and invariant risk minimization (IRM) during training. Experiments demonstrate improved accuracy under distribution shift and reduced demographic bias compared with baseline models.
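
The abstract names backdoor adjustment but does not specify the causal graph or the confounders used. As a rough illustration only, the sketch below assumes a single discrete confounder Z (for example, a hypothetical indicator of the community context a post comes from), a binary feature X (whether a particular token appears), and a binary label Y (annotated as hateful). It estimates P(Y=1 | do(X=x)) = Σ_z P(Y=1 | X=x, Z=z) · P(Z=z) from counts. The function names and the toy counts are invented for this example; they are not taken from HateXplain or from the paper.

```python
from collections import Counter

def expand(x, z, n_total, n_pos):
    """Build n_total toy records (x, z, y), of which n_pos have y = 1."""
    return [(x, z, 1)] * n_pos + [(x, z, 0)] * (n_total - n_pos)

def naive_conditional(records, x_value):
    """Plain conditional P(Y=1 | X=x_value), ignoring the confounder."""
    ys = [y for x, _, y in records if x == x_value]
    return sum(ys) / len(ys)

def backdoor_adjusted(records, x_value):
    """Backdoor estimate: sum over z of P(Y=1 | X=x_value, Z=z) * P(Z=z)."""
    n = len(records)
    z_counts = Counter(z for _, z, _ in records)
    total = 0.0
    for z, z_n in z_counts.items():
        stratum = [y for x, zz, y in records if x == x_value and zz == z]
        if not stratum:
            # Positivity is violated: no data for this (x, z) combination.
            raise ValueError(f"no records with X={x_value}, Z={z}")
        total += (sum(stratum) / len(stratum)) * (z_n / n)
    return total

# Made-up counts (hypothetical): Z=1 makes both the token and the label more likely.
records = (
    expand(x=0, z=0, n_total=40, n_pos=4)
    + expand(x=1, z=0, n_total=10, n_pos=2)
    + expand(x=0, z=1, n_total=10, n_pos=5)
    + expand(x=1, z=1, n_total=40, n_pos=24)
)

naive_gap = naive_conditional(records, 1) - naive_conditional(records, 0)
adjusted_gap = backdoor_adjusted(records, 1) - backdoor_adjusted(records, 0)
print(f"naive gap:    {naive_gap:.2f}")
print(f"adjusted gap: {adjusted_gap:.2f}")
```

On these toy counts the naive gap between posts with and without the token is much larger than the adjusted gap, which is the kind of spurious correlation the abstract says plain classifiers may pick up. How the adjustment is combined with BERT fine-tuning and IRM in the paper itself is not stated in the abstract.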

Copyrights © 2025

Journal Info

Abbrev

ijcs

Publisher

Subject

Computer Science & IT; Electrical & Electronics Engineering; Engineering

Description

The Indonesian Journal of Computer Science (IJCS) is a bimonthly peer-reviewed journal published by AI Society and STMIK Indonesia. Issues are published at the end of February, April, June, August, October, and December. The scope of IJCS includes general computer science, information ...