Nurul Hidayat
Department of Informatics, Universitas Jenderal Soedirman, Indonesia

Published: 1 document

Articles
RoBERTa with Sample Reweighting and Temperature Scaling for Imbalanced Toxicity Detection: A Performance–Fairness–Calibration Study Lasmedi Afuan; Nurul Hidayat; Abdul Karim
International Journal of Machine Learning (IJOML) Vol. 1 No. 1 (2026): IJOML Volume 1, Number 1, June 2026
Publisher : APJIKOM

DOI: 10.66472/ijoml.v1i1.3

Abstract

Detecting toxic language at scale requires models that are not only accurate but also robust to demographic subgroup bias and reliable in their probability estimates; however, these objectives can conflict, especially under severe class imbalance. This study investigates the performance–fairness–calibration interplay in toxicity detection using the Jigsaw Unintended Bias dataset (124,858 comments; 5.99% toxic; identity annotations in 9.39% of samples). We aim to quantify how sample reweighting and imbalance-aware training affect global discrimination, worst-subgroup behavior, and probabilistic calibration, and to assess post-hoc temperature scaling of predicted probabilities. We compare a TF-IDF + logistic regression baseline against RoBERTa variants trained without mitigation, with sample reweighting, and with an imbalance-oriented loss, using multi-metric evaluation (AUC, worst-subgroup AUC, ECE, and NLL). RoBERTa consistently improves global AUC over the baseline (≈0.96 vs 0.9155), while worst-subgroup AUC remains substantially lower and varies only modestly across RoBERTa variants (≈0.7726–0.7813). Calibration results indicate a marked gap between models: the baseline achieves the lowest ECE (0.0052), whereas RoBERTa exhibits higher ECE (≈0.0257) that increases further under reweighting and imbalance-oriented training (≈0.0490–0.0866), with NLL not improving consistently. These findings contribute empirical evidence that fairness-oriented interventions can shift error and calibration profiles, motivating holistic evaluation and methods that jointly constrain subgroup fairness and probabilistic reliability.
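The two calibration components named in the abstract, post-hoc temperature scaling and expected calibration error (ECE), can be sketched concisely. The snippet below is a minimal illustration, not the paper's implementation: it assumes binary logits, fits the temperature by a simple grid search over validation NLL (the paper does not specify its optimizer), and uses the common binary-probability variant of ECE with equal-width bins; all function names and the bin count are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_nll(logits, labels, T):
    """Negative log-likelihood of binary labels under temperature-scaled logits."""
    p = sigmoid(logits / T)
    eps = 1e-12  # guard against log(0)
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

def fit_temperature(val_logits, val_labels, grid=np.linspace(0.5, 5.0, 91)):
    """Temperature scaling: pick the single scalar T > 0 that minimizes
    validation NLL. A grid search stands in for gradient-based fitting."""
    losses = [binary_nll(val_logits, val_labels, T) for T in grid]
    return float(grid[int(np.argmin(losses))])

def expected_calibration_error(probs, labels, n_bins=15):
    """Binary ECE: bin predicted probabilities, then average the gap between
    mean predicted probability and empirical positive rate, weighted by bin size."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs > lo) & (probs <= hi)
        if not mask.any():
            continue
        ece += mask.mean() * abs(probs[mask].mean() - labels[mask].mean())
    return ece
```

Note that temperature scaling rescales all logits by one scalar, so it cannot change the ranking of predictions: AUC (global or per-subgroup) is unaffected, which is why it targets only the calibration axis of the study's three-way trade-off.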