Cyberbullying on social media platforms has become widespread. It takes many forms, including hate speech, trolling, adult content, racism, harassment, and rants. One platform with many cyberbullies is Twitter, since renamed 'X'. The anonymity of 'X' lets users around the world engage in cyberbullying, because they can share their thoughts freely without being held accountable under their real identity. This research explores how IndoBERT's semantic features affect hybrid deep learning models for detecting cyberbullying on 'X', in combination with TF-IDF feature extraction and FastText feature expansion to improve text classification performance. The study uses a dataset of 30,084 tweets and a hybrid deep learning approach that combines CNN and LSTM. In the IndoBERT scenario, IndoBERT features were first combined with TF-IDF features, then expanded using FastText, and finally fed to the hybrid deep learning model. The highest accuracy obtained by each model was: CNN (80.69%), LSTM (80.67%), CNN-LSTM (81.18%), and CNN-LSTM-IndoBERT (82.05%). This research contributes to informatics by integrating hybrid deep learning (CNN-LSTM) with IndoBERT and TF-IDF and showing that the combination improves cyberbullying detection in Indonesian text. Future research can explore other transformer-based models, such as RoBERTa or ALBERT, to enhance contextual understanding in cyberbullying classification.
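The feature pipeline summarized above (TF-IDF extraction, similarity-based feature expansion, and combination with semantic features) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's implementation: the smoothed IDF formula is one common variant, the `similar` table is a hand-written stand-in for FastText nearest neighbours, the three-number `semantic_vec` is a placeholder for an IndoBERT sentence embedding, and the expansion rule (a zero-weighted term borrows the weight of a similar term present in the tweet) is one plausible reading of "feature expansion". For simplicity the sketch expands the TF-IDF part before concatenating, which may differ from the order used in the study.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        rows.append([
            (tf[t] / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1) if toks else 0.0
            for t in vocab
        ])
    return vocab, rows


def expand_features(vocab, rows, similar):
    """Assumed expansion rule: a term with zero weight borrows the weight
    of the first similar term that actually occurs in the document."""
    index = {t: j for j, t in enumerate(vocab)}
    expanded = []
    for row in rows:
        new_row = list(row)
        for j, term in enumerate(vocab):
            if row[j] == 0.0:
                for s in similar.get(term, []):
                    k = index.get(s)
                    if k is not None and row[k] > 0.0:
                        new_row[j] = row[k]
                        break
        expanded.append(new_row)
    return expanded


def combine(lexical_row, semantic_vec):
    """Concatenate lexical (TF-IDF) and semantic features into one vector."""
    return list(lexical_row) + list(semantic_vec)


# Toy data: two short Indonesian tweets and a hypothetical similarity table.
docs = ["kamu bodoh sekali", "dia tolol banget"]
similar = {"bodoh": ["tolol"], "tolol": ["bodoh"]}  # stand-in for FastText neighbours

vocab, rows = tfidf_matrix(docs)
expanded = expand_features(vocab, rows, similar)

semantic_vec = [0.1, -0.3, 0.7]  # placeholder for an IndoBERT embedding
features = combine(expanded[0], semantic_vec)
```

In this toy example the first tweet never contains "tolol", so its TF-IDF weight for that term starts at zero; after expansion it carries the weight of "bodoh", which reduces sparsity for short texts. The resulting `features` vector is the kind of input a downstream CNN-LSTM classifier would consume.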
Copyrights © 2025