Simanjuntak, Walker Valentinus
Unknown Affiliation

Published : 1 Documents Claim Missing Document
Claim Missing Document
Check
Articles

Found 1 Documents
Search

Enhancing Hate Speech Detection: Leveraging Emoji Preprocessing with BI-LSTM Model Amalia, Junita; Tambunan, Sarah Rosdiana; Purba, Susi Eva Maria; Simanjuntak, Walker Valentinus
Journal of Information System and Informatics Vol 7 No 2 (2025): June
Publisher : Universitas Bina Darma

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.51519/journalisi.v7i2.1147

Abstract

Microblogging platforms like Twitter enable users to rapidly share opinions, information, and viewpoints. However, the vast volume of daily user-generated content poses challenges in ensuring the platform remains safe and inclusive. One key concern is the prevalence of hate speech, which must be addressed to foster a respectful and open environment. This study explores the effectiveness of the Emoji Description Method (EMJ DESC), which enhances tweet classification by converting emojis into descriptive text or sentences. These descriptions are then encoded into numerical vector matrices that capture the meaning and emotional tone of each emoji. Integrated into a basic text classification model, these vectors help improve detection performance. The research examines how different emoji preprocessing strategies affect the performance of a BI-LSTM model for hate speech classification. Results show that removing emojis significantly reduces accuracy (68%) and weakens the model’s ability to distinguish between hate and non-hate speech, due to the loss of valuable semantic context. In contrast, retaining emoji semantics either through textual descriptions or embeddings boosts classification accuracy to 93% and 94%, respectively. The highest performance is achieved through emoji embedding, highlighting its ability to capture subtle non-verbal cues critically for accurate hate speech detection. Overall, the findings emphasize the importance of incorporating emoji-aware preprocessing techniques to enhance the effectiveness of social media content classification.