Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

Journals this author has published in: bit-Tech

Salem Abdullah Salem Garfan

Universiti Pendidikan Sultan Idris

Author-ID : 9844085

Computer Science & IT

Published: 1 document

Articles

Enhancing Sentiment Classification Accuracy Through Pre-Processing In Educational Text Messenger Data
Md Abdul Bakir; Suliana Sulaiman; Salem Abdullah Salem Garfan
bit-Tech Vol. 8 No. 2 (2025): bit-Tech
Publisher : Komunitas Dosen Indonesia

DOI: 10.32877/bt.v8i2.3377

This paper examines the pre-processing steps needed for reliable sentiment analysis (SA) in the educational domain, specifically for text data from instant messaging applications such as WhatsApp and Telegram. Because these platforms often generate noisy, unstructured, and multilingual messages that include textisms, emojis, and mixed-language expressions, proper data preparation is essential to ensure reliable analytical outcomes. The primary goal of this work is to identify and validate pre-processing approaches that improve model performance on such data. To do so, we performed a systematic literature review (SLR) to establish best practices in text pre-processing for SA, focusing on methods applicable to informal, user-generated content. The techniques identified through the SLR, namely textism normalization, stop word removal, punctuation removal, stemming, translation of mixed-language text, and tokenization, were then applied to a dataset gathered from educational subject groups. These techniques raised the accuracy of the BERT model from 0.705 to 0.893. These results emphasize the need for well-developed pre-processing pipelines for handling multilingual and unstructured text in educational communication channels. However, the study is limited to text data from WhatsApp and Telegram and covers only Malay and English. Further studies could explore other languages, platforms, and more advanced normalization processes to further improve the effectiveness of pre-processing strategies for sentiment analysis across a range of educational contexts.
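The six steps named in the abstract can be pictured as a small pipeline. The following is a minimal sketch only, not the authors' implementation: the lookup tables, the suffix-stripping stemmer, the function names, and the order of the steps are all assumptions made for illustration. A real system would likely use a proper translation service, a full stop word list, and a language-aware stemmer.

```python
import string

# Hypothetical lookup tables for illustration only; none come from the paper.
TEXTISMS = {"u": "you", "tq": "thank you", "x": "tidak", "pls": "please"}
MALAY_TO_ENGLISH = {"tidak": "not", "faham": "understand",
                    "saya": "i", "cikgu": "teacher"}
# "not" is deliberately left out so that negation survives for sentiment.
STOP_WORDS = {"i", "the", "a", "is", "you"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the message and split it on whitespace."""
    return text.lower().split()


def strip_punctuation(tokens):
    """Delete ASCII punctuation and drop tokens that become empty."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = (t.translate(table) for t in tokens)
    return [t for t in cleaned if t]


def normalize_textisms(tokens):
    """Expand known shorthand; an expansion may yield several tokens."""
    out = []
    for t in tokens:
        out.extend(TEXTISMS.get(t, t).split())
    return out


def translate_mixed(tokens):
    """Word-by-word dictionary lookup standing in for real translation."""
    return [MALAY_TO_ENGLISH.get(t, t) for t in tokens]


def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def stem(tokens):
    """Toy stemmer: strip one suffix if at least 3 characters remain."""
    out = []
    for t in tokens:
        for suf in SUFFIXES:
            if t.endswith(suf) and len(t) - len(suf) >= 3:
                t = t[: -len(suf)]
                break
        out.append(t)
    return out


def preprocess(text):
    """Apply the six steps in one assumed order."""
    tokens = tokenize(text)
    tokens = strip_punctuation(tokens)
    tokens = normalize_textisms(tokens)
    tokens = translate_mixed(tokens)
    tokens = remove_stop_words(tokens)
    return stem(tokens)


print(preprocess("Cikgu, saya x faham the notes pls!!"))
# ['teacher', 'not', 'understand', 'note', 'please']
```

The cleaned tokens would then be joined back into text and passed to the classifier. Emoji handling is not covered in this sketch.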

Co-Authors: Md Abdul Bakir; Suliana Sulaiman
