Hemtanon, Siranuch
Unknown Affiliation

Published: 3 documents

Classification and keyword extraction of online harassment text in Thai social network
Hemtanon, Siranuch; Phetkrachang, Ketsara; Yangyuen, Wachira
Bulletin of Electrical Engineering and Informatics Vol 12, No 6: December 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v12i6.5939

Abstract

Online harassment in social network services (SNS) is a form of cyberbullying that needs to be addressed and requires preventive measures. In this paper, we develop a method for detecting cyberbullying in harassing text posts written in Thai on the Facebook SNS. We collect public posts and ask experts to label each post as positive or negative, that is, as a harassment post or not. The annotated data are used to train a binary classifier that considers words in the centre as features to predict malicious intent to insult and threaten other users. The information gain scores obtained while generating the prediction model are ranked, and the 20 words with the highest scores are taken as significant words involving online harassment. In the experiments, detection achieved an F1 score of 0.78 on average. Analysis of the results indicated that the word surface approach detects insulting posts reasonably well, but some posts that use metaphor to tone down the malicious intent may go undetected because the harmful semantic intent is hidden behind the word form. The top 20 significant words showed that bullying posts involved body-shaming and lower social status.
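The information gain ranking described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenized posts, the English placeholder words, and the function names `entropy`, `information_gain`, and `top_words` are all assumptions made for illustration, and real Thai posts would first need word segmentation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, word):
    """Entropy reduction from splitting the posts on presence of `word`."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (len(with_word) / n) * entropy(with_word) \
        + (len(without_word) / n) * entropy(without_word)
    return entropy(labels) - conditional


def top_words(docs, labels, k=20):
    """Rank vocabulary words by information gain, highest first."""
    vocab = sorted({w for d in docs for w in d})
    scored = [(w, information_gain(docs, labels, w)) for w in vocab]
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:k]


# Hypothetical toy data: each post is a set of tokens; 1 = harassment, 0 = not.
docs = [{"ugly", "you"}, {"ugly", "poor"}, {"nice", "day"}, {"you", "nice"}]
labels = [1, 1, 0, 0]
ranking = top_words(docs, labels, k=3)
```

In this toy set, a word that appears only in one class separates the labels perfectly and receives the maximum gain of 1 bit, while a word spread evenly across both classes receives a gain of 0.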

Solving missing categorical data in questionnaire responses for automated classification
Aekwarangkoon, Saifon; Namponwatthanakul, Thanatep; Amonwet, Adisorn; Hemtanon, Siranuch
Bulletin of Electrical Engineering and Informatics Vol 14, No 4: August 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v14i4.8785

Abstract

Handling missing categorical data is critical for maintaining the accuracy and reliability of automatic classification tasks, particularly in mental health screening based on questionnaire responses. This study investigates several imputation methods, including last observation carried forward (LOCF), k-nearest neighbor (KNN) imputation, hot-deck imputation, and multivariate imputation by chained equations (MICE). Results show that KNN imputation achieves the lowest root mean square error (RMSE), indicating the most faithful reconstruction of the original data. However, for classification performance, MICE-imputed datasets produced models that outperformed those generated by other methods and even surpassed models trained on the original incomplete data. Interestingly, we also found that using observed data over multiple iterations of imputation tuning can introduce greater deviation from original missing values, but this process can help form datasets with clearer class boundaries, ultimately improving classification accuracy. These findings emphasize the need to balance data fidelity and model performance when selecting imputation strategies.
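As a rough illustration of how imputation methods can be compared by RMSE, the sketch below implements LOCF and a simple nearest-neighbor imputation on a toy questionnaire. Everything here is an assumption for illustration: the ordinal coding of answers as integers, the mismatch-rate distance, the toy data, and the function names. It does not reproduce the study's data, settings, or results.

```python
import math
from collections import Counter


def locf(rows):
    """Last observation carried forward within each respondent's row."""
    filled_rows = []
    for row in rows:
        filled, last = [], None
        for value in row:
            if value is not None:
                last = value
            filled.append(last)  # stays None if the first item is missing
        filled_rows.append(filled)
    return filled_rows


def knn_impute(rows, k=1):
    """Fill each missing cell with the majority answer of the k nearest rows.

    Distance is the mismatch rate over items both respondents answered.
    """
    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = sum(a != b for a, b in shared) / len(shared)
                    donors.append((dist, m, other[j]))
            donors.sort()
            votes = Counter(val for _, _, val in donors[:k])
            if votes:
                out[i][j] = min(votes, key=lambda v: (-votes[v], v))
    return out


def rmse(truth, imputed, observed):
    """RMSE over the cells that were missing in `observed`."""
    errors = [(t - p) ** 2
              for t_row, p_row, o_row in zip(truth, imputed, observed)
              for t, p, o in zip(t_row, p_row, o_row) if o is None]
    return math.sqrt(sum(errors) / len(errors))


# Hypothetical answers coded 0-3; None marks a missing response.
truth = [[0, 1, 2, 3], [0, 1, 2, 3], [3, 2, 1, 0], [3, 2, 1, 0]]
observed = [[0, 1, None, 3], [0, 1, 2, 3], [3, 2, 1, None], [3, 2, 1, 0]]

locf_rmse = rmse(truth, locf(observed), observed)
knn_rmse = rmse(truth, knn_impute(observed, k=1), observed)
```

Note that RMSE only measures how closely the imputed values match the held-out originals; as the abstract points out, the method with the lowest RMSE is not necessarily the one that yields the best downstream classifier.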

Proactive depression detection from Facebook text and behavior data
Hemtanon, Siranuch; Aekwarangkoon, Saifon; Kittiphattanabawon, Nichnan
International Journal of Electrical and Computer Engineering (IJECE) Vol 12, No 5: October 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v12i5.pp5027-5035

Abstract

This paper proposes a proactive method to detect people affected by clinical depression from Facebook post data and behavior data, using a text-based model and a behavior-based model, respectively. For the text-based model, the words that make up the posts are separated and converted into vectors of terms. A machine learning classifier applies the term frequency-inverse document frequency (TF-IDF) technique to identify important or rare words in the posts. For the behavior-based model, statistical values of the behavioral data were designed to capture depressive symptoms. The results showed that the behavior-based model was able to detect depressive symptoms better than the text-based model. Regarding performance, a detection model using behaviors yields significantly higher F1 scores than those using words in the post. The K-nearest neighbor (KNN) classifier is the best model with the highest F1 score of 1.0, while the highest F1 score of the behavior-based model is 0.88. An analysis of the predominant features influencing depression indicates that posted messages could reveal feelings of self-hatred and suicidal thoughts, while behavioral data identified depressed people through manifestations such as restlessness, insomnia, and decreased concentration.
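The text-based pipeline mentioned in this abstract (term vectors weighted by TF-IDF, then a KNN classifier) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: the smoothed IDF formula, cosine similarity, the English placeholder tokens, the label names, and all function names are hypothetical choices.

```python
import math
from collections import Counter


def fit_idf(token_lists):
    """Inverse document frequency per term: log(N / df) + 1."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log(n / count) + 1.0 for t, count in df.items()}


def vectorize(tokens, idf):
    """Sparse TF-IDF vector; terms unseen during fitting are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values())
    if total == 0:
        return {}
    return {t: (count / total) * idf[t] for t, count in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority label among the k training posts most similar to `query`."""
    ranked = sorted(zip(train_vectors, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-tokenized posts with made-up labels.
train_posts = [
    ["hate", "myself", "tired"],
    ["cannot", "sleep", "tired"],
    ["great", "day", "friends"],
    ["happy", "great", "trip"],
]
train_labels = ["depressed", "depressed", "not_depressed", "not_depressed"]

idf = fit_idf(train_posts)
train_vectors = [vectorize(post, idf) for post in train_posts]

pred_a = knn_predict(vectorize(["so", "tired", "cannot", "sleep"], idf),
                     train_vectors, train_labels, k=3)
pred_b = knn_predict(vectorize(["great", "trip", "friends"], idf),
                     train_vectors, train_labels, k=3)
```

Rare terms receive a larger IDF weight than terms shared by many posts, which is how TF-IDF emphasizes the "important or rare words" the abstract refers to.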