Coreid Journal
Vol. 3 No. 3 (2025): November 2025

Natural Language Processing and Random Forest for Mental Health Symptom Identification Using Social Media Data

Sugara, Sigit
Dauni, Popon
Putri, Novianti Indah
Saputra, Yogi
Suryana, Nana



Article Info

Publish Date
30 Nov 2025

Abstract

This study applies Natural Language Processing (NLP) techniques and a Random Forest classifier to detect mental health symptoms through text analysis of web-sourced data. It addresses the difficulty of analyzing highly subjective and dynamic social media text in order to identify patterns associated with anxiety, depression, and stress. The methodology involves preprocessing steps including case folding, cleansing, language normalization, negation conversion, stopword removal, and tokenization, followed by TF-IDF weighting and Random Forest classification. In evaluation, the model reached an accuracy of approximately 80%, although achieving a confidence level of 75% proved challenging. The results indicate that, despite the inherent difficulty of predicting subjectively variable text, the approach performs satisfactorily in identifying mental health symptoms and offers potential for early detection and intervention systems.
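The preprocessing and weighting steps named in the abstract can be illustrated with a minimal sketch using only the Python standard library. It is not the authors' implementation: the word lists (`SLANG`, `NEGATIONS`, `STOPWORDS`), the English example posts, the `not_` prefix used for negation conversion, and the smoothed IDF formula are all assumptions made for illustration. For simplicity the sketch also splits text into words before normalization, whereas the abstract lists tokenization last.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny word lists for illustration only (not the study's resources).
SLANG = {"cant": "cannot", "im": "i am", "dont": "do not", "u": "you"}
NEGATIONS = {"not", "no", "never", "cannot"}
STOPWORDS = {"i", "am", "the", "a", "an", "and", "to", "is", "so", "you", "do"}


def preprocess(text):
    """Return the list of cleaned tokens for one post."""
    text = text.lower()                                 # case folding
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)   # cleansing: URLs, mentions, hashtags
    text = text.replace("'", "")                        # "can't" -> "cant"
    text = re.sub(r"[^a-z\s]", " ", text)               # cleansing: digits and punctuation
    words = []
    for word in text.split():                           # split on whitespace
        words.extend(SLANG.get(word, word).split())     # language normalization
    tokens, negate = [], False
    for word in words:
        if word in NEGATIONS:                           # negation conversion: flag next content word
            negate = True
            continue
        if word in STOPWORDS:                           # stopword removal
            continue
        tokens.append("not_" + word if negate else word)
        negate = False
    return tokens


def tfidf(docs):
    """Weight each token by term frequency times a smoothed inverse document frequency."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            tok: (c / len(doc)) * (math.log((1 + n) / (1 + df[tok])) + 1)
            for tok, c in counts.items()
        })
    return vectors


posts = [
    "I can't sleep and I'm so anxious!!! https://example.com",
    "Feeling happy, not anxious today #blessed",
]
docs = [preprocess(p) for p in posts]
vectors = tfidf(docs)
print(docs)
```

In this sketch the negation step keeps "anxious" and "not_anxious" as separate features, which is the kind of distinction a classifier needs in subjective text. The resulting weight dictionaries would then be converted to fixed-length feature vectors and passed to a Random Forest classifier, for example `RandomForestClassifier` from scikit-learn, together with symptom labels.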

Copyrights © 2025






Journal Info

Abbrev

coreid

Publisher

Subject

Computer Science & IT

Description

CoreID is a scientific journal that publishes papers by academics, researchers, and practitioners on research in informatics and computing. CoreID is published three times a year, in March, July, and November. Each paper is an original manuscript grounded in informatics research. The scope ...