Mental health is a crucial issue affecting individual and societal well-being. This study investigates and compares the performance of two machine learning algorithms, Naïve Bayes and Random Forest, for text-based mental health classification. The dataset is the Mental Health Corpus from Kaggle, consisting of 27,977 English text messages from online forums with binary labels (0: no indication of a mental disorder, 1: indication of a mental disorder) pre-annotated by the dataset creators. Text preprocessing involved lowercasing, negation handling, stopword removal, slang normalization, tokenization, and stemming, and the preprocessed text was transformed into features using TF-IDF. Models were evaluated with accuracy, precision, recall, and F1-score, both on a held-out test set and under 5-Fold Cross Validation. Both algorithms performed well. On the test data, Naïve Bayes achieved 88.7% accuracy, 84.2% precision, 95.2% recall, and an 89.3% F1-score, while Random Forest showed more balanced performance with 89.3% accuracy, 88.1% precision, 90.5% recall, and an 89.3% F1-score. Under 5-Fold Cross Validation, Naïve Bayes averaged 88.8% accuracy, 84.4% precision, 94.9% recall, and 89.3% F1-score, while Random Forest averaged 89.2% accuracy, 88.8% precision, 89.5% recall, and 89.3% F1-score. Although Naïve Bayes achieved higher recall, Random Forest delivered the best overall performance given its combination of accuracy, precision, and stable generalization, making it the more effective choice for mental health text classification.
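As a minimal sketch of the evaluation pipeline described above, the following Python snippet combines TF-IDF features with Naïve Bayes and Random Forest classifiers under 5-fold cross-validation using scikit-learn. The file name and column names are assumptions for illustration only, and the paper's custom preprocessing steps (negation handling, slang normalization, stemming) are omitted for brevity.

```python
# Sketch only: assumes the Mental Health Corpus is a CSV with "text" and
# "label" columns (0 = no indication, 1 = indication of mental disorder).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_validate

df = pd.read_csv("mental_health.csv")   # hypothetical file name
X, y = df["text"], df["label"]

models = {
    "Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
scoring = ["accuracy", "precision", "recall", "f1"]

for name, clf in models.items():
    # TF-IDF transformation followed by the classifier
    pipe = Pipeline([("tfidf", TfidfVectorizer()), ("clf", clf)])
    # 5-fold cross-validation with the four metrics reported in the abstract
    scores = cross_validate(pipe, X, y, cv=5, scoring=scoring)
    print(name, {m: round(scores[f"test_{m}"].mean(), 3) for m in scoring})
```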