Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

This author has published in the following journal:

International Journal of Artificial Intelligence in Medical Issues

Yuli Praptomo Pamungkas Hari Sungkowo

STIMIK El Rahma Yogyakarta

Author-ID : 10256814

Computer Science & IT; Dentistry; Health Professions; Medicine & Pharmacology; Public Health

Published: 1 document


Articles

Confidence-Aware Depression Severity Detection in Low-Resource Urdu Social Media Text: A Multilingual Machine Learning Approach
Ahmad Naswin; Yuli Praptomo Pamungkas Hari Sungkowo
International Journal of Artificial Intelligence in Medical Issues, Vol. 4 No. 1 (2026)
Publisher: Yocto Brain

DOI: 10.56705/62qxjt74

Depression is a major mental health concern that requires early identification and timely intervention. Social media has become an important source of user-generated text that may reflect emotional distress, hopelessness, social withdrawal, and suicidal ideation. However, most existing depression detection studies focus on English or high-resource languages, while research on low-resource languages such as Urdu remains limited. This study investigates depression severity classification in Urdu social media text using multilingual and confidence-aware natural language processing approaches. The dataset consists of 4,000 Twitter/X posts collected between January 2024 and April 2025, annotated into four severity classes: none, mild, moderate, and severe. Each post is represented in three parallel textual forms: native Urdu script, Roman Urdu transliteration, and English translation. The dataset also includes label confidence scores, human verification indicators, cultural markers, and depression-related keywords. Several text representation scenarios were evaluated, including Urdu text, Roman Urdu text, English text, and combined multilingual features. Baseline machine learning models were developed using TF-IDF features with Logistic Regression, Linear Support Vector Machine, and Multinomial Naive Bayes. Confidence-aware learning was examined by incorporating label confidence scores as sample weights and by evaluating a high-confidence subset. The experimental results showed that all baseline models achieved perfect classification performance, with accuracy, macro F1-score, weighted F1-score, and Cohen’s Kappa values of 1.000 across the evaluated scenarios. These results indicate that the dataset contains highly separable linguistic patterns among depression severity classes. However, further inspection suggests that repeated or highly similar textual patterns may contribute to overly optimistic performance. 
Therefore, stricter validation using duplicate-free splitting, external datasets, and transformer-based models is recommended for future work. This study provides a preliminary benchmark for multilingual depression severity classification in low-resource Urdu text and highlights the potential of AI-driven mental health informatics as a supportive early-warning tool rather than a clinical diagnostic system.
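
The duplicate-free splitting recommended in the abstract could look roughly like the sketch below. It is an illustration only, not the authors' procedure: the abstract does not say how duplicates would be detected, so exact matching after a simple normalization is an assumption, and the names `normalize` and `duplicate_free_split` are hypothetical.

```python
import hashlib
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Collapse case, punctuation, and whitespace so trivially varied posts match."""
    text = unicodedata.normalize("NFKC", text).casefold()
    # Replace anything that is not a word character or whitespace with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def duplicate_free_split(posts, test_fraction=0.2):
    """Place every group of normalized-identical posts wholly in train or in test.

    Returns two sorted lists of indices into `posts`. The realized test share
    only approximates `test_fraction`, because whole groups move together.
    """
    groups = defaultdict(list)
    for index, text in enumerate(posts):
        groups[normalize(text)].append(index)

    train, test = [], []
    for key, indices in groups.items():
        # Hash the normalized text so the assignment is deterministic across runs.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64
        (test if bucket < test_fraction else train).extend(indices)
    return sorted(train), sorted(test)
```

Because duplicates are assigned as a group, a model can no longer score well simply by memorizing a post that also appears in the test set. Catching paraphrases or templated posts would need fuzzier matching than this exact-match assumption provides.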

Co-Authors: Ahmad Naswin
