Fanani, Ahmad Muhlis
Unknown Affiliation

Published: 1 document
Articles

Sarcasm Detection in Indonesian YouTube Comments using Fine-Tuned IndoBERT with Class Imbalance Handling
Fanani, Ahmad Muhlis; Wahyuddin, Moh. Iwan
Sinkron : jurnal dan penelitian teknik informatika Vol. 10 No. 1 (2026): Article Research January 2026
Publisher : Politeknik Ganesha Medan

DOI: 10.33395/sinkron.v10i1.15607

Abstract

Sarcasm detection in Indonesian social media is a challenging natural language processing task because sarcastic meaning is implicit and labeled datasets are scarce. YouTube, with 143 million users in Indonesia, remains a largely unexplored source of sarcastic expressions. This study develops an automatic sarcasm detection system for Indonesian YouTube comments using fine-tuned IndoBERT and compares the performance of two IndoBERT variants. A dataset of 5,291 YouTube comments was collected and labeled automatically by GPT-4o using structured prompts based on linguistic indicators of sarcasm. Two IndoBERT variants (IndoNLU and IndoLEM) were fine-tuned under three class-imbalance settings: the original imbalanced data (no mitigation), under-sampling, and class weighting. Zero-shot evaluation of the pre-trained models served as a baseline for measuring the effect of fine-tuning. Models were evaluated using accuracy, precision, recall, and F1-score. Without fine-tuning, the pre-trained models showed very limited sarcasm detection capability, with F1-scores of 0.1613 for IndoNLU and 0.3519 for IndoLEM. Fine-tuning with under-sampling raised the F1-scores to 0.6499 for IndoNLU and 0.6568 for IndoLEM, a relative improvement of up to 303% (IndoNLU). IndoBERT-IndoNLU gave more balanced performance, with an accuracy of 0.6424, while IndoLEM achieved higher sarcasm recall at 0.7639. These results indicate that fine-tuning IndoBERT is effective for detecting sarcasm in Indonesian YouTube comments. The study contributes a new labeled dataset, demonstrates the effectiveness of automatic labeling with large language models, and provides empirical evidence of the value of fine-tuning for Indonesian sarcasm detection.
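
The abstract names under-sampling and class weighting but does not say how either was implemented. The sketch below is one common way to do both, under stated assumptions: random under-sampling down to the size of the rarest class, and inverse-frequency weights using the same formula as scikit-learn's "balanced" heuristic. The function names (`undersample`, `class_weights`, `relative_improvement`) and the toy data are hypothetical and are not taken from the paper. The last helper reproduces the arithmetic behind the reported relative F1 improvement.

```python
import random
from collections import Counter


def undersample(texts, labels, seed=0):
    """Randomly keep only as many examples per class as the rarest class has.

    Assumption: plain random under-sampling; the paper may have used another variant.
    """
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for text in rng.sample(items, n_min):
            balanced.append((text, label))
    rng.shuffle(balanced)  # avoid feeding the model one class at a time
    return balanced


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count of class c).

    Assumption: the "balanced" heuristic; the paper's weighting scheme is not stated.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


def relative_improvement(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Toy data (hypothetical): 8 non-sarcastic comments (0) and 2 sarcastic ones (1).
toy_labels = [0] * 8 + [1] * 2
toy_texts = [f"comment {i}" for i in range(len(toy_labels))]

balanced = undersample(toy_texts, toy_labels)
weights = class_weights(toy_labels)

# F1-scores for IndoNLU reported in the abstract: zero-shot vs. under-sampling.
gain_indonlu = relative_improvement(0.1613, 0.6499)
```

In a fine-tuning setup, weights like these would typically be passed to the loss function so that errors on the minority (sarcastic) class cost more, whereas under-sampling changes the training set itself and leaves the loss untouched.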