Sinkron : Jurnal dan Penelitian Teknik Informatika
Vol. 10 No. 1 (2026): Article Research January 2026

Sarcasm Detection in Indonesian YouTube Comments using Fine-Tuned IndoBERT with Class Imbalance Handling

Fanani, Ahmad Muhlis
Wahyuddin, Moh. Iwan



Article Info

Publish Date
03 Jan 2026

Abstract

Sarcasm detection in Indonesian social media remains a challenging natural language processing task because sarcastic meaning is implicit and labeled datasets are scarce. YouTube, with 143 million users in Indonesia, represents a largely unexplored source of sarcastic expressions. This study aims to develop an automatic sarcasm detection system for Indonesian YouTube comments using fine-tuned IndoBERT and to compare the performance of two IndoBERT variants. A dataset of 5,291 YouTube comments was collected and automatically labeled using GPT-4o with structured prompts based on linguistic indicators of sarcasm. Two IndoBERT variants (IndoNLU and IndoLEM) were fine-tuned under three class imbalance settings: the original imbalanced data (no mitigation), under-sampling, and class weighting. Zero-shot evaluation of the pre-trained models served as a baseline for measuring the effect of fine-tuning. Models were evaluated using accuracy, precision, recall, and F1-score. Without fine-tuning, the pre-trained models showed very limited sarcasm detection capability, with F1-scores of 0.1613 for IndoNLU and 0.3519 for IndoLEM. Fine-tuning with under-sampling raised the F1-scores to 0.6499 for IndoNLU and 0.6568 for IndoLEM, a relative improvement of up to 303% (for IndoNLU). IndoBERT-IndoNLU provided more balanced performance with an accuracy of 0.6424, while IndoLEM achieved a higher sarcasm recall of 0.7639. These results indicate that fine-tuning IndoBERT is effective for detecting sarcasm in Indonesian YouTube comments. The study contributes a new labeled dataset, demonstrates the effectiveness of automatic labeling with large language models, and provides empirical evidence of the value of fine-tuning for Indonesian sarcasm detection.
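The two imbalance-handling strategies named in the abstract, and the arithmetic behind the reported 303% figure, can be illustrated with a minimal Python sketch. The abstract does not give the authors' exact procedure, so everything here is an assumption for illustration: the helper names (`undersample`, `balanced_class_weights`, `relative_improvement`) are hypothetical, the class-weight formula is the common inverse-frequency heuristic rather than a confirmed detail of the paper, and the toy comments and labels are made up. Only the four F1-scores are taken from the abstract.

```python
import random
from collections import Counter


def undersample(texts, labels, seed=0):
    """Randomly keep only as many examples per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for text in rng.sample(items, n_min):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return balanced


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: a common heuristic, not necessarily the paper's weighting.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def relative_improvement(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Hypothetical toy data: 8 non-sarcastic (0) and 2 sarcastic (1) comments.
labels = [0] * 8 + [1] * 2
texts = [f"comment {i}" for i in range(len(labels))]

balanced = undersample(texts, labels)
weights = balanced_class_weights(labels)

# F1-scores reported in the abstract (zero-shot vs. fine-tuned with under-sampling).
indonlu_gain = relative_improvement(0.1613, 0.6499)
indolem_gain = relative_improvement(0.3519, 0.6568)

print(Counter(label for _, label in balanced))
print(weights)
print(f"IndoNLU: {indonlu_gain:.1f}%  IndoLEM: {indolem_gain:.1f}%")
```

Under-sampling discards majority-class comments so both classes contribute equally to training, whereas class weighting keeps all the data and instead scales each class's contribution to the loss. Applied to the abstract's F1-scores, the relative-improvement formula gives roughly 303% for IndoNLU and roughly 87% for IndoLEM, so the "up to 303%" figure corresponds to the IndoNLU variant.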

Copyrights © 2026






Journal Info

Abbrev

sinkron

Publisher

Subject

Computer Science & IT

Description

Scope of SinkrOn's Scientific Discussion
1. Machine Learning
2. Cryptography
3. Steganography
4. Digital Image Processing
5. Networking
6. Security
7. Algorithm and Programming
8. Computer Vision
9. Troubleshooting
10. Internet and E-Commerce
11. Artificial Intelligence
12. Data Mining
13. Artificial ...