Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Vol 9 No 1 (2025): February 2025

Evaluating Transformer Models for Social Media Text-Based Personality Profiling

Anggit Hartanto
Ema Utami
Arief Setyanto
Kusrini



Article Info

Publish Date
21 Jan 2025

Abstract

This research evaluates the performance of various Transformer models on social media-based classification tasks, focusing on personality profiling. With growing interest in using social media as a data source for understanding individual personality traits, selecting an appropriate model is crucial for accuracy and efficiency in large-scale data processing. Accurate personality profiling can provide valuable insights for applications in psychology, marketing, and personalized recommendations. In this context, BERT, RoBERTa, DistilBERT, TinyBERT, MobileBERT, and ALBERT are compared to understand their performance differences under varying configurations and dataset conditions and to assess their suitability for nuanced personality profiling tasks. The methodology comprises four experimental scenarios with a structured process of data acquisition, preprocessing, tokenization, model fine-tuning, and evaluation. Scenarios 1 and 2 used the full dataset of 9,920 data points with standard fine-tuning parameters for all models, except that in Scenario 2 ALBERT was optimized with a customized batch size, learning rate, and weight decay. Scenarios 3 and 4 used 30% of the total dataset, again with additional adjustments for ALBERT to examine its performance under these specific conditions. Each scenario tests model robustness against variations in parameters and dataset size. The experimental results underscore the importance of tailoring fine-tuning parameters to optimize model performance, particularly for parameter-efficient models like ALBERT. ALBERT and MobileBERT demonstrated strong performance across conditions, excelling in scenarios requiring both accuracy and efficiency. BERT proved a robust and reliable choice, maintaining high performance even with reduced data, while RoBERTa and DistilBERT may require further adjustments to adapt to data-limited conditions. Although efficient, TinyBERT may fall short on tasks demanding high accuracy due to its limited representational capacity. Selecting the right model requires balancing computational efficiency, task-specific requirements, and data complexity.
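The four-scenario design described above can be summarized as a small configuration sketch. Only the dataset sizes (9,920 data points; a 30% subset) and the scenario structure come from the abstract; the field names and the helper function are illustrative assumptions, and the actual hyperparameter values used for ALBERT are not stated in the source.

```python
# Illustrative sketch of the four experimental scenarios from the abstract.
# Field names and the helper below are hypothetical; only the dataset
# fraction per scenario and the total of 9,920 data points are from the text.

SCENARIOS = {
    1: {"dataset_fraction": 1.0, "albert_custom": False},  # full data, standard fine-tuning
    2: {"dataset_fraction": 1.0, "albert_custom": True},   # full data, ALBERT with tuned batch size/lr/weight decay
    3: {"dataset_fraction": 0.3, "albert_custom": False},  # 30% subset, standard fine-tuning
    4: {"dataset_fraction": 0.3, "albert_custom": True},   # 30% subset, ALBERT adjustments
}

TOTAL_SAMPLES = 9_920

def subset_size(total: int, fraction: float) -> int:
    """Number of samples used in a scenario (rounded down)."""
    return int(total * fraction)

# Scenarios 1-2 use all 9,920 samples; Scenarios 3-4 use a 30% subset.
print(subset_size(TOTAL_SAMPLES, SCENARIOS[1]["dataset_fraction"]))  # 9920
print(subset_size(TOTAL_SAMPLES, SCENARIOS[3]["dataset_fraction"]))  # 2976
```

Expressing the design this way makes explicit that the only two factors varied are dataset size and ALBERT-specific hyperparameter tuning, giving a 2x2 grid of conditions.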

Copyrights © 2025






Journal Info

Abbrev

RESTI

Publisher

Subject

Computer Science & IT Engineering

Description

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is intended as a medium for scholarly publication of research results, ideas, and critical-analytical studies on research in Systems Engineering, Informatics/Information Technology, Informatics Management, and Information Systems. As part of the spirit ...