TEKNIK INFORMATIKA
Vol. 17 No. 1: JURNAL TEKNIK INFORMATIKA

A Comparative Analysis of Random Forest, XGBoost, and LightGBM Algorithms for Emotion Classification in Reddit Comments

Anggraini, Nenny
Putra, Syopiansyah Jaya
Wardhani, Luh Kesuma
Arif, Farid Dhiya Ul
Hakiem, Nashrul
Shofi, Imam Marzuki



Article Info

Publish Date
20 May 2024

Abstract

This research compares the performance of three classification algorithms, Random Forest, XGBoost, and LightGBM, in classifying emotions in Reddit comments. Emotion classification in Reddit comments is a complex problem because the comments vary widely and are often ambiguous. The study uses the GoEmotions fine-grained dataset, filtered down to 7,325 Reddit comments carrying 5 basic emotion labels. The pipeline consists of data preprocessing, feature extraction using CountVectorizer and TF-IDF, and hyperparameter tuning with GridSearchCV for each algorithm. The models are then evaluated using cross-validation and a confusion matrix. The results indicate that Random Forest outperforms both other algorithms, with an accuracy of 75.38%, compared with 69.05% for XGBoost and 66.63% for LightGBM.
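The workflow named in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' code: the toy comments, the two labels, the parameter grid, and the fold count are all hypothetical stand-ins, and it assumes the TF-IDF weighting is applied on top of the CountVectorizer counts. Only Random Forest is shown; XGBoost and LightGBM classifiers could presumably be substituted in the same pipeline slot.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, cross_val_predict
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the filtered GoEmotions comments.
texts = [
    "i love this so much",
    "this makes me so happy",
    "what a wonderful day",
    "best news ever, love it",
    "i am so angry right now",
    "this is infuriating",
    "i hate this so much",
    "stop it, this makes me furious",
]
labels = ["joy"] * 4 + ["anger"] * 4

# Token counts, then TF-IDF weighting, then the classifier.
pipeline = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Assumed grid for illustration; the abstract does not list the values searched.
param_grid = {
    "clf__n_estimators": [10, 50],
    "clf__max_depth": [None, 5],
}

# Grid search with cross-validated accuracy as the selection criterion.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(texts, labels)

# Out-of-fold predictions from the best configuration feed the confusion matrix.
predicted = cross_val_predict(search.best_estimator_, texts, labels, cv=2)
matrix = confusion_matrix(labels, predicted, labels=["anger", "joy"])

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
print(matrix)
```

Wrapping the vectorizer inside the pipeline means each cross-validation fold fits its vocabulary on the training split only, which avoids leaking test-fold vocabulary into tuning.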

Copyrights © 2024






Journal Info

Abbrev

ti

Publisher

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika is a forum for researchers, lecturers, practitioners, students, and other members of the scientific community to publish articles on research, engineering, and studies in the field of Information Technology. Jurnal Teknik Informatika is published 2 (two) times per ...