International Journal of Advances in Intelligent Informatics
Vol 8, No 3 (2022): November 2022

Semi-supervised learning for sentiment classification with ensemble multi-classifier approach

Agus Sasmito Aribowo (Universitas Pembangunan Nasional "Veteran" Yogyakarta Indonesia & FTMK UTeM Melaka Malaysia)
Halizah Basiron (Fakulti Teknologi Maklumat dan Komunikasi, Universiti Teknikal Malaysia Melaka)
Noor Fazilla Abd Yusof (Fakulti Teknologi Maklumat dan Komunikasi, Universiti Teknikal Malaysia Melaka)

Article Info

Publish Date
30 Nov 2022

Abstract

Supervised sentiment analysis ideally uses a fully labeled dataset for modeling. However, meeting this ideal requires a laborious label annotation process. Semi-supervised learning (SSL) has emerged as a promising method to avoid time-consuming and expensive data labeling without reducing model performance. However, research on SSL is still limited, and its performance needs to be improved. This study therefore aims to create a new SSL model for sentiment analysis and introduces the Ensemble Classifier SSL model for sentiment classification. The research went through pre-processing, vectorization, and feature extraction using TF-IDF and n-grams. Text was tokenized into unigrams, bigrams, and trigrams, and a separate model was generated for each using Support Vector Machine (SVM) or Random Forest (RF). The outputs of these models were then combined using a stacking ensemble approach. Accuracy and F1-score were used for evaluation. The IMDB and US Airlines datasets were used to test the new SSL models. The conclusion is that sentiment annotation accuracy depends strongly on how well the dataset suits the machine learning algorithm. On the IMDB dataset, which consists of two classes, SVM is the better choice. On the US Airlines dataset, which consists of three classes, SVM is better at improving model performance over the baseline, whereas RF is better at reaching the baseline performance even though it fails to maintain the model performance.
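As an illustration of the feature extraction step mentioned in the abstract, the following is a minimal sketch of TF-IDF weighting over unigrams, bigrams, and trigrams, using only the Python standard library. It is not the authors' implementation. The function names `ngrams` and `tfidf`, the whitespace tokenizer, the example sentences, and the smoothed IDF formula are all assumptions made for this sketch; the abstract does not state which TF-IDF variant or tokenizer was used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n):
    """Compute TF-IDF weights for the n-grams of each document.

    Assumed weighting (not taken from the paper): raw term count times
    a smoothed IDF, idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    # Hypothetical pre-processing: lowercase and split on whitespace.
    grams_per_doc = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter()
    for grams in grams_per_doc:
        doc_freq.update(grams.keys())

    n_docs = len(docs)
    idf = {g: math.log((1 + n_docs) / (1 + df)) + 1 for g, df in doc_freq.items()}
    return [{g: count * idf[g] for g, count in grams.items()} for grams in grams_per_doc]


# Made-up example sentences, not drawn from the IMDB or US Airlines datasets.
docs = [
    "the flight was great",
    "the flight was late",
    "great crew and great food",
]

# One feature set per n-gram order, mirroring the separate
# unigram, bigram, and trigram models described in the abstract.
features = {n: tfidf(docs, n) for n in (1, 2, 3)}
```

In the pipeline the abstract describes, each of the three feature sets would feed its own SVM or RF classifier, and a stacking ensemble would combine their outputs. Those stages are omitted from this sketch.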

Copyrights © 2022

Journal Info

Abbrev

IJAIN

Publisher

Subject

Computer Science & IT

Description

International Journal of Advances in Intelligent Informatics (IJAIN), e-ISSN: 2442-6571, is a peer-reviewed open-access journal published three times a year in English. It gives scientists and engineers throughout the world a venue for the exchange and dissemination of theoretical and ...