Garuda - Garba Rujukan Digital

Jurnal Informatika: Jurnal Pengembangan IT

Vol 10, No 2 (2025)

Adiuntoro, Alwan
Hendrawan, Aria

Publish Date
30 Apr 2025

Text classification is a fundamental task in Natural Language Processing (NLP) that categorizes data according to predefined labels. This study evaluates the effectiveness of keyword-based labeling and sentiment analysis for text classification using the Quora Questions dataset. The dataset comprises 16,921 samples with an imbalanced class distribution: the opinion category dominates, while the hypothetical category is a minority class. The fact and hypothetical categories were labeled with a keyword-based approach, while the opinion category was labeled using sentiment analysis with the VADER lexicon. TF-IDF was used as the feature representation in two configurations: with the n-gram range tuned (1–3) and without tuning. ComplementNB, a Naive Bayes variant designed for imbalanced datasets, was used for classification with a 70:30 train-test split. The configuration without n-gram tuning achieved the highest accuracy, 93.89%, with zero variance in cross-validation. Evaluation showed that ComplementNB handles class imbalance effectively, as demonstrated by high precision and recall on the minority class. The study demonstrates that a simple approach combining keyword-based labeling and sentiment analysis can be applied effectively to category-based text classification, particularly on platforms like Quora. These findings are relevant for similar applications that require real-time text classification with minimal complexity.
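The classification stage described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code. The toy questions and labels are invented for the example, "without tuning" is assumed to mean the default unigram range (1, 1), "n-gram range tuning (1–3)" is assumed to mean ngram_range=(1, 3), and the stratified split and random seed are also assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline

# Toy questions invented for illustration; they are NOT from the Quora dataset.
questions = [
    "What is the boiling point of water at sea level?",
    "How many planets are in the solar system?",
    "When was the first computer built?",
    "Who discovered penicillin?",
    "What do you think is the best programming language?",
    "Is Python better than Java in your opinion?",
    "Why do people love horror movies so much?",
    "Which phone do you feel is the worst to own?",
    "What if the moon suddenly disappeared?",
    "What would happen if humans could fly?",
    "Suppose the internet vanished, how would we work?",
    "Imagine you were president, what would you change?",
]
labels = ["fact"] * 4 + ["opinion"] * 4 + ["hypothetical"] * 4


def build_model(ngram_range=(1, 1)):
    """TF-IDF features followed by a Complement Naive Bayes classifier."""
    return make_pipeline(TfidfVectorizer(ngram_range=ngram_range), ComplementNB())


# 70:30 train-test split as stated in the abstract; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    questions, labels, test_size=0.30, stratify=labels, random_state=0
)

# Compare the two feature configurations on held-out accuracy.
results = {}
for name, ngram_range in [("unigrams", (1, 1)), ("ngrams_1_3", (1, 3))]:
    model = build_model(ngram_range).fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)

print(results)
```

On a corpus this small the accuracies are not meaningful; the sketch only shows how the two TF-IDF configurations would be compared under the same split.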

Journal Info

Abbrev

informatika

Publisher

Politeknik Harapan Bersama Tegal

Subject

Computer Science & IT

Description

The scope encompasses Informatics Engineering, Computer Engineering, and Information Systems, including but not limited to the following: 1. Information Systems: information management, e-government, e-business and e-commerce, spatial information systems, geographical information systems, IT governance ...

Article Info

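Quora Question Classification Using Keyword-Based Methods and Sentiment Analysis with ComplementNB (original title: Klasifikasi Pertanyaan Quora Menggunakan Metode Keyword-based dan Analisis Sentimen dengan ComplementNB)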
