Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer
Vol 5 No 11 (2021): November 2021

Comparison of Term Frequency-Inverse Document Frequency and Term Frequency-Relevance Frequency Weighting on N-Gram Features in Sentiment Analysis

Randy Ramadhan (Fakultas Ilmu Komputer, Universitas Brawijaya)
Yuita Arum Sari (Fakultas Ilmu Komputer, Universitas Brawijaya)
Putra Pandu Adikara (Fakultas Ilmu Komputer, Universitas Brawijaya)

Article Info

Publish Date
18 Oct 2021

Abstract

Sentiment analysis is a text mining method that extracts the sentiment expressed in a sentence from its content. After text preprocessing, it applies a word-weighting step. Term Frequency-Inverse Document Frequency (TF-IDF) is the most popular word-weighting method in the unsupervised term-weighting category, but it has been reported to be unsuitable for categorizing texts. Term Frequency-Relevance Frequency (TF-RF) combines TF with RF to obtain better performance; it takes into account both the documents that contain a term and those that do not. Twitter is a place where people express their thoughts about the pandemic they are experiencing. Opinions posted on Twitter about employees being sent home need to be classified as positive, negative, or neutral so that companies and the government can take them into account when making decisions on PSBB (large-scale social restrictions) policies. This research consists of several stages: document preprocessing, extraction of unigram and bigram features, word weighting with the TF-IDF and TF-RF methods, and classification with the K-Nearest Neighbor (KNN) method. The data comprised 246 training instances and 90 test instances. In the evaluation, the best result was obtained with TF-RF word weighting on unigram features using KNN with K = 3: accuracy of 0.677, precision of 0.526, recall of 0.654, and f-measure of 0.583. Using bigrams did not have a large effect in this study: the best f-measure obtained with bigrams was 0.591, while the best obtained with unigrams was 0.583.
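The weighting stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact formulas, so the sketch assumes a common TF-IDF variant, tf * log10(N / df), and the commonly cited relevance-frequency form, rf = log2(2 + a / max(1, c)), applied one class against the rest. The function names and the toy tweets are hypothetical, and the KNN classification step is left out. The last line recomputes the f-measure from the precision and recall reported in the abstract.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Weight each term by tf * log10(N / df) (assumed variant).

    docs is a list of term lists; df counts the documents containing a term.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: tf * math.log10(n_docs / df[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


def tf_rf(docs, labels, target):
    """Weight each term by tf * log2(2 + a / max(1, c)) (assumed variant).

    a = documents of the target class that contain the term,
    c = documents of all other classes that contain the term.
    """
    a, c = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (a if label == target else c).update(set(doc))
    return [
        {t: tf * math.log2(2 + a[t] / max(1, c[t])) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical, already preprocessed toy tweets (not the study's data).
tokens = [["work", "home", "good"], ["sent", "home", "bad"], ["work", "good"]]
labels = ["positive", "negative", "positive"]

unigrams = [ngrams(t, 1) for t in tokens]
bigrams = [ngrams(t, 2) for t in tokens]

print(bigrams[0])
print(tf_idf(unigrams)[1])
print(tf_rf(unigrams, labels, "positive")[0])
print(round(f_measure(0.526, 0.654), 3))  # 0.583, matching the abstract
```

In this sketch TF-IDF ignores the class labels, whereas TF-RF raises the weight of terms that occur mostly in the target class, which is the supervised property that distinguishes the two schemes.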

Copyrights © 2021

Journal Info

Abbrev

j-ptiik

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Education; Electrical & Electronics Engineering; Engineering

Description

Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer (J-PTIIK) Universitas Brawijaya is a scholarly journal in the field of computing that publishes scientific papers resulting from research by students of the Fakultas Ilmu Komputer, Universitas Brawijaya. The journal is expected to help develop research ...