Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer
Vol 4 No 1 (2020): January 2020

Stopword List Formation using Zipf's Law and Augmented TF - Probability IDF Weighting in Product Review Document Classification

Destin Eva Dila Purnama Sari (Fakultas Ilmu Komputer, Universitas Brawijaya)
Yuita Arum Sari (Fakultas Ilmu Komputer, Universitas Brawijaya)
Muhammad Tanzil Furqon (Fakultas Ilmu Komputer, Universitas Brawijaya)



Article Info

Publish Date
09 Mar 2020

Abstract

A stopword is a word that carries little meaning in a sentence. Stopword lists support text preprocessing, specifically the stopword removal step. A digital library is often used at this step to obtain a stopword list. However, not every word in such a list is unimportant for the data at hand. This research focuses on forming a stopword list with the Zipf's Law method and on word weighting for the classification of product review documents. The word weighting method was Augmented Term Frequency - Probability Inverse Document Frequency. Document classification was carried out to measure the effect of stopword list formation and word weighting, using the Support Vector Machine algorithm with a polynomial kernel. The output of the research was the classification accuracy. The accuracy results show that stopword list formation and word weighting affect the classification result. The best result was obtained at a percentage of 15% for a stopword list formed from terms with low constant values. At this setting, precision was 0.73, recall was 0.7, and F-measure was 0.63.
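
The two steps named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the "constant" is the Zipf's Law product of a term's frequency rank and its corpus frequency, that "15%" means the 15% of vocabulary terms with the lowest constant, and that the weighting follows the common textbook definitions of augmented TF (0.5 + 0.5 * tf / max tf) and probabilistic IDF (max(0, log((N - df) / df))). All function names and the sample tokens are hypothetical, and the Support Vector Machine classification step is omitted.

```python
import math
from collections import Counter


def zipf_constants(docs):
    """Map each term to rank * frequency (rank 1 = most frequent term)."""
    freq = Counter(term for doc in docs for term in doc)
    # Sort by descending frequency; ties are broken alphabetically.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term: rank * count
            for rank, (term, count) in enumerate(ranked, start=1)}


def low_constant_stopwords(docs, fraction=0.15):
    """Assumed reading of the abstract: keep the lowest-constant terms."""
    constants = zipf_constants(docs)
    k = math.ceil(fraction * len(constants))
    ordered = sorted(constants.items(), key=lambda kv: (kv[1], kv[0]))
    return {term for term, _ in ordered[:k]}


def augmented_tf(doc):
    """Augmented term frequency: 0.5 + 0.5 * tf / max tf in the document."""
    counts = Counter(doc)
    max_count = max(counts.values())
    return {t: 0.5 + 0.5 * c / max_count for t, c in counts.items()}


def probabilistic_idf(docs):
    """Probabilistic IDF: max(0, log((N - df) / df)); 0 when df == N."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return {t: max(0.0, math.log((n - d) / d)) if d < n else 0.0
            for t, d in df.items()}


def weight_documents(docs, stopwords=frozenset()):
    """Remove stopwords, then weight each remaining term by tf * idf."""
    filtered = [[t for t in doc if t not in stopwords] for doc in docs]
    idf = probabilistic_idf(filtered)
    return [
        {t: tf * idf[t] for t, tf in augmented_tf(doc).items()} if doc else {}
        for doc in filtered
    ]


# Hypothetical tokenized product reviews.
docs = [["bagus", "barang", "bagus"], ["barang", "jelek"], ["bagus", "cepat"]]
stopwords = low_constant_stopwords(docs, fraction=0.15)
vectors = weight_documents(docs, stopwords)
```

The resulting per-document weight dictionaries could then be converted to feature vectors for a classifier. Note that under this IDF definition any term appearing in at least half of the documents receives a weight of zero.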

Copyrights © 2020






Journal Info

Abbrev

j-ptiik

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Education; Electrical & Electronics Engineering; Engineering

Description

Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer (J-PTIIK) of Universitas Brawijaya is a scholarly journal in the field of computing that publishes scientific papers resulting from research by students of the Faculty of Computer Science, Universitas Brawijaya. This journal is expected to advance research ...