Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer
Vol 5 No 1 (2021): January 2021

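Implementasi Naive Bayes Classifier untuk Klasifikasi Emosi Tweet Berbahasa Indonesia pada Spark (Implementation of the Naive Bayes Classifier for Emotion Classification of Indonesian-Language Tweets on Spark)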

Rizal Aditya Nugroho (Fakultas Ilmu Komputer, Universitas Brawijaya)
Imam Cholissodin (Fakultas Ilmu Komputer, Universitas Brawijaya)
Indriati Indriati (Fakultas Ilmu Komputer, Universitas Brawijaya)



Article Info

Publish Date
26 Jan 2021

Abstract

Emotion is a natural human response to events. Because everyone experiences emotions, classifying them has many practical uses, for example identifying customer complaints. Emotions can be found in textual sources such as tweets. The volume of tweet data on Twitter grows every year, so a system that classifies emotions in tweets must handle this growing data quickly and accurately. In this study, classification is carried out using the Naive Bayes Classifier algorithm on the Spark framework. The process starts with preprocessing, followed by training to compute the prior and likelihood values, then testing to compute the posterior values and perform classification, and finally calculating accuracy. Spark is used to run the work in parallel and reduce computation time. Based on tests with tweet data from June 1, 2018 to June 14, 2018, the Naive Bayes Classifier for classifying Indonesian tweets on Spark achieves its highest average accuracy of 0.892 with a split of 90% training data and 10% test data. The highest average accuracy is 0.880 when smoothing is used and 0.888 when constant priors are used. The difference in execution time between Spark and sequential execution is very large: Spark is almost 165 times faster, taking 0.525 seconds on average, while the sequential method takes 86.564 seconds on average.
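The training and testing steps named in the abstract (prior, likelihood, posterior, smoothing, constant priors) can be illustrated with a minimal single-machine sketch. This is not the authors' implementation and does not use Spark. The function names `train` and `classify`, the parameters `alpha` and `uniform_prior`, the choice of a multinomial model with additive smoothing, and the toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs, alpha=1.0, uniform_prior=False):
    """Estimate log priors and log likelihoods from (tokens, label) pairs.

    alpha is the additive smoothing constant (assumed > 0 so that no
    probability is zero); uniform_prior=True gives every class the same
    constant prior instead of its relative frequency.
    """
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}

    log_prior, log_likelihood = {}, {}
    for label, n in label_counts.items():
        prior = 1.0 / len(label_counts) if uniform_prior else n / len(docs)
        log_prior[label] = math.log(prior)
        # Denominator: words seen in this class plus alpha per vocabulary word.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        log_likelihood[label] = {
            w: math.log((word_counts[label][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def classify(tokens, log_prior, log_likelihood, vocab):
    """Return the label with the highest (unnormalised) log posterior."""
    scores = {
        label: log_prior[label]
        + sum(log_likelihood[label][w] for w in tokens if w in vocab)
        for label in log_prior
    }
    return max(scores, key=scores.get)


# Hypothetical, already-preprocessed toy tweets (not data from the study).
TOY_DOCS = [
    (["senang", "sekali"], "happy"),
    (["senang", "banget"], "happy"),
    (["marah", "sekali"], "anger"),
]

model = train(TOY_DOCS)
print(classify(["senang"], *model))
print(classify(["marah"], *model))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied. In a distributed setting, the per-class word counting is the part that would most naturally be parallelised, but how the study partitions that work on Spark is not stated in the abstract.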

Copyrights © 2021






Journal Info

Abbrev

j-ptiik

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Education; Electrical & Electronics Engineering; Engineering

Description

Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer (J-PTIIK) Universitas Brawijaya is a scholarly journal in the field of computing that publishes scientific papers resulting from research by students of the Fakultas Ilmu Komputer, Universitas Brawijaya. The journal is expected to advance research ...