Rizal Aditya Nugroho
Fakultas Ilmu Komputer, Universitas Brawijaya

Published: 1 document
Articles

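Implementasi Naive Bayes Classifier untuk Klasifikasi Emosi Tweet Berbahasa Indonesia pada Spark (Implementation of the Naive Bayes Classifier for Emotion Classification of Indonesian-Language Tweets on Spark)
Rizal Aditya Nugroho; Imam Cholissodin; Indriati Indriati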
Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer Vol 5 No 1 (2021): Januari 2021
Publisher : Fakultas Ilmu Komputer (FILKOM), Universitas Brawijaya

Abstract

Emotion is a natural human response to events. Because everyone experiences emotions, classifying them has many practical uses, such as identifying customer complaints. Emotions can be detected in textual sources such as tweets. The volume of tweet data on Twitter grows every year, so a system that classifies emotions in tweets must handle this growing data quickly and accurately. In this study, classification is carried out with the Naive Bayes Classifier algorithm on the Spark framework. The process starts with preprocessing, continues with training to compute the prior and likelihood values, then testing to compute the posterior values and assign a class, and ends with calculating accuracy. Spark is used to run the work in parallel and reduce computation time. In tests on tweet data from June 1, 2018 to June 14, 2018, the Naive Bayes Classifier for Indonesian tweets on Spark reaches its highest average accuracy, 0.892, with a split of 90% training data and 10% test data. The highest average accuracy is 0.880 when smoothing is used and 0.888 when constant priors are used. Execution time differs greatly between Spark and sequential processing: Spark takes 0.525 seconds on average, while the sequential method takes 86.564 seconds, making Spark almost 165 times faster.
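
The training and testing steps named in the abstract (prior, likelihood, posterior, smoothing, constant priors, accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a multinomial Naive Bayes model with add-alpha (Laplace) smoothing, runs sequentially in plain Python rather than on Spark, and uses a tiny made-up token list in place of preprocessed tweets. The function names `train`, `classify`, and `accuracy` and the parameters `alpha` and `constant_prior` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, alpha=1.0, constant_prior=False):
    """Estimate log-priors and smoothed log-likelihoods (alpha must be > 0)."""
    classes = sorted(set(labels))
    vocab = sorted({tok for doc in docs for tok in doc})
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    log_prior, log_likelihood = {}, {}
    for c in classes:
        # Constant prior (assumed meaning): every class is equally likely.
        if constant_prior:
            prior = 1.0 / len(classes)
        else:
            prior = class_counts[c] / len(labels)
        log_prior[c] = math.log(prior)

        # Add-alpha smoothing keeps unseen (token, class) pairs above zero.
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            tok: math.log((token_counts[c][tok] + alpha) / denom)
            for tok in vocab
        }
    return log_prior, log_likelihood


def classify(doc, log_prior, log_likelihood):
    """Return the class with the highest (unnormalised) log-posterior."""
    scores = {}
    for c in log_prior:
        # Tokens that never appeared in training are skipped.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][tok] for tok in doc if tok in log_likelihood[c]
        )
    return max(scores, key=scores.get)


def accuracy(docs, labels, log_prior, log_likelihood):
    """Fraction of documents whose predicted class matches the label."""
    hits = sum(
        classify(doc, log_prior, log_likelihood) == label
        for doc, label in zip(docs, labels)
    )
    return hits / len(labels)


# Hypothetical toy data standing in for preprocessed Indonesian tweets.
train_docs = [
    ["senang", "sekali"],
    ["senang", "bahagia"],
    ["marah", "sekali"],
    ["marah", "kesal"],
]
train_labels = ["happy", "happy", "angry", "angry"]
test_docs = [["bahagia", "sekali"], ["kesal"]]
test_labels = ["happy", "angry"]

log_prior, log_likelihood = train(train_docs, train_labels)
print(accuracy(test_docs, test_labels, log_prior, log_likelihood))
```

Working in log space avoids numeric underflow when many small likelihoods are multiplied. In a parallel setting such as Spark, the per-class token counting is the part that would typically be distributed, but how the study partitions the work is not stated in the abstract.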