Imbalanced data is a challenge for the performance of classification algorithms. It arises when one class (the majority class) greatly outnumbers the other (the minority class). As a result, models tend to achieve high accuracy on the majority class, often at the expense of the minority class. Imbalanced data can occur in any type of data, including data from Twitter. Twitter is a widely used social media platform where users express opinions on many topics, including the candidates for the 2024 presidential election of the Republic of Indonesia. Tweet data was collected from October 8, 2022, to January 10, 2023. The dataset contains 34,962 tweets for Anies Baswedan, 39,796 tweets for Ganjar Pranowo, and 12,398 tweets for Prabowo Subianto. These tweets are classified into positive and negative sentiment using three classification algorithms: Decision Tree, Naïve Bayes, and Deep Learning. The tweets were scraped from Twitter users and preprocessed using the RapidMiner tool. The best performance was obtained on the Prabowo Subianto dataset with the Deep Learning model, which reached an accuracy of 85.42%, precision of 63.30%, recall of 91.77%, and AUC of 0.867.
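To illustrate why accuracy alone can be misleading on imbalanced data, the following is a minimal Python sketch that computes accuracy, precision, and recall from confusion-matrix counts. The function name `binary_metrics` and all counts are hypothetical, chosen only for illustration; they are not taken from the tweet datasets or from the RapidMiner workflow used in this study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts.

    tp/fp/fn/tn are counts for the positive (here: minority) class.
    Precision and recall fall back to 0.0 when their denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical imbalanced set: 900 majority-class and 100 minority-class examples.

# A model that always predicts the majority class never finds a minority example.
baseline = binary_metrics(tp=0, fp=0, fn=100, tn=900)

# A model that recovers most minority examples at the cost of some false alarms.
sensitive = binary_metrics(tp=80, fp=120, fn=20, tn=780)

for name, scores in (("baseline", baseline), ("sensitive", sensitive)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this hypothetical case the majority-only baseline has the higher accuracy even though its recall on the minority class is zero, which is why precision, recall, and AUC are reported alongside accuracy.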
Copyrights © 2023