Budi Prasetiyo
Computer Science Department, FMIPA, Universitas Negeri Semarang

Published : 6 Documents

Found 3 Documents
Journal : Recursive Journal of Informatics

Sentiment Analysis on Twitter Social Media Regarding Covid-19 Vaccination with Naive Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT)
Angga Riski Dwi Saputra; Budi Prasetiyo
Recursive Journal of Informatics Vol. 2 No. 2 (2024): September 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/7h63ma50

Abstract

The Covid-19 vaccine is an important tool for stopping the Covid-19 pandemic; however, the public has voiced both support for and opposition to it. Purpose: The public conveys these responses in many ways, one of which is social media such as Twitter. Responses regarding Covid-19 vaccination can be analyzed and categorized as positive, neutral, or negative in sentiment. Methods: This study performs sentiment analysis of tweets about Covid-19 vaccination using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. The data consist of 29,447 public English-language tweets about Covid-19 vaccination. Result: The analysis begins with preprocessing, which normalizes and cleans the dataset before classification. Words are then vectorized with TF-IDF, and the data are classified using NBC and BERT. Classification accuracy is 73% for NBC and 83% for BERT. Novelty: A direct comparison between a classical model such as NBC and a modern deep learning model such as BERT offers new insights into the advantages and disadvantages of both approaches in processing Twitter data. Additionally, this study proposes temporal sentiment analysis, which allows changes in public sentiment regarding vaccination to be evaluated over time. Another innovation is a hybrid approach to data cleansing that combines traditional methods with the natural language processing capabilities of BERT, which more effectively addresses typical Twitter data issues such as slang and spelling errors. Finally, this research extends sentiment classification to the multi-label setting, identifying more specific sentiment categories such as trust, fear, or doubt, which provides a deeper understanding of public opinion.
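
The TF-IDF and Naive Bayes steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the four toy documents, the two-class label set, the whitespace tokenizer, the smoothed IDF formula, and the multinomial variant with Laplace smoothing (alpha = 1) are all assumptions made for illustration, and the BERT classifier is not covered.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(doc, idf):
    # Raw term count times IDF; terms unseen during fitting are ignored.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}

def train_nb(docs, labels, idf, alpha=1.0):
    # Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing.
    class_counts = Counter(labels)
    weights = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weights[label][term] += w
    vocab = list(idf)
    model = {}
    for label, n_docs in class_counts.items():
        total = sum(weights[label].values()) + alpha * len(vocab)
        log_like = {t: math.log((weights[label][t] + alpha) / total) for t in vocab}
        model[label] = (math.log(n_docs / len(docs)), log_like)
    return model

def predict(doc, idf, model):
    vec = tfidf(doc, idf)
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)

# Hypothetical toy data, not taken from the study's tweet dataset.
docs = ["vaccine works great", "great safe vaccine",
        "vaccine bad side effects", "bad scary vaccine"]
labels = ["positive", "positive", "negative", "negative"]

idf = fit_idf(docs)
model = train_nb(docs, labels, idf)
print(predict("great safe", idf, model))
print(predict("bad scary", idf, model))
```

Extending the sketch to the three sentiment classes named in the abstract only requires adding documents with a third label.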
Hyperparameter Tuning of Long Short-Term Memory Model for Clickbait Classification in News Headlines
Grace Yudha Satriawan; Budi Prasetiyo
Recursive Journal of Informatics Vol. 2 No. 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/19mypm04

Abstract

The information available on the internet today is diverse and moves very quickly. Numerous online media outlets, including news portals that provide up-to-date information, make it easier for the general public to obtain information. Many news portals earn advertising revenue through pay-per-click schemes, which encourage article writers to use clickbait techniques to attract visitors. The negative effects of clickbait include a decline in journalism quality and the spread of hoaxes. Text classification can address this problem by identifying clickbait in news titles. One method for text classification is the neural network. Artificial neural networks use algorithms that independently adjust the weights of their input coefficients, which makes them highly effective for modeling non-linear statistical data. Neural network algorithms, especially Long Short-Term Memory (LSTM), have been widely used across natural language processing tasks, including text classification, with satisfactory results. A neural network model's performance can be improved by adjusting its hyperparameters, which are parameters that cannot be learned from the data and must be defined before training. In this research, an LSTM model was used to classify clickbait in news titles. Sixteen neural network models were trained, each with a different hyperparameter configuration, and hyperparameter tuning was carried out using the random search algorithm. The dataset was CLICK-ID, published by William & Sari, 2020[1], with a total of 15,000 annotated entries. The results show that the developed LSTM model achieves a validation accuracy of 0.8030, higher than that reported by William & Sari, and a validation loss of 0.4876. Using this model, the researchers were able to classify clickbait in news titles with fairly good accuracy. Purpose: This study aims to develop and evaluate an LSTM model with hyperparameter tuning for clickbait classification of news headlines. It also aims to compare the performance of simple LSTM and bidirectional LSTM on this task. Methods: This study uses the CLICK-ID dataset and applies different text preprocessing techniques. The dataset was then used to build and train 16 LSTM models with different hyperparameters, which were evaluated using validation accuracy and loss. Random search was used for hyperparameter tuning. Result: The best model for clickbait classification of news headlines is a bidirectional LSTM with one layer, 64 units, a dropout rate of 0.2, and a learning rate of 0.001. This model achieves a validation accuracy of 0.8030 and a validation loss of 0.4876. The results also show that hyperparameter tuning using random search can improve the performance of the LSTM models by avoiding zero probabilities and finding the optimal values for the hyperparameters. Novelty: This study compares and analyzes different text preprocessing methods and different model configurations to find the best model for clickbait classification of news headlines. It also uses hyperparameter tuning to find the optimal hyperparameter values for that model.
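
The random search procedure mentioned in the abstract can be sketched as follows. This is an illustrative outline only: the candidate values in `SEARCH_SPACE` are hypothetical (the abstract reports only the best configuration, not the full search space), and `toy_score` is a placeholder that stands in for the validation accuracy a real LSTM training run would return.

```python
import random

# Hypothetical search space; only the reported best values (1 layer, 64 units,
# 0.2 dropout, 0.001 learning rate, bidirectional) come from the abstract.
SEARCH_SPACE = {
    "bidirectional": [False, True],
    "layers": [1, 2],
    "units": [32, 64, 128],
    "dropout": [0.2, 0.3, 0.5],
    "learning_rate": [0.01, 0.001, 0.0001],
}

def sample_config(rng):
    # Draw one value per hyperparameter, uniformly at random.
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(evaluate, n_trials=16, seed=0):
    # Evaluate n_trials distinct random configurations and return the best one.
    rng = random.Random(seed)
    seen, trials = set(), []
    while len(trials) < n_trials:
        config = sample_config(rng)
        key = tuple(sorted(config.items()))
        if key in seen:
            continue  # skip configurations that were already tried
        seen.add(key)
        trials.append((evaluate(config), config))
    best = max(trials, key=lambda trial: trial[0])
    return best, trials

def toy_score(config):
    # Placeholder objective, NOT a real model: a real run would train an LSTM
    # with this configuration and return its validation accuracy.
    return -(config["dropout"] + config["learning_rate"])

best, trials = random_search(toy_score)
print(len(trials), "configurations evaluated")
print("best score and configuration:", best)
```

Fixing the seed makes the sampled configurations reproducible, which helps when comparing runs.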
Sentiment Analysist of the TPKS Law on Twitter Using InSet Lexicon with Multinomial Naïve Bayes and Support Vector Machine Based on Soft Voting
Salsabila Rahadatul Aisy; Budi Prasetiyo
Recursive Journal of Informatics Vol. 1 No. 2 (2023): September 2023
Publisher : Universitas Negeri Semarang

DOI: 10.15294/zbxqcc36

Abstract

The Indonesian Sexual Violence Law (TPKS Law) regulates forms of sexual violence. The TPKS Law drew both support and opposition during drafting and was officially ratified on April 12, 2022. Support and opposition have persisted since ratification, and the law's implementation requires supervision. Purpose: This study was conducted to examine the application and accuracy of soft voting over the multinomial naïve Bayes and support vector machine algorithms, and to gauge public opinion on the TPKS Law as a supporting tool for evaluating the law. Methods/Study design/approach: The InSet lexicon is used for labeling, and classification uses soft voting over the multinomial naïve Bayes and support vector machine algorithms. Result/Findings: With 10-fold cross-validation, soft voting achieves an accuracy of 84.31% using a weight ratio of 1:3 for multinomial naïve Bayes and support vector machine, respectively. Soft voting obtains better accuracy than either standalone predictor and works well for sentiment analysis of the TPKS Law. Novelty/Originality/Value: This study uses two combined lexicons (the Colloquial Indonesian lexicon and the InaNLP formalization dictionary) in the normalization process and uses the InSet lexicon for automatic labeling in sentiment analysis of the TPKS Law.
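
Weighted soft voting, as described in the abstract, averages the class probabilities of the individual classifiers using fixed weights. The sketch below is a minimal illustration rather than the authors' code: the probability values are invented, the 1:3 ratio is read in the order the abstract lists the models (multinomial naive Bayes first, support vector machine second), and the SVM is assumed to have been calibrated so that it outputs class probabilities.

```python
def soft_vote(prob_sets, weights):
    # Weighted average of per-class probabilities across classifiers.
    total = sum(weights)
    classes = prob_sets[0].keys()
    combined = {
        c: sum(w * probs[c] for probs, w in zip(prob_sets, weights)) / total
        for c in classes
    }
    # The predicted label is the class with the highest combined probability.
    return max(combined, key=combined.get), combined

# Hypothetical probabilities for a single tweet (not results from the study).
mnb_probs = {"positive": 0.70, "negative": 0.30}
svm_probs = {"positive": 0.40, "negative": 0.60}

label, combined = soft_vote([mnb_probs, svm_probs], weights=[1, 3])
print(label, combined)
```

In this toy case the two classifiers disagree, and the heavier weight on the SVM decides the outcome.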