Journal of Applied Data Sciences
Vol 5, No 4: DECEMBER 2024

Environment Sentiment Analysis of Bali Coffee Shop Visitors Using Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer 2 (GPT2) Model

Yuniari, Ni Putu Widya
Iswari, Ni Made Satvika
Kumara, I Made Surya



Article Info

Publish Date
15 Oct 2024

Abstract

Bali is one of the Indonesian provinces richest in natural and cultural wealth, and coffee is one of the commodities that supports it. Bali coffee is not only a gastronomic identity but also a cultural one, which gives it added value that can be developed into various business lines. One promising derivative business is the coffee shop. To sustain these favorable conditions, however, shops need to ensure that good quality reaches consumers. One way to do this is to analyze customer reviews, and one of the most popular methods for doing so is sentiment analysis. This technique allows businesses to analyze customer reviews on social media, and the results can serve as feedback for maintaining and improving quality and good relationships with customers. This research aims to create a machine learning model that classifies customer reviews of several coffee shops in Bali into three labels: positive, negative, and neutral. The methods used are scraping, cleaning, stopword removal, embedding, undersampling, and modeling. The algorithms used are Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer 2 (GPT2). The performance metrics used are precision, recall, accuracy, and loss. The research succeeded in creating a sentiment analysis model for coffee shop customers in Bali. Without undersampling, the BERT model obtained an accuracy of 78% with a loss of 0.27 at the 10th iteration; with undersampling, it obtained an accuracy of 32.85% with a loss of 0.16 at the 10th iteration. The GPT2 model without undersampling obtained an accuracy of 78% with a loss of 0.25 at the 10th iteration; with undersampling, it obtained an accuracy of 32.85% with a loss of 0.15 at the 10th iteration.
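The abstract does not say how undersampling or the metrics were implemented, so the following is only a minimal sketch under stated assumptions: random undersampling down to the size of the smallest class, with accuracy and per-class precision and recall computed from scratch. The function names `undersample` and `classification_report`, the fixed seed, and the toy reviews are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

# The three sentiment labels named in the abstract.
LABELS = ("positive", "negative", "neutral")


def undersample(reviews, labels, seed=0):
    """Randomly drop examples so every class matches the smallest class.

    Assumption: plain random undersampling; the paper may use another variant.
    """
    by_label = defaultdict(list)
    for text, label in zip(reviews, labels):
        by_label[label].append(text)
    target = min(len(items) for items in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label, items in by_label.items():
        for text in rng.sample(items, target):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return balanced


def classification_report(y_true, y_pred, labels=LABELS):
    """Return overall accuracy plus per-class precision and recall."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    report = {"accuracy": correct / len(y_true)}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        report[label] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return report


# Toy, made-up data purely for illustration (not from the study).
toy_reviews = ["review %d" % i for i in range(10)]
toy_labels = ["positive"] * 6 + ["negative"] * 3 + ["neutral"] * 1
toy_balanced = undersample(toy_reviews, toy_labels)

toy_true = ["positive", "positive", "negative", "neutral"]
toy_pred = ["positive", "negative", "negative", "neutral"]
toy_report = classification_report(toy_true, toy_pred)
```

On the toy data, `toy_balanced` holds one review per label because the smallest class has a single example. This illustrates the trade-off of the technique: class balance is bought by discarding data from the larger classes.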

Copyrights © 2024






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...