JURNAL MEDIA INFORMATIKA BUDIDARMA
Vol 7, No 3 (2023): July 2023

Application of the CRISP-DM Method to the Classification of Visitor Review Data for the Lake Toba Destination Using the Naïve Bayes Classifier (NBC) and Decision Tree (DT) Algorithms

Yerik Afrianto Singgalen (Universitas Katolik Indonesia Atma Jaya, Jakarta)



Article Info

Publish Date
31 Jul 2023

Abstract

This study implements a classification method using the Naïve Bayes Classifier (NBC) and Decision Tree (DT) algorithms on text data from Lake Toba visitor reviews. The work follows the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology, which comprises six stages: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. In the business understanding stage, the discussion focuses on the tourism sector, specifically tourist perceptions of the quality of products and services at Lake Toba tourist destinations. In the data understanding stage, the review data were sourced from the Tripadvisor website, which contained 858 reviews with the following rating distribution: 8 abysmal, 22 poor, 81 neutral, 304 good, and 443 excellent. In the data preparation stage, data cleansing left 382 records to be processed, which were split into 70 percent training data and 30 percent test data. In the modeling stage, the performance of the NBC and DT algorithms was evaluated with and without the SMOTE upsampling operator. The comparison shows that the best-performing model is DT with SMOTE upsampling, with an accuracy of 98.27 percent, a precision of 98.83 percent, a recall of 97.71 percent, an F-measure of 98.26 percent, and an AUC of 0.982. In the evaluation stage, ranking the five most frequently used terms in the Lake Toba visitor review data highlighted the importance of excellent service (quality human resources) and supporting infrastructure (tourism facilities and infrastructure). In the deployment stage, the development of attractions, accessibility, lodging, and tourism-supporting amenities needs to be balanced to generate visit intention and revisit motivation for Lake Toba.
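
The figures in the abstract can be related to one another with a few lines of standard-library Python. The sketch below is illustrative only and is not the author's pipeline: the function names (`split_70_30`, `binary_metrics`, `f_measure`), the random seed, and the rounding rule used for the 70/30 cut are assumptions, and the confusion-matrix counts in the usage line are made up for demonstration. It checks that the five rating classes sum to 858, shows one way a 70/30 split of 382 records could be sized, and recomputes the F-measure as the harmonic mean of the reported precision and recall.

```python
import random

# Rating distribution reported in the abstract (Tripadvisor reviews).
rating_counts = {"abysmal": 8, "poor": 22, "neutral": 81, "good": 304, "excellent": 443}
total_reviews = sum(rating_counts.values())


def split_70_30(items, seed=0):
    """Shuffle items and split them into 70 percent training and 30 percent test data.

    The seed and the use of round() for the cut point are assumptions;
    the abstract gives only the percentages.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure from binary confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure(precision, recall),
    }


# 382 cleaned records, represented here only by their indices.
train, test = split_70_30(range(382))

# F-measure recomputed from the precision and recall reported for DT with SMOTE.
recomputed_f = f_measure(98.83, 97.71)

# Hypothetical confusion counts, only to show how the metrics are derived.
example = binary_metrics(tp=8, fp=2, fn=2, tn=8)
```

Under these assumptions the split yields 267 training and 115 test records, and the recomputed F-measure comes out at roughly 98.27 percent, which agrees with the reported 98.26 percent up to rounding of the inputs.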

Copyrights © 2023






Journal Info

Abbrev

mib

Subject

Computer Science & IT, Control & Systems Engineering, Electrical & Electronics Engineering

Description

Decision Support System, Expert System, Informatics technique, Information System, Cryptography, Networking, Security, Computer Science, Image Processing, Artificial Intelligence, Steganography etc. (related to informatics and computer ...