Pardamean Simanjuntak, Daniel Sintong
Unknown Affiliation

Published: 2 Documents
Articles


Peningkatan Performa Naive Bayes dengan Fitur Chi-Square pada Analisis Sentimen Komentar Pengguna Aplikasi Netflix (Improving Naive Bayes Performance with Chi-Square Features in Sentiment Analysis of Netflix Application User Comments)
Jusia, Pareza Alam; Pahlevi, Riza; Pardamean Simanjuntak, Daniel Sintong; Jasmir
Bulletin of Computer Science Research Vol. 5 No. 4 (2025): June 2025
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/bulletincsr.v5i4.532

Abstract

This study discusses sentiment analysis using the Naïve Bayes algorithm combined with Chi-Square feature selection. Its purpose is to determine the effect of Chi-Square feature selection on the performance of the Naïve Bayes algorithm in classifying document sentiment. The data were taken from user comments on the Netflix application. Testing was carried out by classifying document sentiment with and without Chi-Square feature selection, and the results were evaluated using the accuracy, precision, recall, and F1-score metrics. The results show that adding Chi-Square (CS) feature selection to Naïve Bayes (NB) significantly improves all evaluation metrics, especially recall and F1-score, indicating that feature selection helps the model capture the data more effectively. The combination of NB + CS with a 70:30 train-test split gives the best results, making it the optimal configuration.
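
The pipeline the abstract describes (bag-of-words features, Chi-Square feature selection, a Naïve Bayes classifier, and a 70:30 train-test split) can be illustrated with a short scikit-learn sketch. This is not the authors' code: the toy comments, the k=10 feature budget, and the random seed are placeholder assumptions standing in for the actual Netflix comment dataset and tuning.

```python
# Minimal sketch (not the paper's code): Multinomial Naive Bayes with and
# without Chi-Square feature selection on a 70:30 split.
# The comments/labels below are invented placeholders for the Netflix dataset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

comments = [
    "great app, love the new shows",
    "smooth streaming and a clean interface",
    "excellent catalog, very satisfied",
    "keeps crashing after the latest update",
    "too expensive and full of bugs",
    "terrible buffering, very disappointed",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

# Bag-of-words term counts.
X = CountVectorizer().fit_transform(comments)

# 70:30 train-test split, the configuration reported as best in the study.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.30, stratify=labels, random_state=42
)

# Baseline: Naive Bayes on all features.
nb = MultinomialNB().fit(X_train, y_train)
print(classification_report(y_test, nb.predict(X_test), zero_division=0))

# NB + CS: keep the k terms most dependent on the sentiment label
# (k=10 is an arbitrary choice here, not the paper's setting).
selector = SelectKBest(chi2, k=10).fit(X_train, y_train)
nb_cs = MultinomialNB().fit(selector.transform(X_train), y_train)
print(classification_report(y_test, nb_cs.predict(selector.transform(X_test)),
                            zero_division=0))
```

In practice the number of retained Chi-Square features would be tuned, and the accuracy, precision, recall, and F1-score columns of the report correspond to the metrics compared in the study.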
Word Embedding Feature for Improvement Machine Learning Performance in Sentiment Analysis Disney Plus Hotstar Comments
Jasmir, Jasmir; Nurhadi, Nurhadi; Rohaini, Eni; Pahlevi B, M Riza; Pardamean Simanjuntak, Daniel Sintong
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol. 10 No. 2 (2024): June
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v10i2.28799

Abstract

In this research we apply several machine learning methods and word embedding features to process social media data, specifically comments on the Disney Plus Hotstar application. The word embedding features used are Word2Vec, GloVe, and FastText. Our aim is to evaluate the impact of these features on the classification performance of machine learning methods such as Naive Bayes (NB), K-Nearest Neighbor (KNN), and Random Forest (RF). NB is simple and efficient but highly sensitive to feature selection. KNN has known weaknesses such as sensitivity to the choice of k, costly computation, high memory requirements, and susceptibility to irrelevant attributes. RF, in turn, can see its evaluation scores change significantly with only a slight change in the data. Feature selection in text classification is therefore crucial for enhancing scalability, efficiency, and accuracy. Our testing results indicate that KNN achieved the highest accuracy both before and after feature selection, and the FastText feature led to the best KNN performance, yielding balanced accuracy, precision, recall, and F1-score values.
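
As a rough illustration of the embedding-plus-classifier setup the abstract describes, the sketch below trains a small gensim FastText model, pools word vectors into comment vectors by averaging, and classifies them with KNN. It is not the authors' pipeline: the toy comments, vector_size=50, n_neighbors=3, and the mean-pooling step are assumptions standing in for the real Disney Plus Hotstar data and the paper's configuration.

```python
# Minimal sketch (not the paper's code): FastText embedding features + KNN.
# The comments/labels below are invented placeholders for the Disney Plus
# Hotstar dataset; vector_size, epochs, and n_neighbors are arbitrary choices.
import numpy as np
from gensim.models import FastText
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

comments = [
    "love the series selection on hotstar",
    "streaming quality is excellent",
    "subtitles work well and the app feels fast",
    "the app keeps freezing on my phone",
    "login errors every time after the update",
    "poor video quality and constant buffering",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

tokens = [c.lower().split() for c in comments]

# Train a small FastText model; its subword n-grams can also embed unseen words.
ft = FastText(sentences=tokens, vector_size=50, window=3, min_count=1, epochs=30)

# Pool each comment into one fixed-length vector by averaging its word vectors.
X = np.array([np.mean([ft.wv[w] for w in doc], axis=0) for doc in tokens])

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.30, stratify=labels, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print(classification_report(y_test, knn.predict(X_test), zero_division=0))
```

Mean pooling is only one way to turn word embeddings into document features; pretrained Word2Vec or GloVe vectors, or a weighted average, could be substituted at the same step to compare the three embedding features the paper evaluates.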