Articles

Enhancing Review Processing in the Video Game Adaptation Domain through VADER and Rating-Based Labeling using SVM Sajmira, Danita Divka; Umam, Khothibul; Handayani, Maya Rini
Jurnal Sisfokom (Sistem Informasi dan Komputer) Vol. 14 No. 3 (2025): JULY
Publisher : ISB Atma Luhur

DOI: 10.32736/sisfokom.v14i3.2409

Abstract

The adaptation of video games into films or television series has increasingly become a prominent trend in the entertainment sector, often eliciting diverse reactions from audiences. A prime example is The Last of Us, a video game adaptation series that generated substantial online discussion and sentiment, and serves as the specific case study in this research. Sentiment patterns found in audience reviews of The Last of Us on IMDb are analyzed using a domain-specific classification framework tailored to the language characteristics of entertainment media. A key issue addressed is the discrepancy between numerical ratings and the sentiment conveyed in review texts, which may lead to inconsistent labeling. The study employs a machine learning technique, Support Vector Machine (SVM), coupled with two distinct labeling methods: manual labeling based on IMDb ratings, and automatic labeling using the lexicon-driven VADER tool. A total of 2,017 English reviews of The Last of Us were gathered via web scraping from IMDb, followed by preprocessing, TF-IDF feature extraction, and hyperparameter optimization using RandomizedSearchCV. The results show that the SVM model trained on VADER-labeled data achieved an accuracy of 0.97, outperforming the model trained on manually labeled data at 0.79. Lexicon-based automatic labeling provides more consistent and reliable sentiment classification, particularly in specialized domains like video game adaptation reviews. Integrating VADER labeling with SVM enhances sentiment analysis effectiveness and offers practical value for media analytics, content creation, and audience insight research.
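The labeling discrepancy described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: `TOY_LEXICON` is a made-up stand-in rather than the VADER lexicon, and the rating cutoff (7 or more out of 10 counts as positive) is an assumption, since the abstract does not state the rule used.

```python
# Toy illustration only: TOY_LEXICON is a made-up stand-in, NOT the VADER lexicon,
# and the rating cutoff (>= 7 of 10 is positive) is an assumption, not the paper's rule.
TOY_LEXICON = {"great": 2.0, "love": 2.5, "brilliant": 3.0,
               "boring": -2.0, "awful": -3.0, "disappointing": -2.5}

def label_from_rating(rating: int, positive_from: int = 7) -> str:
    """Rating-based label on a 1-10 scale."""
    return "positive" if rating >= positive_from else "negative"

def label_from_lexicon(text: str) -> str:
    """Lexicon-based label: sum the valence of every known word."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    score = sum(TOY_LEXICON.get(w, 0.0) for w in words)
    return "positive" if score >= 0 else "negative"

def find_disagreements(reviews):
    """Return the reviews whose rating label and text label differ."""
    return [(text, rating) for text, rating in reviews
            if label_from_rating(rating) != label_from_lexicon(text)]

reviews = [
    ("Brilliant acting, I love it.", 9),
    ("Boring and disappointing finale.", 3),
    ("Great cast, great sets, love the music.", 4),  # warm text, low rating
]
mismatched = find_disagreements(reviews)
print(mismatched)
```

Only the third review is flagged, because its text reads as positive while its rating maps to negative. Reviews like this are the ones where the two labeling schemes would give a classifier conflicting targets.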
Sentiment Classification of MyPertamina Reviews Using Naïve Bayes and Logistic Regression Dwi Yuni Saraswati; Handayani, Maya Rini; Umam, Khothibul; Mustofa, Mokhamad Iklil
Journal of Applied Informatics and Computing Vol. 9 No. 4 (2025): August 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i4.9723

Abstract

This research conducts a comparative evaluation of the effectiveness of the Naïve Bayes and Logistic Regression algorithms in mapping public perceptions of the MyPertamina application on the Google Play Store. The data consists of 2,000 user reviews obtained through a scraping technique. The research steps include labeling the reviews as positive or negative, followed by pre-processing and TF-IDF weighting. The dataset was systematically divided into two parts, with 80% allocated for model training and the remaining 20% for evaluation. The Naïve Bayes and Logistic Regression models were implemented using the Python programming language and evaluated based on accuracy, precision, recall, and F1-score metrics. The analysis shows that Logistic Regression achieved an accuracy of 86%, while Naïve Bayes achieved 81%. Logistic Regression demonstrated superior performance as it effectively captures linear relationships between features in TF-IDF representations and provides a more balanced outcome in terms of precision and recall. In contrast, Naïve Bayes is more influenced by high-frequency word distributions and does not account for feature correlations, which can limit its performance in certain contexts. Therefore, Logistic Regression is considered more suitable for sentiment classification tasks in this study. These findings emphasize the importance of selecting appropriate algorithms for sentiment analysis and suggest opportunities for future research using alternative methods to enhance predictive accuracy.
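The TF-IDF weighting step mentioned in this abstract can be sketched in a few lines. This is a minimal illustration using length-normalized term frequency and an unsmoothed `log(N / df)` inverse document frequency; the study's actual library settings are not given, and the three sample documents are invented.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document.

    Uses length-normalized term frequency times log(N / df); libraries
    often add smoothing and vector normalization, which this sketch omits.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: (c / len(tokens)) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return weights

# Invented sample documents, not taken from the study's dataset.
docs = ["app keeps crashing", "app works fine", "payment keeps failing"]
vectors = tfidf(docs)
print(vectors[0])
```

In the first document, "crashing" receives a higher weight than "app" because it appears in only one document, which is the property that lets a linear classifier focus on discriminative words.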
IKN Public Opinion on TikTok Before and After Efficiency Policy: CNN-LSTM on Imbalanced Data Sufiya, Ikhwanus; Umam, Khotibul; Handayani, Maya Rini
Jurnal Pendidikan Informatika (EDUMATIC) Vol 9 No 2 (2025): Edumatic: Jurnal Pendidikan Informatika
Publisher : Universitas Hamzanwadi

DOI: 10.29408/edumatic.v9i2.30123

Abstract

Growing polarization in Ibu Kota Nusantara (IKN) stems from conventional sentiment analysis tools’ inability to decode TikTok’s contextual complexities, particularly multimodal sarcasm and vernacular-policy relationships (e.g., mangkrak for project cancellations). This study develops a policy-aware hybrid model (CNN-BiLSTM + Policy Knowledge Graph) to address that gap, enabling three things: quantification of youth sentiment after IKN’s 73.3% budget cuts, mapping of social criticism onto socio-political reality, and evidence-based interventions that mitigate polarization around strategic projects in the Global South. Using the Knowledge Discovery in Databases framework, we analyzed 2,950 high-engagement TikTok comments (≥10 interactions) from verified accounts (@Polindo.id and @geraldvincentt) across two periods: pre-policy (June-August 2024) and post-policy (January-March 2025). Methodologically, slang normalization, stemming, and minority-class weighting (15×) preceded classification via a CNN-BiLSTM architecture integrated with Policy Knowledge Graphs. Results showed an 18.88% reduction in negative sentiment (83.2%-8.7%), model accuracy of 94.13% (AUC-PR 0.91), and strong correlations between vernacular terms (e.g., mandek [stagnation]) and policy outcomes (r = -0.89; p < 0.01), with investor asing mentions surging 463% post-policy. These results validate deep learning-enabled social listening for real-time policy diagnostics, with implications for fiscal transparency dashboards, algorithmic bias mitigation, and context-driven policy communication prioritizing vulnerable groups in SDG infrastructure governance.
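The 15× minority-class weighting mentioned in this abstract is commonly implemented as a weighted loss. The sketch below assumes a binary setting in which label 1 is the minority class and the weight multiplies that class's cross-entropy term; the abstract does not say how the weighting was applied, so both points are assumptions.

```python
import math

def weighted_bce(y_true, p_pred, minority_weight=15.0):
    """Mean binary cross-entropy where minority-class (label 1) errors
    count minority_weight times as much as majority-class (label 0) errors."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -minority_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)

# A model that predicts "majority" for everything, including the one minority item.
y_true = [0, 0, 0, 1]
p_pred = [0.1, 0.1, 0.1, 0.1]
plain = weighted_bce(y_true, p_pred, minority_weight=1.0)
weighted = weighted_bce(y_true, p_pred, minority_weight=15.0)
print(round(plain, 3), round(weighted, 3))
```

With the weight applied, missing the single minority example dominates the loss, so training is pushed away from the trivial "always predict the majority class" solution.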
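Implementasi Algoritma Random Forest dalam Klasifikasi Ulasan Pengunjung Mall Semarang untuk Pengambilan Keputusan Layanan [Implementation of the Random Forest Algorithm in Classifying Semarang Mall Visitor Reviews for Service Decision-Making] Maizaliyanti, Annisa; Umam, Khothibul; Yuniarti, Wenty Dwi; Handayani, Maya Rini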
Jurnal Pendidikan Informatika (EDUMATIC) Vol 9 No 2 (2025): Edumatic: Jurnal Pendidikan Informatika
Publisher : Universitas Hamzanwadi

DOI: 10.29408/edumatic.v9i2.30379

Abstract

Visitor preferences for malls in Semarang are not optimal because online reviews have not been fully utilized in decision making. This research classifies the sentiment of Google Maps reviews from 13 malls in Semarang, 2,600 reviews in total. Labels are assigned manually based on ratings: ratings 1–3 are treated as negative reviews and 4–5 as positive reviews. The classification method is Random Forest, chosen because its ensemble (bagging) approach provides optimal results. The research process includes data collection, labeling, cleaning, data splitting, classification, and model evaluation. The data are imbalanced and dominated by positive reviews, so the Synthetic Minority Over-sampling Technique (SMOTE) was applied. Overall accuracy remained the same at 84% before and after SMOTE. However, the model's performance in detecting negative reviews improved, with recall rising from 27% to 44% and F1-score from 0.40 to 0.52, although these values are still relatively low. Java Supermall Semarang is the mall with the best reviews, with classification accuracy reaching 90%. The model is better at recognizing positive reviews but less reliable for negative reviews, so its use as a basis for decision making calls for caution. This research opens up opportunities for further development, including the use of other models such as BERT, which are stronger at understanding context and language in reviews.
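The core idea of SMOTE, as applied in this abstract, is to synthesize new minority-class points by interpolating between existing minority samples and their nearest neighbours. The sketch below shows only that interpolation step on invented 2-D points; the function name `smote_like` and its parameters are hypothetical, and a real pipeline would typically apply a library implementation to the training split's feature vectors.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of base among the other minority samples.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical 2-D feature vectors for the negative (minority) class.
negatives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(negatives, n_new=4)
print(new_points)
```

Every synthetic point lies on a line segment between two real minority samples, which is why oversampling of this kind can raise minority recall without changing overall accuracy much.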
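Klasifikasi sentimen pada ulasan pengguna aplikasi Cryptocurrency di Google Play Store menggunakan algoritma Decision Tree [Sentiment classification of cryptocurrency application user reviews on the Google Play Store using the Decision Tree algorithm] Tsuroyya, Kamiliya; Umam, Khothibulu; Yuniarti, Wenty Dwi; Handayani, Maya Rini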
AITI Vol 22 No 2 (2025)
Publisher : Fakultas Teknologi Informasi Universitas Kristen Satya Wacana

DOI: 10.24246/aiti.v22i2.279-293

Abstract

Cryptocurrency has become a trend in digital investment. The Pintu application exemplifies the use of digital technology for trading cryptocurrency assets. Reviews from the Google Play Store serve as an important source of data for understanding the opinions of Pintu application users. This study analyzes the sentiment of Pintu application users, based on Google Play Store data, using the Decision Tree and Random Forest algorithms. The approach involves collecting user reviews and ratings from the Google Play Store. The data are then labeled as positive or negative, cleaned, processed, and analyzed with both algorithms. The Decision Tree reached an accuracy of 0.90, precision of 0.91, and recall of 0.95, while the Random Forest reached an accuracy of 0.88, precision of 0.87, and recall of 0.95. The Decision Tree is therefore slightly ahead in this text classification task, although the difference in accuracy between the two methods is small. Overall, the sentiment analysis indicates that users respond positively to the Pintu application.
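The abstract reports precision and recall but not F1. As a quick check, F1 can be derived from those two figures as their harmonic mean; the values printed below are computed here from the reported numbers and do not appear in the source.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
reported = {"Decision Tree": (0.91, 0.95), "Random Forest": (0.87, 0.95)}
derived_f1 = {name: round(f1_score(p, r), 3) for name, (p, r) in reported.items()}
print(derived_f1)
```

Since recall is identical for both models, the gap in derived F1 comes entirely from the Decision Tree's higher precision.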
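Analisis Sentimen Ulasan Mobile Legends di Google Play Store dan YouTube Menggunakan Pelabelan Otomatis Roberta dan Klasifikasi Random Forest [Sentiment Analysis of Mobile Legends Reviews on the Google Play Store and YouTube Using RoBERTa Automatic Labeling and Random Forest Classification] Muhammad Rafid Pratama; Handayani, Maya Rini; Yuniarti, Wenty Dwi; Khothibul Umam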
Jurnal Sistem Informasi Vol. 12 No. 2 (2025)
Publisher : Universitas Serang Raya

DOI: 10.30656/jsii.v12i2.10459

Abstract

The growth of the mobile gaming industry has driven an increase in the number of users and reviews for many popular titles, among them Mobile Legends: Bang Bang. This study aims to analyze user perceptions of the Mobile Legends application through reviews obtained from the Google Play Store and YouTube. The methods include data collection by crawling, automatic labeling with the RoBERTa model for sentiment classification (positive, negative, and neutral), and modeling with the Random Forest algorithm. The dataset consists of 1,400 entries from the Google Play Store and several hundred entries from YouTube, all of which were preprocessed. The model was evaluated using precision, recall, and F1-score. Testing shows that the model classifies reviews reasonably well, with an accuracy of 80% on the Google Play Store data and 82% on the YouTube data. The model performs strongly in detecting negative and positive reviews, although accuracy for the neutral class remains low. Overall, the Random Forest-based model is reasonably reliable for processing user review data and provides insight into public perceptions of Mobile Legends across platforms.
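Evaluasi Efektivitas Support Vector Machine dan Random Forest dalam Klasifikasi Ulasan Pengguna Aplikasi Streaming Vidio [Evaluating the Effectiveness of Support Vector Machine and Random Forest in Classifying User Reviews of the Vidio Streaming Application] Fastabiqul Khusna; Khothibul Umam; Siti Nur'aini; Maya Rini Handayani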
Jurnal Sistem Informasi Vol. 12 No. 2 (2025)
Publisher : Universitas Serang Raya

DOI: 10.30656/jsii.v12i2.10495

Abstract

The rapid growth of streaming platforms has produced a large volume of user reviews that can serve as input for application development. This study evaluates the performance of the Support Vector Machine (SVM) and Random Forest (RF) algorithms in classifying the sentiment of user reviews of the Vidio application. A total of 1,000 Indonesian-language reviews were collected by web scraping and labeled by star rating, with ratings of 1–2 categorized as negative sentiment and 3–5 as positive sentiment. The reviews went through several preprocessing stages, including text cleaning, tokenization, stopword removal, and stemming, before being converted into a numerical representation using TF-IDF. The dataset was split into 80% training data and 20% test data. Both models were trained and evaluated using accuracy, precision, recall, and F1-score. The results show that SVM performed better, reaching an accuracy of 76.11%, compared with 71.67% for RF. SVM also identified negative-sentiment reviews more effectively. These findings show that SVM is the more suitable choice for sentiment classification of Vidio application reviews, and that it could support automated sentiment analysis and improvements in streaming service quality. The results could be implemented in an automated dashboard that detects user complaints in real time, allowing Vidio's developers to improve the user experience with faster and more accurate responses. Keywords: Text Classification, User Sentiment, Support Vector Machine, Random Forest, Vidio.
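The labeling rule and the 80/20 split described in this abstract can be sketched as follows. The 1–2 negative and 3–5 positive thresholds come from the abstract; the stratified split, the random seed, and the sample star ratings are assumptions made for illustration, since the abstract does not say how the split was drawn.

```python
import random

def label_from_stars(stars: int) -> str:
    """Labeling rule from the abstract: 1-2 stars negative, 3-5 positive."""
    return "negative" if stars <= 2 else "positive"

def stratified_split(items, labels, test_ratio=0.2, seed=42):
    """Split so each class keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    by_label = {}
    for item, label in zip(items, labels):
        by_label.setdefault(label, []).append(item)
    train, test = [], []
    for label, group in by_label.items():
        rng.shuffle(group)
        cut = round(len(group) * test_ratio)
        test += [(x, label) for x in group[:cut]]
        train += [(x, label) for x in group[cut:]]
    return train, test

# Hypothetical star ratings standing in for scraped reviews.
stars = [1, 2, 1, 2, 1, 5, 4, 3, 5, 4, 3, 5, 4, 5, 3, 4, 5, 3, 4, 5]
labels = [label_from_stars(s) for s in stars]
train, test = stratified_split(list(range(len(stars))), labels)
print(len(train), len(test))
```

Stratifying keeps the negative class represented in the test set, which matters when the evaluation focuses on how well negative reviews are detected.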
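EVALUASI HYPERPARAMETER TUNING PADA SUPPORT VECTOR MACHINE (SVM) DALAM KLASIFIKASI ULASAN HOTEL DI TRIPADVISOR [Evaluation of Hyperparameter Tuning for Support Vector Machine (SVM) in Classifying Hotel Reviews on Tripadvisor] Dewi, Fiashintha; Wibowo, Nur Cahyo Hendro; Handayani, Maya Rini; Umam, Khothibul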
JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika) Vol 10, No 3 (2025)
Publisher : STKIP PGRI Tulungagung

DOI: 10.29100/jipi.v10i3.7774

Abstract

Advances in technology have made it much easier for travelers to access information about booking hotel rooms. As a result, reviews from other users are very important for finding the place they want. This study analyzes traveler reviews of hotels on Tripadvisor. Tripadvisor is one of the world's largest travel guidance platforms, helping travelers plan and enjoy satisfying trips. The data were taken from the Hugging Face website and then preprocessed. The dataset contains 20,491 reviews, consisting of 15,093 positive reviews and 5,938 negative reviews. The aim of this research is to evaluate the performance of an SVM model in sentiment classification of hotel reviews on Tripadvisor. To optimize model performance, hyperparameter tuning was carried out using GridSearchCV. The results show that the default SVM model reached 91% accuracy, but recall on the negative class was still low (0.75). After tuning, accuracy decreased slightly to 90%, but negative-class recall rose to 0.77. The best model was obtained with the parameter combination C = 10, gamma = 0.01, and kernel = linear, with precision of 0.92, recall of 0.94, and an F1-score of 0.80. Tuning improved the balance of classification across classes and the sensitivity to negative reviews. These results underline the importance of hyperparameter tuning for optimizing the performance and generalization of SVM models in sentiment analysis on imbalanced data.
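The exhaustive search that a grid search performs can be sketched without any machine learning library. The example below is not scikit-learn's `GridSearchCV`: `grid_search` and `toy_evaluate` are hypothetical names, the grid values are assumed (the abstract reports only the winning combination), and the scorer returns arbitrary made-up numbers so the sketch runs without training a model.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and return the best one.

    evaluate(params) must return a score where higher is better, for
    example mean cross-validated recall on the negative class.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid; the abstract reports only the winning combination.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.001, 0.01], "kernel": ["linear", "rbf"]}

def toy_evaluate(params):
    """Stand-in scorer with arbitrary values; a real one would train and validate."""
    score = 0.70 + 0.02 * [0.1, 1, 10].index(params["C"])
    score += 0.01 if params["gamma"] == 0.01 else 0.0
    score += 0.01 if params["kernel"] == "linear" else 0.0
    return score

best_params, best_score = grid_search(param_grid, toy_evaluate)
print(best_params)
```

The choice of scoring function decides what "best" means. Scoring on negative-class recall rather than accuracy is one way a tuned model can end up with slightly lower accuracy but better minority-class sensitivity.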
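Perbandingan Klasifikasi Single-Label dan Multi-Label Ulasan Pengguna Lapangan Futsal di Semarang Menggunakan SVM [Comparison of Single-Label and Multi-Label Classification of Futsal Court User Reviews in Semarang Using SVM] Syifa, Achrijal Shohib Arya; Umam, Khotibul; Handayani, Maya Rini; Aini, Siti Nur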
InComTech : Jurnal Telekomunikasi dan Komputer Vol 15, No 2 (2025)
Publisher : Department of Electrical Engineering

DOI: 10.22441/incomtech.v15i2.33905

Abstract

Futsal is an increasingly popular sport throughout Indonesia, including in Semarang. This study aims to classify the sentiment of user reviews of futsal courts in the city of Semarang using the Support Vector Machine method. The data were obtained by scraping Google Maps reviews with the Chrome extension “Instant Data Scraper” and consist of 1,189 reviews. The research process covers data collection; cleaning and preprocessing (text normalization, data modification, tokenization, stop word filtering, stemming); labeling (single-label and multi-label); data splitting (80% training and 20% testing); modeling with SVM (single-label with GridSearchCV and multi-label with a One-vs-Rest classifier); and model evaluation using precision, recall, and F1-score. The results show that single-label SVM modeling reached a precision of 0.84, recall of 0.73, and F1-score of 0.78, while multi-label SVM modeling reached a precision of 0.96, recall of 0.88, and F1-score of 0.92. Among the reviews analyzed, the data distribution in both the single-label and multi-label settings is dominated by the Facilities (Fasilitas) category, confirming that facilities are the aspect users comment on most often. These findings not only offer practical insight for futsal court managers but also contribute to the development of machine learning-based review classification methods in opinion analysis, particularly in comparing the performance of single-label and multi-label approaches on multi-category data in the field of information technology.
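A One-vs-Rest setup treats each category as its own yes/no problem, which requires turning every review's set of categories into one binary column per category. The sketch below shows that step plus micro-averaged precision, recall, and F1. Only the Facilities (Fasilitas) category is named in the abstract; "Price" and "Service" are hypothetical, as are the sample label sets, and the abstract does not say which averaging the authors used.

```python
def binarize(label_sets, categories):
    """Turn each review's set of categories into one 0/1 column per category,
    the form on which a One-vs-Rest setup trains one binary classifier each."""
    return [[1 if c in labels else 0 for c in categories] for labels in label_sets]

def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall and F1 over all category columns."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# "Price" and "Service" are hypothetical categories added for illustration.
categories = ["Facilities", "Price", "Service"]
true_sets = [{"Facilities"}, {"Facilities", "Price"}, {"Service"}]
pred_sets = [{"Facilities"}, {"Facilities"}, {"Service", "Price"}]
y_true = binarize(true_sets, categories)
y_pred = binarize(pred_sets, categories)
precision, recall, f1 = micro_prf(y_true, y_pred)
print(precision, recall, f1)
```

In this toy case there are three correct category assignments, one missed category, and one spurious category, so all three metrics come out at 0.75.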
Identification of Buzzers in Skincare Reviews Using a Lexicon-Based Sentiment Analysis Method Pramesti, Arfiana Diah; Umam, Khothibul; Handayani, Maya Rini
Journal of Applied Informatics and Computing Vol. 9 No. 5 (2025): October 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i5.11005

Abstract

Along with the rapid development of digital technology, social media has become the main platform for consumers to share experiences about products, including skincare products. However, user reviews often do not reflect authentic experiences and are instead created by certain parties, or buzzers, to manipulate public perception. The presence of buzzers in skincare reviews is important to consider, as they can affect consumer trust and influence purchasing decisions. This study aims to identify the presence of buzzers in skincare product reviews using lexicon-based sentiment analysis. Of the 529 comments analyzed, 75 comments showed negative sentiment and 454 comments showed positive sentiment. The classification results revealed that 85.8% of the comments belonged to the non-buzzer category, while 14.2% were indicated as buzzers. Evaluation of the classification model showed high accuracy, reaching 93%, but performance in detecting buzzers was limited, with a recall of only 0.50. This shows that while the model classified non-buzzer comments well, it still had difficulty identifying buzzer comments, mostly due to data imbalance. This research emphasizes the importance of a proper analytical approach in detecting inauthentic reviews to ensure the information consumers receive remains accurate, transparent, and accountable.
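High accuracy alongside low minority-class recall, as reported in this abstract, is a typical effect of class imbalance. The sketch below uses hypothetical confusion-matrix counts for 100 comments (they are not the study's figures) to show how the two metrics can diverge.

```python
def accuracy_and_recall(tp, fn, fp, tn):
    """Overall accuracy and recall for the positive (here: buzzer) class."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall

# Hypothetical counts for 100 comments, 10 of them buzzers.
accuracy, recall = accuracy_and_recall(tp=5, fn=5, fp=2, tn=88)
print(accuracy, recall)
```

Here half of the buzzers are missed, yet accuracy still reaches 0.93 because the 90 non-buzzer comments dominate the total.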