Articles

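PENINGKATAN KINERJA SUPPORT VECTOR MACHINE MENGGUNAKAN MODEL BAHASA BERT UNTUK KLASIFIKASI SENTIMEN DENGAN DATASET TERBATAS [Improving Support Vector Machine Performance Using the BERT Language Model for Sentiment Classification with a Limited Dataset]
Iffa, Marwika Rifattul; Agustian, Surya; Safaat, Nazruddin; Irsyad, Muhammad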
ZONAsi: Jurnal Sistem Informasi Vol. 7 No. 2 (2025): Publikasi artikel ZONAsi: Jurnal Sistem Informasi Periode Mei 2025
Publisher : Universitas Lancang Kuning

DOI: 10.31849/zn.v7i2.26847

Abstract

Social media has become an important space for the public to openly express opinions on current issues, including through X, a platform popular among internet users. The platform is often used as a data source for sentiment classification to reveal public perception of events, particularly in politics and government. However, limited datasets are a major challenge in classification because they can affect the accuracy and validity of the resulting sentiment labels. To address this problem, this study proposes combining the Support Vector Machine (SVM) algorithm with features from Bidirectional Encoder Representations from Transformers (BERT), which has proven effective at capturing language context in depth. The approach aims to improve sentiment classification performance on posts about the appointment of Kaesang Pangarep as Chairman of the Partai Solidaritas Indonesia (PSI, Indonesian Solidarity Party) on X. The method comprises text preprocessing, feature extraction using BERT, and SVM for sentiment classification. Experimental results show that the combined model significantly improved the F1-Score, by 3%, on the test data. This indicates that the BERT language model can improve SVM performance in sentiment classification.
Classification of Covid-19 Vaccine Sentiment Using K-Nearest Neighbor and Fasttext on Twitter
Safrizal, Afri Naldi; Surya Agustian
EKSAKTA: Berkala Ilmiah Bidang MIPA Vol. 25 No. 03 (2024): Eksakta : Berkala Ilmiah Bidang MIPA (E-ISSN : 2549-7464)
Publisher : Faculty of Mathematics and Natural Sciences (FMIPA), Universitas Negeri Padang, Indonesia

DOI: 10.24036/eksakta/vol25-iss03/384

Abstract

In late 2019, a flu-like illness that infects the lungs emerged in the city of Wuhan. The disease is suspected to have originated in bats. WHO named the disease Covid-19, and the virus spread throughout the world, causing a pandemic. The government launched a vaccination drive to curb the virus, but it drew both support and opposition from the public. Many studies discuss people's sentiments towards vaccination, including through sentiment classification. This study discusses the classification of sentiment towards Covid-19 vaccines using K-Nearest Neighbor and FastText on Twitter. Data were obtained by crawling with the Python programming language and the Twitter API. Data labeling was carried out by crowdsourcing and majority voting. After balancing, the data comprise 6000 training samples, 778 development samples, and 400 test samples. After various experiments and feature engineering, the best test result was an accuracy of 69% and an F1-score of 60%. This is the best result compared to previous studies on the same dataset.
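The majority-voting step mentioned in the abstract above can be sketched as follows. This is a minimal illustration only: the tweet IDs, the three-annotator setup, and the rule of returning `None` on a tie are assumptions, not details taken from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None when the top count is tied."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: no majority, leave the item for manual review
    return counts[0][0]


# Hypothetical annotations: three crowd annotators per tweet.
annotations = {
    "tweet_1": ["positive", "positive", "negative"],
    "tweet_2": ["neutral", "negative", "positive"],
    "tweet_3": ["negative", "negative", "negative"],
}

# Resolve each tweet to a single label (or None when annotators disagree evenly).
final_labels = {tid: majority_vote(votes) for tid, votes in annotations.items()}
```

How ties are handled is a design choice; a real labeling pipeline might instead request an extra annotation or discard the item.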
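Perbandingan Performa Klasifikasi Terjemahan Al-Qur'an Menggunakan Metode Random Forest dan Long Short Term Memory [Performance Comparison of Al-Qur'an Translation Classification Using the Random Forest and Long Short Term Memory Methods]
Aftari, Dhea Putri; Safaat, Nazruddin; Agustian, Surya; Yusra, Yusra; Afrianty, Iis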
Journal of Computer System and Informatics (JoSYC) Vol 5 No 3 (2024): May 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v5i3.5156

Abstract

This study focuses on the Qur'an as the primary source of Islamic teachings, aiming to facilitate Muslims' understanding of its content. To that end, translated Qur'anic verses were classified by topic. Random Forest (RF) and Long Short Term Memory (LSTM), two methods rarely applied to Qur'anic translation data, were used because of their ability to process large and complex data. The data are translations of the Qur'an that previous research classified into 15 topics, of which this study considers only 6. The objective is to compare the performance of RF and LSTM in classifying Qur'anic translations into these 6 categories. In the preaching category, LSTM outperformed RF on F1-Score (57.3% versus 49.4%), while RF reached a slightly higher accuracy (97.5% versus 96.8%). These findings indicate that LSTM performs better on F1-Score, especially with proper preprocessing, optimal parameter tuning, and balanced data. The study provides insights into the development of classification models for Qur'anic translation texts and highlights the importance of proper preprocessing and parameter tuning.
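The gap between accuracy above 96% and F1-Score below 60% in the abstract above is typical when the target class is rare. A small worked example shows how the two metrics diverge. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the paper's data.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical one-vs-rest counts: 1000 verses, only 40 belong to the target topic.
acc, prec, rec, f1 = binary_metrics(tp=20, fp=10, fn=20, tn=950)

print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Here the 950 correctly rejected verses dominate accuracy, while F1 depends only on how well the 40 target verses are found, which is why F1 is the more informative figure for a minority class.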
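Pengaruh Penyeimbangan Data Pada Klasifikasi Terjemahan Al-Quran Dengan Metode Naïve Bayes dan Long Short Term Memory [The Effect of Data Balancing on Al-Quran Translation Classification Using the Naïve Bayes and Long Short Term Memory Methods]
Ningsih, Sulistia; Safaat, Nazruddin; Agustian, Surya; Yusra, Yusra; Cynthia, Eka Pandu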
Journal of Computer System and Informatics (JoSYC) Vol 5 No 3 (2024): May 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v5i3.5181

Abstract

The Al-Qur'an is the holy book of Muslims and a guide to life for all mankind. Studying and understanding its translation is not easy; one approach is to classify the translated verses into existing topics. This research uses the Naïve Bayes and LSTM methods for classification. The data are Indonesian translations of the Al-Qur'an labeled for multi-class classification. One of the main problems is data imbalance. To overcome it, data balancing, text preprocessing, feature construction, and feature extraction using the Bag of Words (BoW) and TF-IDF techniques were carried out. The results indicate that the best Naïve Bayes model achieved an average accuracy of 55.39% on test data from juz 30, 61.59% on test data from juz 10-20, and 59.53% on test data from juz 25-28. Meanwhile, the best LSTM model yielded an accuracy of 58.02% on test data from juz 30, 59.64% on test data from juz 10-20, and 58.59% on test data from juz 25-28. The main aim of this research is to improve classification performance and compare the accuracy of Naïve Bayes and LSTM.
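Klasifikasi Sentimen Terhadap Topik Pindah Ibu Kota Negara Pada Twitter Menggunakan Metode Naïve Bayes Classifier [Sentiment Classification on the Topic of Relocating the National Capital on Twitter Using the Naïve Bayes Classifier Method]
Dermawan, Jozu; Yusra, Yusra; Fikry, Muhammad; Agustian, Surya; Oktavia, Lola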
Jurnal Sistem Komputer dan Informatika (JSON) Vol. 5 No. 3 (2024): Maret 2024
Publisher : Universitas Budi Darma

DOI: 10.30865/json.v5i3.7475

Abstract

Towards the middle of 2019, President Joko Widodo announced plans to relocate Indonesia's capital city. This drew both support and opposition in the community, widely observed across social media. To quickly measure public sentiment towards the policy of moving the National Capital City (IKN), whose construction is already underway, a well-performing classification system is needed. This research proposes classifying public sentiment on the topic using the Naïve Bayes Classifier method. The data comprise 4000 tweets labeled into two classes: 2000 positive and 2000 negative. The purpose of this research is to apply the Naïve Bayes Classifier method to sentiment on the topic of moving the nation's capital and to determine the method's accuracy. Applying Naïve Bayes classification with TF-IDF features, with 10% of the data held out as test data, resulted in an accuracy of 77.00%, a precision of 77.06%, a recall of 77.08%, and an F1-score of 77.00%. Based on these results, the Naïve Bayes Classifier method performs fairly well on this text classification task.
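Klasifikasi Sentimen Masyarakat Terhadap Kaesang Pangarep pada Media Sosial Twitter/X Menggunakan MLP Classifier dengan Fitur FastText [Classification of Public Sentiment Toward Kaesang Pangarep on Twitter/X Social Media Using an MLP Classifier with FastText Features]
Tarmizi, Veci Cahyono; Agustian, Surya; Okfalisa, Okfalisa; Pizaini, Pizaini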
TIN: Terapan Informatika Nusantara Vol 6 No 7 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/tin.v6i7.8815

Abstract

Social media has become a primary channel for the public to express their opinions and reactions toward various political developments in Indonesia. One of the prominent discussions revolves around Kaesang Pangarep’s appointment as the Chairman of the Indonesian Solidarity Party (PSI). This study aims to analyze and classify public sentiment regarding this issue by employing the Multi-Layer Perceptron (MLP) algorithm integrated with FastText-based text representation. The dataset was collected from Twitter using keywords such as “Kaesang PSI” and was expanded with additional data from general topics, including Covid-19 and Open Topic, ensuring a balanced distribution across positive, neutral, and negative sentiment categories for a more comprehensive representation of public opinion. The model’s performance was evaluated with four metrics: accuracy, precision, recall, and F1 Score. The experimental results show that the MLP–FastText model achieved an F1 Score of 0.5129, an accuracy of 0.6035, a precision of 0.5319, and a recall of 0.5996. These findings indicate that the combination of MLP and FastText captures sentiment patterns within textual data, particularly in unstructured and dynamic social media content, and benefits from augmentation with relevant external data.
Classification of Phishing URL Attacks Using Random Forest Algorithm Based on Feature Importance
Melyana Hasibuan; Rahmad Abdillah; Surya Agustian; Reski Mai Candra
bit-Tech Vol. 8 No. 2 (2025): bit-Tech
Publisher : Komunitas Dosen Indonesia

DOI: 10.32877/bt.v8i2.3511

Abstract

The development of information technology and increasing digital activities have made URL-based phishing threats more complex and difficult to detect. Phishing attacks target not only individuals but also organizations, requiring detection systems that are accurate, efficient, and capable of handling high-dimensional data. Machine learning approaches, particularly Random Forest, have been widely applied for phishing detection; however, further evaluation is needed regarding the role of feature selection in improving efficiency without reducing performance. This study aims to evaluate the performance of the Random Forest algorithm for phishing URL detection and to analyze the impact of feature selection based on feature importance. This research adopts the Knowledge Discovery in Databases (KDD) framework, including data selection, preprocessing, feature selection, modeling, and evaluation stages. The PhiUSIIL-2024 dataset is used, with two modeling scenarios: Random Forest using all features (RF Full) and Random Forest using the top 30 features selected through feature importance (RF Top-30). Model performance is evaluated using accuracy, precision, recall, and F1-score metrics under different data split ratios. The experimental results show that both models achieve very high and stable classification performance, with evaluation metrics close to or reaching 100%. The RF Top-30 model maintains performance comparable to the RF Full model despite using fewer features. This study concludes that feature importance-based feature selection effectively simplifies the Random Forest model without sacrificing performance, making it suitable for efficient URL phishing detection systems.
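The top-k selection step described in the abstract above can be sketched in isolation. The sketch assumes importance scores have already been obtained from some trained model; the feature names, scores, and rows below are hypothetical and are not columns or values from the PhiUSIIL dataset.

```python
def top_k_features(importances, k):
    """Return the names of the k features with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def project(rows, feature_names, keep):
    """Keep only the selected columns of each row, in the order given by `keep`."""
    idx = [feature_names.index(name) for name in keep]
    return [[row[i] for i in idx] for row in rows]


# Hypothetical importance scores, as a trained tree ensemble might report them.
feature_names = ["url_length", "num_dots", "has_https", "num_digits"]
importances = {"url_length": 0.30, "num_dots": 0.05, "has_https": 0.40, "num_digits": 0.25}

# Hypothetical feature rows, one per URL, in the column order of feature_names.
rows = [[54, 3, 1, 7], [120, 6, 0, 22]]

selected = top_k_features(importances, k=2)
reduced_rows = project(rows, feature_names, selected)
```

A second model would then be trained on `reduced_rows` and compared against the full-feature model, which is the comparison the abstract reports with k = 30.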
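Klasifikasi Sentimen Komentar Youtube Tentang Pembatalan Indonesia Sebagai Tuan Rumah Piala Dunia U-20 Menggunakan Algoritma Naïve Bayes Classifer [Sentiment Classification of YouTube Comments on the Cancellation of Indonesia as Host of the U-20 World Cup Using the Naïve Bayes Classifier Algorithm]
Ilham Habibi Hasibuan; Elvia Budianita; Surya Agustian; Pizaini Pizaini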
Jurnal Sistem Komputer dan Informatika (JSON) Vol. 5 No. 2 (2023): Desember 2023
Publisher : Universitas Budi Darma

DOI: 10.30865/json.v5i2.7096

Abstract

Text mining is a method used to perform tasks such as document classification, clustering, information extraction, sentiment analysis, and information retrieval. In 2019, the Fédération Internationale de Football Association (FIFA), the international football governing body, designated Indonesia as the host country for the U-20 World Cup, which was expected to be held in 2021. However, due to the Covid-19 outbreak, the World Cup was rescheduled to 2023. Indonesia officially relinquished its position as host on March 31, 2023. One of the reasons was the many factions that opposed the presence of the Israeli national team in Indonesia. The decision drew varied public reactions, especially on the Narasi tv YouTube channel video entitled "The U-20 World Cup Failed to Be Held in Indonesia, Let's Look at it from Two Perspectives | Discussion". From the time the video was uploaded until August 16, 2023, it received 4,629 comments. This research uses a Naïve Bayes classifier approach. The Naïve Bayes Classifier (NBC) is a simple probabilistic classifier that applies Bayes' Theorem under a strong independence assumption. The tests show that the model performs best at classifying the dataset when stopword removal and stemming are applied, with an F1-Score of 59.70% and an accuracy of 63.43%. After the best Naïve Bayes configuration was identified, evaluation on validation data yielded an F1-Score of 58.72% and an accuracy of 61.65%. The classification analysis shows that Indonesian people have a negative view of, or are disappointed with, the cancellation.
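For reference, the decision rule behind the Naïve Bayes classifier described in the abstract above is usually written in the following textbook form. The notation is an assumption introduced here, not taken from the paper: $C$ is the set of sentiment classes, $w_1, \dots, w_n$ are the tokens of a comment, and the product reflects the independence assumption.

```latex
\hat{c} = \arg\max_{c \in C} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)
```

In practice the product is typically evaluated as a sum of logarithms to avoid numerical underflow on long comments.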
SENTIMENT CLASSIFICATION OF PUBLIC PERCEPTIONS ON RP200 TRILLION HIMBARA STIMULUS USING NAÏVE BAYES
Wan Sobri Amin; Muhammad Fikry; Rahmad Abdillah; Surya Agustian
Jurnal Riset Informatika Vol. 8 No. 2 (2026): Maret 2026
Publisher : Kresnamedia Publisher

DOI: 10.34288/jri.v8i2.500

Abstract

The government's policy of providing a Rp200 trillion fund stimulus to the Himpunan Bank Milik Negara (HIMBARA) is a strategic step to maintain national economic stability and encourage real sector recovery. However, the implementation of public policy is inseparable from the public response and perception that develop on social media. This study aims to classify public sentiment towards the Rp200 trillion fund stimulus policy for Bank HIMBARA based on Instagram user comments and to test the performance of the Naïve Bayes Classifier method in analyzing public policy sentiment. The study uses a quantitative approach with text mining and machine learning methods. A total of 1,309 Instagram comments were collected through web scraping from several online media accounts, then processed through text preprocessing and manually labeled as positive, neutral, or negative. Feature weighting was carried out using TF-IDF, and the data were then classified using Multinomial Naïve Bayes and Complement Naïve Bayes. The results show that the Complement Naïve Bayes model achieved the best performance, with an accuracy of 84%, an F1-score of 81%, and a high ROC-AUC value. These findings indicate that the majority of public sentiment toward the stimulus policy tends to be positive, and that the Naïve Bayes method is effective for social media–based sentiment analysis.
Co-Authors .Safrizal, Safrizal Abdillah, Rahmad Afdhal Zikri Afriyanti, Liza Aftari, Dhea Putri AGUNG SUCIPTO Ahmad, Rizmah Zakiah Nur Alfitra Salam Arasy, Abdurrahman Ash Shiddicky Aulia Ramadhani Ayu Fransiska Baehaqi Delifah, Nur Dermawan, Jozu Dzaky Abdillah Salafy Eka Pandu Cynthia El Saputra, Yoga Elin Haerani Elvia Budianita Fahrezy, Irgi Faizah Husniah Fauzan Ray T Fauzi Ihsan Febi Yanto Febrian Rizki Adi Sutiyo Fitri Insani Fitri Insani Fitri Wulandari Fitri, Dina Deswara Fuji Astuti Gusti, Siska Kurnia Habib Hakim Sinaga Hadi, Mukhlis Halimah Heru Wibowo Idhafi, Zaky Iffa, Marwika Rifattul Ihsan, Miftahul Iis Afrianty Iis Afrianty Ilham Habibi Hasibuan Illahi, Ridho Iman Fauzi Aditya Sayogo Indri Pangestuti Iwan Iskandar Jasril Jasril Jasril Jasril Jasril Jasril Lestari Handayani Lubis, Anggun Tri Utami BR. Melyana Hasibuan Miftah Farid Muhammad Fikry Muhammad Fikry Muhammad Iqbal Maulana Muhammad Irsyad Muhammad Irsyad Muhammad Ravil Muktar Sahbuddin Mukti M Kusairi Mulyadi, Syahrul Nadila Handayani Putri naldi, Afri Nazir, Alwis Nazruddin Safaat Nazruddin Safaat H Nazruddin Safaat H Negara, Benny Sukma Novriyanto Novriyanto Novriyanto Nurul Fatiara Okfalisa Okfalisa Oktavia, Lola Pangestu, Yoga Pizaini Pizaini Pranata, Joni Prima Yohana Putri Zahwa Putri, Adilah Atikah Putri, Atika Rahmad Abdillah Rahmad Kurniawan Ramadhani, Siti Reski Mai Candra Reski Mai Candra Rizqa Raaiqa Bintana Safrizal, Afri Naldi Salam Kurniawan Saputra, Ikhsan Dwi Saputra, M Ridho Saputra, Nugroho Wahyu Sinaga, Habib Hakim Siti Ramadhani Siti Ramadhani Siti Ramadhani Sri Puji Utami A. Subhi, Yazid Abdullah Suci Rahayu Sulistia Ningsih, Sulistia Suwanto Sanjaya Syaiful Azhar Tarmizi, Veci Cahyono Trya Ayu Pratiwi Utari, Roid Fitrah Wan Sobri Amin Yusra Yusra Yusra, Yusra