Optimizing Linear Support Vector Machine for Multi-Class Smishing Detection on Imbalanced Datasets
Authors : Vannia, Anggun; Muljono
Jurnal Sistem Komputer dan Informatika (JSON) Vol. 7 No. 2 (2025): December 2025
Publisher : Universitas Budi Darma

DOI: 10.30865/json.v7i2.9299

Abstract

Smishing (SMS phishing) attacks pose a fundamental challenge for machine-learning-based detection because class distributions in real-world datasets are imbalanced, and the minority-class instances (smishing) are precisely the ones most critical to identify. This study proposes a robust framework that optimizes a Linear Support Vector Machine (SVM) with a three-tier hybrid sampling strategy for multi-class classification on imbalanced data. The framework integrates hybrid feature extraction, combining TF-IDF with meta-features, and a comprehensive imbalance-handling strategy: Random Oversampling (ROS) for minority classes, Random Undersampling (RUS) for the majority class, and Embedding MixUp for embedding-level data augmentation. Parameter optimization with GridSearchCV and 5-fold validation selected the optimal Linear SVM configuration (C=0.5). Evaluation on the test set shows high and balanced classification performance, with 96.7% accuracy and 87.6% macro-F1. This consistent performance across all classes is reflected in a smishing recall of 84% while maintaining a ham recall of 99%. These findings confirm that combining a Linear SVM with hybrid sampling yields a smishing detection model that is robust, balanced, and ready for deployment in real-world scenarios.
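
The abstract above describes random oversampling for minority classes and random undersampling for the majority class. The following is a minimal standard-library sketch of that idea only, not the authors' implementation: the function name `hybrid_resample`, the per-class `target` size, the "spam" class, and the toy messages are all assumptions made for illustration, and the Embedding MixUp tier is omitted.

```python
import random
from collections import Counter, defaultdict


def hybrid_resample(texts, labels, target, seed=0):
    """Resample so that every class ends up with exactly `target` items.

    Classes with at least `target` items are randomly undersampled
    (without replacement); smaller classes are randomly oversampled
    (duplicates drawn with replacement).
    """
    rng = random.Random(seed)

    # Group the messages by class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Random undersampling (RUS): keep a random subset.
            chosen = rng.sample(items, target)
        else:
            # Random oversampling (ROS): keep everything, add duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_texts.extend(chosen)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Hypothetical toy data: 8 ham, 3 spam, 1 smishing message.
texts = [f"ham {i}" for i in range(8)] + ["spam 0", "spam 1", "spam 2"] + ["smish 0"]
labels = ["ham"] * 8 + ["spam"] * 3 + ["smishing"]

new_texts, new_labels = hybrid_resample(texts, labels, target=4)
counts = Counter(new_labels)
print(counts)
```

In practice, resampling of this kind is normally applied to the training split only, so that the test set keeps its original class distribution.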

Hybrid IndoBERTweet-Convolutional Neural Network (CNN) Architecture for Classifying Slang Hate Speech on Social Media
Authors : Margareta Valencia Suci Handayani; Muljono
Infotekmesin Vol 17 No 1 (2026): Infotekmesin: January 2026
Publisher : P3M Politeknik Negeri Cilacap

DOI: 10.35970/infotekmesin.v17i1.3026

Abstract

Detecting hate speech on Indonesian social media is challenging because slang, abbreviations, and informal expressions hinder automated text understanding. Traditional machine learning approaches often fail to capture contextual meaning effectively. This study aims to develop a hate speech detection system for Indonesian slang by evaluating IndoBERTweet contextual embeddings combined with a Convolutional Neural Network (CNN) architecture. The research compares the performance of CNN and BiLSTM models using IndoBERTweet and FastText embeddings. A dataset of 1,477 labeled tweets categorized as Hate Speech, Abusive, or Non-Hate Speech was used. The evaluation metrics are accuracy, precision, recall, F1 score, and AUC ROC. The results show that the IndoBERTweet + CNN model achieves the best performance, with 91.2% accuracy and a 91.1% F1-score, significantly outperforming FastText-based models. IndoBERTweet’s contextual embeddings prove effective in handling the linguistic complexity and implicit meanings commonly found in Indonesian slang. These findings highlight the model’s strong capability for robust hate speech detection and open opportunities for its adoption as an automated content-moderation module that identifies and filters toxic narratives on social media platforms.
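
To make the reported metrics concrete, here is a small standard-library sketch of how accuracy and per-class precision, recall, and F1 can be computed for a three-class problem. The helper name `evaluate`, the short label names, and the toy predictions are hypothetical; the abstract does not state which F1 averaging the authors used, so the macro average shown here is an assumption, and AUC ROC is left out.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, macro_f1, per_class) for two equal-length label lists.

    per_class maps each label to a (precision, recall, f1) tuple.
    """
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[c] = (precision, recall, f1)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Macro average: unweighted mean of the per-class F1 scores.
    macro_f1 = sum(f1 for _, _, f1 in per_class.values()) / len(classes)
    return accuracy, macro_f1, per_class


# Hypothetical toy labels, not taken from the 1,477-tweet dataset.
y_true = ["hate", "hate", "abusive", "abusive", "none", "none"]
y_pred = ["hate", "abusive", "abusive", "abusive", "none", "hate"]

accuracy, macro_f1, per_class = evaluate(y_true, y_pred)
print(round(accuracy, 3), round(macro_f1, 3))
```

Macro averaging gives each class equal weight regardless of its size, which is why it is often reported alongside accuracy when classes are uneven.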

Optimizing the Accuracy and Stability of Sentiment Analysis for Indonesian E-Commerce Reviews through Fine-Tuning the IndoBERT Transformer
Authors : Alfina Latifa Maysara; Muljono
Infotekmesin Vol 17 No 1 (2026): Infotekmesin: January 2026
Publisher : P3M Politeknik Negeri Cilacap

DOI: 10.35970/infotekmesin.v17i1.3037

Abstract

The rapid growth of e-commerce in Indonesia increases the need for sentiment analysis to accurately understand customer perceptions. This study evaluates the effectiveness of the Transformer-based IndoBERT model for sentiment classification on Indonesian e-commerce reviews and compares its performance with four RNN architectures (LSTM, GRU, BiLSTM, and BiGRU). The PRDECT-ID dataset containing 5,400 reviews was processed through preprocessing, an 80:20 data split, RNN training using 5-fold cross-validation, and IndoBERT fine-tuning under a hold-out scheme. Unlike previous studies that focused solely on RNN models with a maximum accuracy of 90.7%, this work expands the evaluation by integrating a Transformer-based approach. Results show that IndoBERT achieves 98.52% for both accuracy and weighted F1, outperforming the best RNN models by approximately 0.94–0.95. Paired t-test and Wilcoxon tests yield p < 0.05, confirming that the performance improvements are statistically significant. IndoBERT demonstrates greater stability and effectiveness for Indonesian sentiment analysis.
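
As an illustration of the paired t-test mentioned above, the sketch below computes the paired t statistic for two sets of matched scores using only the standard library. The function name `paired_t_statistic` and the score values are hypothetical and are not results from the paper; how the authors paired their measurements is not stated in the abstract. The critical value 2.776 is the standard two-sided 5% threshold for 4 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for matched score lists.

    t = mean(d) / (stdev(d) / sqrt(n)), where d holds the pairwise
    differences a - b and stdev is the sample standard deviation.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical matched accuracy scores for two models (illustration only).
model_a = [0.98, 0.99, 0.97, 0.98, 0.99]
model_b = [0.97, 0.97, 0.96, 0.98, 0.97]

t_stat, df = paired_t_statistic(model_a, model_b)

# Two-sided critical value of Student's t at alpha = 0.05 with df = 4.
T_CRITICAL_DF4 = 2.776
significant = abs(t_stat) > T_CRITICAL_DF4
print(round(t_stat, 3), df, significant)
```

With these illustrative numbers the statistic exceeds the critical value, so the difference would be judged significant at the 5% level; a statistics library would normally be used to obtain the exact p-value.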