Journal : Journal of Computer Networks, Architecture and High Performance Computing

Optimizing SMS Spam Detection Using Machine Learning: A Comparative Analysis of Ensemble and Traditional Classifiers
Airlangga, Gregorius
Journal of Computer Networks, Architecture and High Performance Computing Vol. 6 No. 4 (2024): Articles Research October 2024
Publisher : Information Technology and Science (ITScience)

DOI: 10.47709/cnahpc.v6i4.4822

Abstract

With the rapid rise of mobile communication, Short Message Service (SMS) has become an essential platform for transmitting information. However, the growing volume of unsolicited and harmful spam messages presents significant challenges for both users and mobile network operators. This study explores the effectiveness of various machine learning models, including Random Forest, Gradient Boosting, AdaBoost, Support Vector Machine (SVM), Logistic Regression, and an Ensemble Voting Classifier, in detecting SMS spam. A dataset containing 5,572 SMS messages, labeled as either spam or ham (legitimate), was used to evaluate these models. Hyperparameter tuning was performed on each model to optimize accuracy, and the models were assessed using metrics such as precision, recall, F1-score, and accuracy. The results indicated that the SVM and Ensemble Voting Classifier achieved the highest performance, with accuracies of 0.9857 and 0.9848, respectively. Both models demonstrated superior recall for spam messages, making them highly effective for real-world spam detection systems. While Random Forest, Gradient Boosting, and AdaBoost also performed well, their slightly lower recall for spam suggests that they may misclassify some spam as legitimate messages. The study highlights the effectiveness of machine learning models in addressing the SMS spam problem, particularly when using ensemble methods. Future research should focus on addressing class imbalance and exploring deep learning approaches to further enhance model performance. These findings offer valuable insights for developing more accurate and scalable SMS spam detection systems.
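
The abstract above does not say whether the Ensemble Voting Classifier used hard or soft voting. As a minimal sketch, assuming hard (majority) voting, the combination step could look like the following. The function name `hard_vote` and the three prediction lists are hypothetical and are not taken from the paper.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label for each message.

    predictions: list of per-classifier label lists, all the same length.
    """
    voted = []
    # zip(*predictions) groups the labels that every classifier gave one message
    for labels in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three base classifiers on four messages
svm_pred = ["spam", "ham", "ham", "spam"]
rf_pred = ["spam", "ham", "spam", "ham"]
lr_pred = ["ham", "ham", "spam", "spam"]

ensemble_pred = hard_vote([svm_pred, rf_pred, lr_pred])
print(ensemble_pred)  # ['spam', 'ham', 'spam', 'spam']
```

With an odd number of voters and two classes there are no ties; a soft-voting variant would average predicted probabilities instead of counting labels.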
A Comparative Analysis of Deep Learning Models for SMS Spam Detection: CNN-LSTM, CNN-GRU, and ResNet Approaches
Airlangga, Gregorius
Journal of Computer Networks, Architecture and High Performance Computing Vol. 6 No. 4 (2024): Articles Research October 2024
Publisher : Information Technology and Science (ITScience)

DOI: 10.47709/cnahpc.v6i4.4827

Abstract

Spam messages have become a growing challenge in mobile communication, threatening user security and data privacy. Traditional spam detection methods, including rule-based and machine learning techniques, are increasingly insufficient due to the evolving sophistication of spam tactics. This research evaluates the effectiveness of advanced deep learning models such as CNN-LSTM, CNN-GRU, and ResNet for SMS spam detection. The dataset used consists of diverse SMS messages labeled as either spam or legitimate (ham), ensuring broad coverage of real-world spam patterns. The study employs a robust ten-fold cross-validation approach to assess the generalization capabilities of the models, measuring performance based on accuracy, precision, recall, and F1 score. The results indicate that ResNet outperformed the other models, achieving an average accuracy of 99.08% and an F1 score of 0.9646, making it the most reliable model for spam detection. CNN-GRU demonstrated competitive performance with a balance between accuracy (98.97%) and computational efficiency, making it suitable for real-time applications. CNN-LSTM, while highly accurate (98.92%), showed a slightly lower recall compared to the other models, indicating a more cautious approach to detecting spam. These findings highlight the potential of hybrid deep learning models in addressing the complexities of SMS spam detection. Future research could focus on optimizing these models for deployment in resource-constrained environments, such as mobile devices, and further exploring the integration of residual connections for more effective spam filtering.
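
To illustrate how a model can be highly accurate yet show lower recall for spam, the sketch below computes the four reported metrics from confusion-matrix counts. The counts are hypothetical and chosen only for the arithmetic; they are not results from the paper.

```python
def spam_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy with spam as the positive class."""
    precision = tp / (tp + fp)  # of messages flagged as spam, how many were spam
    recall = tp / (tp + fn)     # of actual spam, how much was caught
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts: 100 spam and 900 ham messages, 10 spam missed, 2 ham flagged
metrics = spam_metrics(tp=90, fp=2, fn=10, tn=898)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Here accuracy is 0.988 while recall is only 0.90, because the large ham class dominates the accuracy figure.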
Comparative Analysis of Machine Learning Algorithms for Detecting Fake News: Efficacy and Accuracy in the Modern Information Ecosystem
Airlangga, Gregorius
Journal of Computer Networks, Architecture and High Performance Computing Vol. 6 No. 1 (2024): Article Research Volume 6 Issue 1, January 2024
Publisher : Information Technology and Science (ITScience)

DOI: 10.47709/cnahpc.v6i1.3466

Abstract

In an era where the spread of fake news poses a significant threat to the integrity of the information landscape, the need for effective detection tools is paramount. This study evaluates the efficacy of three machine learning algorithms—Multinomial Naive Bayes, Passive Aggressive Classifier, and Logistic Regression—in distinguishing fake news from genuine articles. Leveraging a balanced dataset, meticulously processed and vectorized through Term Frequency-Inverse Document Frequency (TF-IDF), we subjected each algorithm to a rigorous classification process. The algorithms were evaluated on metrics such as precision, recall, and F1-score, with the Passive Aggressive Classifier outperforming others, achieving a remarkable 0.99 in both precision and recall. Logistic Regression followed with an accuracy of 0.98, while Multinomial Naive Bayes displayed robust recall at 1.00 but lower precision at 0.91, resulting in an accuracy of 0.95. These metrics underscored the nuanced capabilities of each algorithm in correctly identifying fake and real news, with the Passive Aggressive Classifier demonstrating superior balance in performance. The study's findings highlight the potential of employing machine learning techniques in the fight against fake news, with the Passive Aggressive Classifier showing promise due to its high accuracy and balanced precision-recall trade-off. These insights contribute to the ongoing efforts in digital media to develop advanced, ethical, and accurate tools for maintaining information veracity. Future research should continue to refine these models, ensuring their applicability in diverse and evolving news ecosystems.
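
For readers unfamiliar with TF-IDF, here is a minimal sketch of the weighting, assuming the plain textbook form (term frequency times log of inverse document frequency). The paper's exact variant is not stated; library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers would differ. The function name and sample texts are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical headlines
docs = [
    "breaking news shocking claim",
    "official news report",
    "shocking secret revealed",
]
vectors = tfidf(docs)
print(vectors[0])
```

In the first headline, "breaking" appears in only one document and so receives a higher weight than "news", which appears in two.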
Analysis of Machine Learning Classifiers for Speaker Identification: A Study on SVM, Random Forest, KNN, and Decision Tree
Airlangga, Gregorius
Journal of Computer Networks, Architecture and High Performance Computing Vol. 6 No. 1 (2024): Article Research Volume 6 Issue 1, January 2024
Publisher : Information Technology and Science (ITScience)

DOI: 10.47709/cnahpc.v6i1.3487

Abstract

This study investigates the performance of machine learning classifiers in the domain of speaker identification, a pivotal component of modern digital security systems. With the burgeoning integration of voice-activated interfaces in technology, the demand for accurate and reliable speaker identification is paramount. This research provides a comprehensive comparison of four widely used classifiers: Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Decision Tree (DT). Utilizing the LibriSpeech dataset, known for its diversity of speakers and recording conditions, we extracted Mel-frequency cepstral coefficients (MFCCs) to serve as features for training and evaluating the classifiers. Each model's performance was assessed based on precision, recall, F1-score, and accuracy. The results revealed that RF outperformed all other classifiers, achieving near-perfect metrics, indicative of its robustness and generalizability for speaker identification tasks. KNN also demonstrated high performance, suggesting its suitability for applications where rapid execution and interpretability are critical. Conversely, SVM and DT, while yielding moderate and lower performances respectively, highlighted the necessity for further optimization. These findings underscore the effectiveness of ensemble and distance-based classifiers in handling complex patterns for speaker differentiation. The study not only guides the selection of appropriate classifiers for speaker identification but also sets the stage for future research, which could explore hybrid models and the impact of dataset variability on performance. The insights from this analysis contribute significantly to the field, providing a benchmark for developing advanced speaker identification systems.
Evaluating the Efficacy of Machine Learning Models in Credit Card Fraud Detection
Airlangga, Gregorius
Journal of Computer Networks, Architecture and High Performance Computing Vol. 6 No. 2 (2024): Articles Research Volume 6 Issue 2, April 2024
Publisher : Information Technology and Science (ITScience)

DOI: 10.47709/cnahpc.v6i2.3814

Abstract

This research evaluates the effectiveness of various machine learning models in detecting credit card fraud within a dataset comprising 555,719 transactions. The study meticulously compares traditional and advanced models, including Logistic Regression, Support Vector Machines (SVM), Random Forest, Gradient Boosting, k-Nearest Neighbors (k-NN), Naive Bayes, AdaBoost, LightGBM, XGBoost, and Multilayer Perceptrons (MLP), in terms of accuracy and reliability. Through a robust methodology involving extensive data preprocessing, feature engineering, and a 5-fold stratified cross-validation, the research identifies XGBoost as the most effective model, demonstrating a near-perfect mean accuracy of 0.9990 with minimal variability. The results emphasize the significance of model choice, data preparation, and the potential of ensemble and boosting techniques in managing the complexities of fraud detection. The findings not only contribute to the academic discourse on fraud detection but also suggest practical applications for real-world systems, aiming to enhance security measures in financial transactions. Future research directions include exploring hybrid models and adapting to evolving fraud tactics through continuous learning systems.
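
The 5-fold stratified cross-validation mentioned above keeps the fraud-to-legitimate ratio roughly equal in every fold. A minimal sketch of the fold assignment follows, assuming a simple round-robin deal per class without shuffling; the function name and the toy labels are hypothetical and do not reflect the paper's implementation.

```python
from collections import defaultdict


def stratified_folds(labels, k=5):
    """Assign sample indices to k folds so each fold has a similar class mix."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin so every fold gets its share
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Hypothetical toy data: 10 fraud (1) among 100 transactions
labels = [1] * 10 + [0] * 90
folds = stratified_folds(labels, k=5)
for i, test_idx in enumerate(folds):
    fraud = sum(labels[j] for j in test_idx)
    print(f"fold {i}: {len(test_idx)} samples, {fraud} fraud")
```

Each fold serves once as the test set while the remaining four are used for training; in practice the indices within each class would usually be shuffled first.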
Comparative Analysis of Machine Learning Models for Credit Card Fraud Detection in Imbalanced Datasets
Airlangga, Gregorius
Journal of Computer Networks, Architecture and High Performance Computing Vol. 6 No. 2 (2024): Articles Research Volume 6 Issue 2, April 2024
Publisher : Information Technology and Science (ITScience)

DOI: 10.47709/cnahpc.v6i2.3816

Abstract

This study presents a comprehensive evaluation of various machine learning models for detecting credit card fraud, emphasizing their performance in handling highly imbalanced datasets. We focused on three models: Logistic Regression, Random Forest, and Multilayer Perceptron (MLP), using a dataset comprising 555,719 transactions, each annotated with 22 attributes. Logistic Regression served as a baseline, Random Forest was evaluated for its high accuracy and low dependency on hyperparameter tuning, and MLP was tested for its capability to identify non-linear patterns. The models were assessed using ROC AUC, Matthews Correlation Coefficient (MCC), and precision-recall curves to determine their effectiveness in distinguishing fraudulent transactions. Results indicated that the Random Forest model outperformed others with a ROC AUC of 0.9868 and an MCC of 0.6638, showing substantial superiority in managing class imbalances and complex data interactions. Logistic Regression, although useful as a benchmark, exhibited limitations with a high number of false positives. MLP showed potential but was prone to a significant false positive rate, suggesting a need for further model refinement. The findings highlight the importance of choosing appropriate models and feature engineering techniques in fraud detection systems and suggest avenues for future research in real-time model deployment and advanced algorithmic strategies.
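
The Matthews Correlation Coefficient is used for imbalanced data because, unlike accuracy, it stays low when the minority class is poorly detected. The sketch below computes it from confusion-matrix counts using the standard formula. The counts are hypothetical and are not the paper's results.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero
    return numerator / denominator if denominator else 0.0


# Hypothetical: 1,000 transactions with 10 fraud cases.
# Model A catches 6 fraud cases and raises 4 false alarms.
acc_a = (6 + 986) / 1000
mcc_a = mcc(tp=6, fp=4, fn=4, tn=986)

# Model B labels every transaction as legitimate.
acc_b = (0 + 990) / 1000
mcc_b = mcc(tp=0, fp=0, fn=10, tn=990)

print(f"Model A: accuracy={acc_a:.3f}, MCC={mcc_a:.3f}")
print(f"Model B: accuracy={acc_b:.3f}, MCC={mcc_b:.3f}")
```

Both models exceed 0.99 accuracy, yet Model B detects no fraud and scores an MCC of 0.0, which is why accuracy alone is a weak guide on datasets like this one.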