Garuda - Garba Rujukan Digital

Jurnal Computer Science and Information Technology (CoSciTech)

Vol 6 No 2 (2025): Jurnal Computer Science and Information Technology (CoSciTech)

Florentina Yuni Arini (Unknown)

Publish Date
13 Sep 2025

Email spam detection is a critical challenge in maintaining the security and efficiency of digital communication. This research proposes and evaluates an optimized email spam detection pipeline that integrates Bidirectional Encoder Representations from Transformers (BERT) for feature extraction, Mutual Information (MI) for feature selection and dimensionality reduction, and a dense neural network for classification. The experiments used the Lingspam dataset, which consists of 2893 emails (2412 ham and 481 spam), split into 80% training and 20% testing data. Text features were extracted with BERT (bert-base-uncased), producing a 768-dimensional embedding that MI then reduced to the 200 most relevant features. A dense neural network with a 256-128-64-32-1 neuron architecture was trained using the Adam optimizer and a binary cross-entropy loss function, with early stopping to limit overfitting and class weights to handle class imbalance. On the test data, the model achieved an accuracy of 99.14%, precision of 0.9596, recall of 0.9896, F1-score of 0.9744, and ROC-AUC of 0.9995. These results indicate that combining BERT and MI with a dense network can achieve accuracy comparable to more complex methods, with the potential for a simpler and more efficient architecture.
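The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not the authors' code: it recomputes the F1-score from the reported precision and recall, and illustrates one common way to derive class weights from the class counts. The "balanced" weighting formula (n_samples / (n_classes * class_count)) and the function names are assumptions, since the abstract does not state how the class weights were computed, and in practice they would be computed on the training split rather than the full dataset.

```python
# Values quoted from the abstract.
N_HAM, N_SPAM = 2412, 481
PRECISION, RECALL = 0.9596, 0.9896


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def balanced_class_weights(counts):
    """Assumed heuristic: n_samples / (n_classes * class_count).

    The abstract only says class weights were used; this formula is an
    illustration, not the authors' documented choice.
    """
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


f1 = f1_score(PRECISION, RECALL)
weights = balanced_class_weights({"ham": N_HAM, "spam": N_SPAM})

print(f"dataset size: {N_HAM + N_SPAM}")
print(f"spam share: {N_SPAM / (N_HAM + N_SPAM):.1%}")
print(f"F1 from reported precision/recall: {f1:.4f}")
print(f"illustrative class weights: {weights}")
```

The recomputed F1 rounds to 0.9744, which agrees with the reported value. Under the assumed weighting, the minority spam class receives a weight roughly five times that of ham, which is the intended effect of class weighting on an imbalanced dataset.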

Journal Info

Journal: Jurnal Computer Science and Information Technology (CoSciTech)
Abbrev: coscitech
Publisher: Universitas Muhammadiyah Riau
Subject: Computer Science & IT

Description

Jurnal CoSciTech (Computer Science and Information Technology) is a peer-reviewed journal published by the Informatics Engineering Study Program, Faculty of Computer Science, Universitas Muhammadiyah Riau (UMRI) since April 2020. Jurnal CoSciTech is registered with PDII LIPI under ISSN ...

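Original title (Indonesian): Optimasi algoritma deteksi spam email dengan BERT-MI dan jaringan dense (English: Optimizing an email spam detection algorithm with BERT-MI and a dense network)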