Email classification has become a fundamental task for managing information efficiently in educational institutions. Owing to globalization and technological advancement, the number of email users is increasing steadily, which in turn increases the volume of digital data exponentially. This necessitates the development of automated email classification systems for better-organized work. This paper develops a novel graph-based similarity (GBS) approach, grounded in semantic similarity, to address these challenges. The method first selects the most relevant features based on feature weights. It then builds a graph for each category using the Jaccard coefficient, with features as nodes and the correlations between nodes as edges. These graphs serve as templates for their categories, and each new incoming email is assigned to a class based on the similarity between the email and the graph templates. The GBS method was compared with well-known benchmark email classifiers, and the findings demonstrated that it outperformed them, reaching 98.91% accuracy after fine-tuning of the graph parameters and the classifier hyperparameters. Additionally, receiver operating characteristic (ROC) curve analysis was conducted, achieving a highest area under the curve (AUC) score of 0.989 and demonstrating robust classification proficiency across all categories.
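The pipeline summarized above (feature selection, per-category graph construction with the Jaccard coefficient, and template matching) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: whitespace tokenization, in-category document frequency as the feature weight, the Jaccard coefficient computed over the sets of emails in which two features occur, and a similarity score equal to the fraction of a template's edge weight whose two endpoints both appear in the new email. The names `build_template`, `similarity`, `classify`, and `top_k`, as well as the toy training emails, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return set(text.lower().split())


def build_template(emails, top_k=5):
    """Build one category graph: features as nodes, Jaccard-weighted edges."""
    docs = [tokenize(e) for e in emails]
    # Assumed feature weight: number of emails in the category containing the term.
    df = Counter(t for d in docs for t in d)
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    features = [t for t, _ in ranked[:top_k]]
    # For each feature, the set of email indices in which it occurs.
    occ = {f: {i for i, d in enumerate(docs) if f in d} for f in features}
    edges = {}
    for a, b in combinations(sorted(features), 2):
        union = occ[a] | occ[b]
        jaccard = len(occ[a] & occ[b]) / len(union) if union else 0.0
        if jaccard > 0:
            edges[(a, b)] = jaccard
    return {"nodes": set(features), "edges": edges}


def similarity(template, email):
    """Assumed score: share of template edge weight matched by the email."""
    tokens = tokenize(email)
    total = sum(template["edges"].values())
    if total == 0:
        return 0.0
    matched = sum(
        w for (a, b), w in template["edges"].items() if a in tokens and b in tokens
    )
    return matched / total


def classify(templates, email):
    # Pick the category whose template is most similar; ties break alphabetically.
    return max(sorted(templates), key=lambda c: similarity(templates[c], email))


# Hypothetical toy data, for illustration only.
training = {
    "admissions": [
        "admission form deadline",
        "admission fee deadline",
        "admission form fee",
    ],
    "exams": [
        "exam schedule hall",
        "exam result schedule",
        "exam hall ticket",
    ],
}
templates = {label: build_template(emails) for label, emails in training.items()}
label = classify(templates, "please send the admission form")
```

In this sketch the graph parameters mentioned in the abstract would correspond to choices such as `top_k` or an edge-weight threshold. The feature weighting scheme and similarity measure actually used may differ from the ones assumed here.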
Copyrights © 2025