The COVID-19 pandemic, caused by SARS-CoV-2, continues to challenge global health systems through the emergence of variants whose genetic characteristics affect transmissibility and vaccine effectiveness. Conventional identification methods such as Whole-Genome Sequencing (WGS) are highly accurate but constrained by significant cost and turnaround time. Most current classification studies still rely on complex hybrid architectures such as CNN-LSTM, or on image-based representations, both of which increase computational load. This study aims to develop an efficient, lightweight, pure Convolutional Neural Network (CNN) model based on alignment-free encoding to classify five SARS-CoV-2 Variants of Concern (VOCs): Alpha, Beta, Delta, Gamma, and Omicron, using only the Spike gene sequence. The dataset consists of 5,000 Spike gene sequences, represented with integer encoding and standardized to a common length with zero-padding. The proposed lightweight CNN architecture consists of four 1D convolution layers with a total of approximately 1.6 million parameters. On the test set, the model achieves an overall accuracy of 98.93%. Precision, recall, and F1-score each average 0.99, and ROC curve analysis shows AUC values above 0.99 for all variants. These results indicate that the approach is efficient and effective, offering a fast, scalable, and resource-efficient solution to support real-time genomic surveillance in future pandemic mitigation.
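The integer encoding and zero-padding step mentioned above can be illustrated with a minimal Python sketch. The abstract does not state the actual mapping or padding scheme, so the nucleotide codes (A=1, C=2, G=3, T=4), the code 5 for ambiguous bases, right-side padding, and the helper names `integer_encode`, `zero_pad`, and `encode_batch` are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical mapping: the abstract does not specify which integers are used.
NUCLEOTIDE_TO_INT = {"A": 1, "C": 2, "G": 3, "T": 4}
UNKNOWN = 5  # assumed code for ambiguous bases such as N
PAD = 0      # zero is reserved for padding


def integer_encode(sequence):
    """Map each nucleotide to an integer; no alignment is required."""
    return [NUCLEOTIDE_TO_INT.get(base, UNKNOWN) for base in sequence.upper()]


def zero_pad(encoded, length):
    """Right-pad with zeros to `length`, truncating longer inputs (assumed scheme)."""
    return encoded[:length] + [PAD] * max(0, length - len(encoded))


def encode_batch(sequences, length=None):
    """Encode sequences and pad them to a common length (default: the longest)."""
    encoded = [integer_encode(s) for s in sequences]
    if length is None:
        length = max((len(e) for e in encoded), default=0)
    return [zero_pad(e, length) for e in encoded]


if __name__ == "__main__":
    # Toy fragments, not real Spike gene sequences.
    for row in encode_batch(["ATGTTT", "ATGN", "acg"]):
        print(row)
```

Each sequence becomes a fixed-length vector of small integers, which is the kind of input a 1D convolutional network can consume directly, typically through an embedding or one-hot layer.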
Copyrights © 2026