International Journal of Advances in Intelligent Informatics
Vol 8, No 2 (2022): July 2022

Cluster analysis and ensemble transfer learning for COVID-19 classification from computed tomography scans

Lyubomir Gotsev (State University of Library Studies and Information Technologies, Sofia, Bulgaria)
Ivan Mitkov (State University of Library Studies and Information Technologies, Sofia, Bulgaria)
Eugenia Kovatcheva (State University of Library Studies and Information Technologies, Sofia, Bulgaria)
Boyan Jekov (State University of Library Studies and Information Technologies, Sofia, Bulgaria)
Roumen Nikolov (State University of Library Studies and Information Technologies, Sofia, Bulgaria)
Elena Shoikova (State University of Library Studies and Information Technologies, Sofia, Bulgaria)
Milena Petkova (State University of Library Studies and Information Technologies, Sofia, Bulgaria)



Article Info

Publish Date
31 Jul 2022

Abstract

The paper aims to demonstrate the convergent, synergistic application of deep learning methods and techniques to improve the classification of COVID-19 computed tomography scans by addressing the data and protocol challenges identified in an analysis of related work. The study tests the proposed strategy: data standardization and normalization to achieve proper contrast and resolution; k-means clustering and group shuffle split to avoid data leakage; and augmentation and transfer learning to deal with limited sample size and overfitting. All of these steps are carried out before ensemble learning is applied. VGG-16, DenseNet-201, and Inception v3 are the pre-trained networks used to build the base models for the proposed stacking model, in which a vector of their predictions is fed to a meta-learner. All four classifiers are measured and compared. Various confusion-matrix-based and weighted evaluation metrics are considered: accuracy, recall, precision, F-measure, specificity, and AUC. Critical measurements, such as negative predictive value, false-positive rate, false-negative rate, and false discovery rate, are also presented. The evaluated classifiers achieve high results, with AUC between 0.95 and 1. However, the stacked method is the most reliable. The ensemble approach enhances the described strategy and has three main advantages: outperforming the base models, reducing data pitfalls, and decreasing generalization error. It can serve as a baseline to increase performance quality and mitigate the risk of bias in the field.
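
As an illustration of the confusion-matrix-based measurements listed in the abstract, the following minimal sketch derives them from raw counts using their standard definitions. It is not the authors' evaluation code: the function name `binary_metrics`, the binary (COVID versus non-COVID) framing, and the example counts are assumptions made for illustration. AUC and class-weighted averaging are left out because they need per-sample scores and per-class supports, which the abstract does not provide.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive standard confusion-matrix metrics from raw counts.

    tp, fp, tn, fn are the true-positive, false-positive,
    true-negative, and false-negative counts of a binary classifier.
    """
    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    recall = ratio(tp, tp + fn)        # sensitivity, true-positive rate
    precision = ratio(tp, tp + fp)     # positive predictive value
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "precision": precision,
        "f_measure": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),  # true-negative rate
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "fpr": ratio(fp, fp + tn),          # false-positive rate
        "fnr": ratio(fn, fn + tp),          # false-negative rate
        "fdr": ratio(fp, fp + tp),          # false discovery rate
    }


# Hypothetical counts chosen for illustration; NOT results from the paper.
m = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

The complementary pairs give a quick consistency check: false-positive rate equals one minus specificity, false-negative rate equals one minus recall, and false discovery rate equals one minus precision.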

Copyright © 2022






Journal Info

Abbrev

IJAIN

Subject

Computer Science & IT

Description

International Journal of Advances in Intelligent Informatics (IJAIN), e-ISSN 2442-6571, is a peer-reviewed open-access journal published three times a year in English. It provides scientists and engineers throughout the world for the exchange and dissemination of theoretical and ...