Indonesian Journal of Electrical Engineering and Computer Science
Vol 31, No 2: August 2023

Predictive analytics on COVID-19 data using Hive based on Hadoop cluster

Ali Abbood Khaleel (Bilad Alrafidain University College)
Ali Noori Kareem (Bilad Alrafidain University College)
Laith Hikmet Mahdi (Bilad Alrafidain University College)



Article Info

Publish Date
01 Aug 2023

Abstract

The COVID-19 pandemic has received serious attention from academia, industry, and governments seeking to stop the huge number of deaths and economic disruptions around the world. Many techniques have been used to control the spread of the pandemic by understanding its characteristics and behavior. However, because of the large volume and complex characteristics of COVID-19 data, querying and analyzing such data with conventional tools has become a challenging task. As a result, powerful distributed tools are needed to query and analyze this data effectively. In this paper, a distributed system using Hive on a Hadoop cluster is used to query and analyze COVID-19 data to obtain meaningful information. Hadoop is employed as a scalable and reliable framework to accommodate such large amounts of data. Hive is used as a data warehouse that runs on the Hadoop cluster to perform querying and predictive analytics on huge COVID-19 datasets. Several experiments are performed to evaluate the performance of the proposed system. The experiments show that the proposed system outperforms a relational database management system (RDBMS) in terms of query processing time. They also show that the proposed system is more efficient in terms of data loading, I/O operations, and reading and writing data.

Copyright © 2023