Garuda - Garba Rujukan Digital

Journal of Computer Science and Engineering (JCSE)

Vol 4, No 1: February (2023)

Moses, Timothy (Unknown)
Abiodun, Oladunjoye John (Unknown)

Publish Date
05 Apr 2023

The central resource manager of Hadoop Yet Another Resource Negotiator (YARN) is a major concern for big data analysis and exploration. This central arbiter is overwhelmed by resource requests from application masters and heartbeat communication from the many node managers in the Hadoop cluster, which degrades the performance of the framework. An earlier attempt to decentralize the resource manager's responsibilities by introducing a new layer in the cluster, the Rack Unit Resource Manager (RU_RM) layer, increased cluster performance but introduced a fault-tolerance concern. This work therefore developed a fault-tolerant model to allow efficient and effective data analysis in the Hadoop cluster. A pseudo-distributed computation was set up with the YARN Scheduler Load Simulator (SLS), and a WordCount operation was performed with varying input sizes. Two fault scenarios were presented. The results showed that, as input size (workload) increases, the running time of the developed fault-tolerant model is slightly higher than that of the existing model, but the difference is negligible compared with the computation bottleneck incurred whenever an RU_RM fails. The developed model therefore performs well in the presence of the failure of a unit (RU_RM) in the cluster.
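
For readers unfamiliar with the WordCount workload mentioned in the abstract, the following is a minimal single-process sketch of its map and reduce phases. It is an illustration only, not the authors' experimental code and not a Hadoop or SLS job; the function names `map_phase`, `reduce_phase`, and `word_count` are hypothetical and chosen here for clarity.

```python
from collections import Counter
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token in a line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_phase(pairs):
    """Sum the counts emitted for each distinct word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map phase over every line, then reduce the combined output."""
    return reduce_phase(chain.from_iterable(map_phase(line) for line in lines))


if __name__ == "__main__":
    # Illustrative input only; the study varied the input size of the workload.
    sample = ["Hadoop YARN schedules containers", "YARN tracks node heartbeats"]
    print(word_count(sample))
```

In a real cluster the map and reduce phases run as separate tasks in containers granted by the resource manager, which is why the running time of this workload is sensitive to how resource requests are handled.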

Journal Info

Abbrev

JCSE

Publisher

Institute of Computer Science and Engineering

Subject

Computer Science & IT

Description

Computer architecture, processor design, operating systems, high-performance computing, parallel processing, computer networks, embedded systems, theory of computation, design and analysis of algorithms, data structures and database systems, ...

A fault-tolerance model for Hadoop rack-aware resource management system