Indonesian Journal of Electrical Engineering and Computer Science
Vol 4, No 3: December 2016

A New Memory MapReduce Framework for Higher Access to Resources

ZuKuan WEI (University of Electronic Science and Technology of China)
Bo HONG (University of Electronic Science and Technology of China)
JaeHong KIM (YoungDong University)



Article Info

Publish Date
01 Dec 2016

Abstract

The demand for highly parallel data processing platforms is growing due to an explosion in the number of massive-scale data applications in both academia and industry. MapReduce is one of the most important solutions for distributed computing on big data, and this paper builds on Hadoop MapReduce. When processing massive data, MapReduce generates a large amount of dynamic data, but these data are discarded after the task completes. Meanwhile, a large amount of dynamic data is written to HDFS during task execution, which causes much unnecessary I/O cost. In this paper, we analyze existing distributed caching mechanisms and propose a new Memory MapReduce framework that responds in real time to read and write requests from task nodes and maintains related information about the cached data. Performance testing shows that MapReduce with the cache significantly improves I/O performance.
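The caching idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `CacheEntry`, `IntermediateCache`, and `backing_reads` are hypothetical, and a plain dictionary stands in for HDFS. It only shows how an in-memory layer might serve read and write requests from tasks while keeping per-entry information, so that repeated reads avoid the slower store.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    """Cached bytes plus the bookkeeping kept about them (hypothetical fields)."""
    value: bytes
    size: int
    created_at: float = field(default_factory=time.time)
    hits: int = 0


class IntermediateCache:
    """In-memory cache in front of a slower store (a dict standing in for HDFS)."""

    def __init__(self, backing_store):
        self._entries = {}
        self._backing_store = backing_store
        self.backing_reads = 0  # counts reads that had to go to the slow store

    def write(self, key, value):
        # Task output stays in memory instead of being written to the store.
        self._entries[key] = CacheEntry(value=value, size=len(value))

    def read(self, key):
        entry = self._entries.get(key)
        if entry is not None:
            entry.hits += 1
            return entry.value
        # Cache miss: fetch from the backing store once, then keep it in memory.
        self.backing_reads += 1
        value = self._backing_store[key]
        self.write(key, value)
        return value

    def info(self, key):
        # Expose the metadata maintained for a cached item.
        entry = self._entries[key]
        return {"size": entry.size, "hits": entry.hits}
```

In this sketch, a map task would call `write` with its output and a later task would call `read` with the same key; only keys that were never cached reach the backing store. Eviction, replication, and coordination across nodes are deliberately left out.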

Copyrights © 2016