Big Data is a term used to describe the growth of large volumes of data, both structured and unstructured. Big Data has three main characteristics: volume, velocity, and variety. A problem that arises with the development of Big Data is how to store the data. Data that keeps growing over time requires correspondingly large storage space, which cannot be provided if the storage resides on a single machine (single node/host). A distributed file system is a storage and file management module that spans multiple machines (multiple nodes/hosts). This study aims to compare the performance of two file systems, GlusterFS and HDFS, for distributed file storage under striped and replicated distribution scenarios. The study is limited to measuring file system performance for file write and read operations. The test results show that GlusterFS is the lighter-weight performer on file write operations, with a throughput of 44.54 MBps, an execution time of 58.54 seconds, CPU usage of 54.83%, and memory usage of 3.6%. HDFS performs best on file read operations, with an average throughput of 194.37 MBps, an execution time of 16.01 seconds, CPU usage of 86.9%, and memory usage of 18.5%.
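To make the reported metrics concrete, the following minimal sketch illustrates how the execution time and throughput of a write/read operation could be measured. It is not the measurement procedure used in this study. It assumes the file system under test is exposed as a locally mounted directory, and that MBps means 10^6 bytes per second. The function names (`throughput_mbps`, `measure_write`, `measure_read`) and the chunk size are hypothetical choices made for illustration.

```python
import os
import time


def throughput_mbps(num_bytes, seconds):
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes (an assumption)."""
    if seconds <= 0:
        return float("inf")
    return (num_bytes / 1_000_000) / seconds


def measure_write(path, size_bytes, chunk_size=1024 * 1024):
    """Write size_bytes of zero bytes to path.

    Returns (elapsed_seconds, throughput_in_MBps).
    """
    chunk = b"\0" * chunk_size
    remaining = size_bytes
    start = time.perf_counter()
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])
            remaining -= n
        # Flush buffers so the timing includes handing data to the file system.
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return elapsed, throughput_mbps(size_bytes, elapsed)


def measure_read(path, chunk_size=1024 * 1024):
    """Read the whole file at path in chunks.

    Returns (bytes_read, elapsed_seconds, throughput_in_MBps).
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            total += len(data)
    elapsed = time.perf_counter() - start
    return total, elapsed, throughput_mbps(total, elapsed)


if __name__ == "__main__":
    import tempfile

    # A temporary local directory stands in for a mounted volume here.
    with tempfile.TemporaryDirectory() as mount_point:
        target = os.path.join(mount_point, "testfile.bin")
        w_time, w_tp = measure_write(target, 10 * 1_000_000)
        n_read, r_time, r_tp = measure_read(target)
        print(f"write: {w_time:.3f} s, {w_tp:.2f} MBps")
        print(f"read:  {r_time:.3f} s, {r_tp:.2f} MBps ({n_read} bytes)")
```

In a real comparison, the target path would point at the mounted GlusterFS or HDFS volume, the page cache would need to be accounted for on reads, and CPU and memory usage would be sampled separately with a system monitoring tool.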
Copyright © 2019