Ceph as a Scalable Alternative to the Hadoop Distributed File System

Abstract

The Hadoop Distributed File System (HDFS) relies on a single metadata server, which places a hard limit on the file system's maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, avoids this scaling limit. We describe Ceph and its components and provide instructions for installing a demonstration system that can be used with Hadoop.

Publication
;login: The USENIX Magazine