filesystems

PLFS and HDFS: Enabling Parallel Filesystem Semantics In The Cloud

QMDS: A File System Metadata Service Supporting a Graph Data Model-Based Query Language

Design and Implementation of a Metadata-Rich File System

Ceph as a Scalable Alternative to the Hadoop Distributed File System

The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits …

Fusing Data Management Services with File Systems

File systems are the backbone of large-scale data processing for scientific applications. Motivated by the need to provide an extensible and flexible framework beyond the abstractions provided by API libraries for files to manage and analyze …

Mixing Hadoop and HPC Workloads on Parallel Filesystems

MapReduce-tailored distributed filesystems (such as HDFS for Hadoop MapReduce) and parallel high-performance computing filesystems are tailored for considerably different workloads. The purpose of our work is to examine the performance of each …

JabberWocky: Crowd-Sourcing Metadata for Files

Abstract Storage: Moving file format-specific abstractions into petabyte-scale storage systems

High-end computing is increasingly I/O bound as computations become more data-intensive, and data transport technologies struggle to keep pace with the demands of large-scale, distributed computations. One approach to avoiding unnecessary I/O is to …

Building a Parallel File System Simulator

Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost …

How Private are Home Directories?