DAOS and Friends: A Proposal for an Exascale Storage System

The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase two of the project starting, it is an excellent …

Automatic and transparent I/O optimization with storage integrated application runtime support

Traditionally, storage has not been part of a programming model's semantics and is added only as an I/O library interface. As a result, programming models, languages, and storage systems are limited in the optimizations they can perform for I/O …

An Innovative Storage Stack Addressing Extreme Scale Platforms and Big Data Applications

Exploring Trade-offs in Transactional Parallel Data Movement

Latency Minimization in SSD Clusters for Free

Modeling a Leadership-scale Storage System

Exascale supercomputers will have the potential for billion-way parallelism. While physical implementations of these systems are currently not available, HPC system designers can develop models of exascale systems to evaluate system design points. …

PLFS and HDFS: Enabling Parallel Filesystem Semantics In The Cloud

Ceph as a Scalable Alternative to the Hadoop Distributed File System

The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits …

Ceph: A Scalable, High-Performance Distributed File System

Ceph provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and …

CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data

Emerging large-scale distributed storage systems are faced with the task of distributing petabytes of data among tens or hundreds of thousands of storage devices. Such systems must evenly distribute data and workload to efficiently utilize available …