papers

Towards Physical Design Management in Storage Systems

In the post-Moore era, systems and devices with new architectures will arrive at a rapid rate with significant impacts on the software stack. Applications will not be able to fully benefit from new architectures unless they can delegate adapting to …

MBWU: Benefit Quantification for Data Access Function Offloading

The storage industry is considering new kinds of storage devices that support data access function offloading, i.e. the ability to perform data access functions on the storage device itself as opposed to performing it on a separate compute system …

Reproducible Computer Network Experiments: A Case Study Using Popper

Computer network research experiments can be broadly grouped in three categories: simulated, controlled, and real-world experiments. Simulation frameworks, experiment testbeds and measurement tools, respectively, are commonly used as the platforms …

Skyhook: Programmable storage for databases

Ceph is an open source distributed storage system that is object-based and massively scalable. Ceph provides developers with the capability to create data interfaces that can take advantage of local CPU and memory on the storage nodes (Ceph Object …

Spotting Black Swans With Ease: The Case for a Practical Reproducibility Platform

Advances in agile software delivery methodologies and tools (commonly referred to as DevOps) have not yet materialized in academic scenarios such as university, industry and government laboratories. In this position paper we make the case for Black …

Taming Performance Variability

The performance of compute hardware varies: software run repeatedly on the same server (or a different server with supposedly identical parts) can produce performance results that differ with each execution. This variation has important effects on …

Tintenfisch: File System Namespace Schemas and Generators

The file system metadata service is the scalability bottleneck for many of today’s workloads. Common approaches for attacking this “metadata scaling wall” include: caching inodes on clients and servers, caching parent inodes for path traversal, and …

Popper Pitfalls: Experiences Following a Reproducibility Convention

We describe the four publications we have tried to make reproducible and discuss how each paper has changed our workflows, practices, and collaboration policies. The fundamental insight is that paper artifacts must be made reproducible from the start …

Cudele: An API and Framework for Programmable Consistency and Durability in a Global Namespace

HPC and data center scale application developers are abandoning POSIX IO because file system metadata synchronization and serialization overheads of providing strong consistency and durability are too costly -- and often unnecessary -- for their …

Programmable Caches with a Data Management Language & Policy Engine

Our analysis of the key-value activity generated by the ParSplice molecular dynamics simulation demonstrates the need for more complex cache management strategies. Baseline measurements show clear key access patterns and hot spots that offer …