mapreduce
SupMR: Circumventing Disk and Memory Bandwidth Bottlenecks for Scale-up MapReduce
Reading input from primary storage (i.e. the ingest phase) and aggregating results (i.e. the merge phase) are important pre- and …
Michael Sevilla, Ike Nassi, Kleoni Ioannidou, Scott Brandt, Carlos Maltzahn
SIDR: Structure-Aware Intelligent Data Routing in Hadoop
The MapReduce framework is being extended for domains quite different from the web applications for which it was designed, including …
Joe Buck, Noah Watkins, Greg Levin, Adam Crume, Kleoni Ioannidou, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis, Aaron Torres
Compressing Intermediate Keys between Mappers and Reducers in SciHadoop
In Hadoop, mappers send data to reducers in the form of key/value pairs. The default design of Hadoop’s process for transmitting …
Adam Crume, Joe Buck, Carlos Maltzahn, Scott Brandt
SciHadoop Semantic Compression
Adam Crume, Joe Buck, Noah Watkins, Carlos Maltzahn, Scott Brandt, Neoklis Polyzotis
Structure-Aware Intelligent Data Routing in SciHadoop
Joe Buck, Noah Watkins, Greg Levin, Adam Crume, Kleoni Ioannidou, Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis
SciHadoop: Array-based Query Processing in Hadoop
Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific …
Joe Buck, Noah Watkins, Jeff LeFevre, Kleoni Ioannidou, Carlos Maltzahn, Neoklis Polyzotis, Scott A. Brandt
Haceph: Scalable Metadata Management for Hadoop using Ceph
Esteban Molina-Estolano, Carlos Maltzahn, Ben Reed, Scott A. Brandt
Ceph as a Scalable Alternative to the Hadoop Distributed File System
The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a …
Carlos Maltzahn, Esteban Molina-Estolano, Amandeep Khurana, Alex J. Nelson, Scott A. Brandt, Sage A. Weil
Mixing Hadoop and HPC Workloads on Parallel Filesystems
MapReduce-tailored distributed filesystems—such as HDFS for Hadoop MapReduce—and parallel high-performance computing …
Esteban Molina-Estolano, Maya Gokhale, Carlos Maltzahn, John May, John Bent, Scott Brandt