SkyhookDM

Websites: skyhookdm.com, IRIS-HEP project
Funding: DOE ASCR DE-NA0003525 (FWP 20-023266): UCSC subcontractor of Sandia National Labs; NSF OAC-1836650; NSF CNS-1764102; NSF CNS-1705021; and CROSS
Overview: USENIX ;login: Summer ’20

SkyhookDM is an extension of Ceph for the scalable storage of tables and for offloading common data management operations on them, including selection, projection, aggregation, and indexing, as well as user-defined functions. The goal of SkyhookDM is to transparently scale out data management operations across many storage servers, leveraging the scale-out and availability properties of Ceph while significantly reducing the CPU cycles and interconnect bandwidth spent on unnecessary data transfers. The SkyhookDM architecture is also designed to transparently optimize for future storage devices of increasing heterogeneity and specialization. All data moved from the Ceph OSDs to the client is in Apache Arrow format.
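To make the offloading idea concrete, here is a minimal toy model in plain Python. It is not SkyhookDM's or Ceph's API: the `StorageServer` class, its `read_all` and `scan` methods, and the example columns (`id`, `energy`, `noise`) are all hypothetical, and plain dictionaries stand in for the Arrow data that SkyhookDM actually transfers. The sketch only illustrates why running selection and projection on the storage servers moves less data than filtering on the client.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class StorageServer:
    """Hypothetical stand-in for one storage server holding a table partition."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def read_all(self) -> List[Row]:
        # No offloading: every row and every column is shipped to the client.
        return [dict(r) for r in self.rows]

    def scan(self, predicate: Callable[[Row], bool], columns: List[str]) -> List[Row]:
        # Offloaded selection (predicate) and projection (columns):
        # only matching rows and requested columns leave the server.
        return [{c: r[c] for c in columns} for r in self.rows if predicate(r)]


def cells(rows: List[Row]) -> int:
    """Count transferred values as a rough proxy for interconnect traffic."""
    return sum(len(r) for r in rows)


def is_high(row: Row) -> bool:
    return row["energy"] >= 8.0


# Three servers, each holding 100 rows of a 3-column table (made-up data).
servers = [
    StorageServer(
        [{"id": i, "energy": float(i % 10), "noise": 0.0} for i in range(start, start + 100)]
    )
    for start in (0, 100, 200)
]

# Client-side processing: fetch everything, then filter and project locally.
baseline = [r for s in servers for r in s.read_all()]
client_result = [{"id": r["id"]} for r in baseline if is_high(r)]

# Offloaded processing: each server filters and projects its own partition.
partials = [s.scan(is_high, ["id"]) for s in servers]
offloaded_result = [r for p in partials for r in p]

moved_baseline = cells(baseline)
moved_offloaded = sum(cells(p) for p in partials)

print(moved_baseline, moved_offloaded)
```

Both paths return the same 60 rows, but in this toy setup the client-side path moves 900 values while the offloaded path moves 60, and the filtering work is spread across the three servers instead of being done on the client.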

Carlos Maltzahn
Adjunct Professor, Founder & Director of CROSS

My research interests include programmable storage systems, big data storage & processing, scalable data management, distributed systems performance management, and practical reproducible research.