
CMPS 221: Advanced Operating Systems
Fall 2011

Sample Paper Summary

Name: Scott Brandt


Paper: Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long,
       and Carlos Maltzahn, "Ceph: A Scalable, High-Performance, Distributed
       Object-based Storage System," Symposium on Operating Systems Design
       and Implementation (OSDI '06), Seattle, Washington, November 6-8, 2006.

1. What is the problem the authors are trying to solve?

  Existing storage systems do not scale well to petabytes of data and 
  terabytes/second throughput.

2. What other approaches or solutions existed at the time that this work was done?

  Many distributed file systems existed at the time. NFS is the standard
  for distributed file systems, while Lustre and the Panasas file system
  are distributed object-based file systems.

3. What was wrong with the other approaches or solutions?

  All have limitations that prevent them from scaling to the desired level.
  Block-based file systems struggle to manage the enormous number of blocks
  such a system contains, and other object-based file systems fail to take
  full advantage of the object-based paradigm because they still maintain
  per-file lists of objects.
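
  To make the scale concrete, here is a back-of-the-envelope sketch in
  Python (the 2 PB data size, 4 KB block size, and 4 MB object size are
  assumptions chosen for illustration, not figures from the paper):

    # Rough count of allocation entries a block-based vs. an object-based
    # system would have to track at petabyte scale (assumed sizes).
    data_bytes   = 2 * 10**15          # 2 PB of data (assumed target scale)
    block_bytes  = 4 * 1024            # a typical 4 KB file system block
    object_bytes = 4 * 1024**2         # an assumed 4 MB object size

    print(f"blocks to track:  {data_bytes // block_bytes:.3e}")   # ~5e11
    print(f"objects to track: {data_bytes // object_bytes:.3e}")  # ~5e8

  Tracking hundreds of billions of block pointers is impractical, which
  motivates both the move to objects and the computed placement described
  in the next answer.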

4. What is the authors' approach or solution?

  The authors' solution includes:
  - Object-based storage devices
  - A globally known mapping function for locating file data instead of
    object lists (see the sketch after this list)
  - A scalable metadata manager that dynamically redelegates authority
    for directory subtrees based on load
  - A distributed, autonomous system for managing the object stores
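
  To illustrate the idea behind the mapping function, here is a minimal
  Python sketch of computed placement. This is not the paper's actual CRUSH
  algorithm, which also accounts for device weights and failure domains;
  the device names and parameters below are made up.

    import hashlib

    def place_object(file_id, object_no, devices, replicas=2):
        """Pick the devices holding an object purely by hashing, so any
        client with the device list can locate data without a lookup table."""
        def score(device):
            key = f"{file_id}:{object_no}:{device}".encode()
            return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
        # Highest-random-weight (rendezvous) hashing: every client computes
        # the same ranking, so placement is consistent across the cluster.
        return sorted(devices, key=score, reverse=True)[:replicas]

    devices = [f"osd{i}" for i in range(8)]
    print(place_object(file_id=42, object_no=0, devices=devices))

  Because placement is a pure function of the identifiers and the device
  list, no per-file object list has to be stored or consulted.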

5. Why is it better than the other approaches or solutions?

  It scales to petabytes, provides nearly linear performance improvements
  as storage devices are added, degrades gracefully as storage devices are
  removed, and provides very high performance.

6. How did they test their solution?

  They ran the components of the storage system and measured their
  performance under various workloads. Data performance was tested on a
  single object store and on several object stores; metadata performance
  was tested on a large cluster.

7. How does it perform?

  Performance is very good. The system appears to achieve its goals,
  although scalability could be improved in workloads where many clients
  share the same files or directories.

8. Why is this work important?

  This work is important because storage systems continue to grow in size 
  and data is becoming increasingly important.

3+ comments/questions

  * Why didn't they directly compare the performance of their system against 
    that of any other storage systems?

  * What happens if you scale to exabytes? Will the system still work? What
    factors will limit its ability to scale further?

  * How much of the improvement is due to CRUSH, and how much to the design 
    of the other parts of the system? Why didn't they do any tests to isolate
    the benefits of the individual design decisions?
 
