Sample Paper Summary
Name: Scott Brandt
Paper: Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long,
and Carlos Maltzahn, ``Ceph: A Scalable, High-Performance, Distributed
Object-based Storage System,'' Symposium on Operating Systems Design
and Implementation (OSDI '06), Seattle, Washington, November 6-8, 2006,
to appear.
1. What is the problem the authors are trying to solve?
Existing storage systems do not scale well to petabytes of data and
terabytes/second throughput.
2. What other approaches or solutions existed at the time that this work was done?
Many other file systems existed. NFS is the standard for distributed
file systems, and Lustre and the Panasas file system are both
distributed object-based file systems.
3. What was wrong with the other approaches or solutions?
All have limitations that prevent them from scaling to the desired level.
Block-based file systems have problems dealing with the large number of
blocks in such a system. Other object-based file systems fail to take full
advantage of the object-based paradigm and still maintain object lists.
4. What is the authors' approach or solution?
The authors' solution includes:
- Object-based storage devices
- A globally known mapping function for locating file data
(instead of object lists)
- A scalable metadata manager that dynamically redelegates authority
for directory subtrees based on load
- A distributed autonomous system for managing the object stores
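The globally known mapping function is the key departure from object
lists: any client can compute where an object lives from the object's
name and the current cluster map, so no per-file allocation metadata is
needed. A minimal sketch of the idea, using rendezvous
(highest-random-weight) hashing rather than the paper's actual CRUSH
algorithm (which also accounts for device weights and failure domains);
the names `place_object` and `osd0`..`osd3` are illustrative, not from
the paper:

```python
import hashlib

def place_object(object_id: str, osds: list[str], replicas: int = 2) -> list[str]:
    """Deterministically map an object to `replicas` storage devices.

    Hypothetical sketch of a globally known mapping function: every
    client with the same OSD list computes the same placement, so no
    central object list or lookup server is required.
    """
    # Score each OSD by hashing (object, OSD) together; take the top scorers.
    scored = sorted(
        osds,
        key=lambda osd: hashlib.sha256(f"{object_id}:{osd}".encode()).hexdigest(),
        reverse=True,
    )
    return scored[:replicas]

cluster = ["osd0", "osd1", "osd2", "osd3"]
print(place_object("inode123.chunk0", cluster))
```

A nice property of this family of functions is stability: when a device
is added or removed, only the objects that hash to it change location,
which is what lets the real system rebalance incrementally.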
5. Why is it better than the other approaches or solutions?
It scales to petabytes, provides nearly linear performance improvements
as storage devices are added, degrades gracefully as storage devices are
removed, and provides very high performance.
6. How did they test their solution?
They ran parts of the storage system and observed their performance
under various workloads. Data performance was tested on a single object
store and on several object stores. Metadata performance was tested on
a large cluster.
7. How does it perform?
Performance is very good. The system appears to achieve its goals,
although scalability could be improved in certain scenarios where a lot
of sharing occurs.
8. Why is this work important?
This work is important because storage systems continue to grow in size
and data is becoming increasingly important.
3+ comments/questions
* Why didn't they directly compare the performance of their system against
that of any other storage systems?
* What happens if you scale to exabytes? Will the system still work? What
factors will limit its ability to scale further?
* How much of the improvement is due to CRUSH, and how much to the design
of the other parts of the system? Why didn't they do any tests to isolate
the benefits of the individual design decisions?