Ph.D. Student and Research Assistant
I'm a second-year PhD student at the Department of Computer Sciences, UC Santa Cruz, working with Prof. Ethan Miller. My research interests lie in non-volatile memory, operating systems, and distributed systems.
NVM Checkpointing Library (ongoing). SSRC, Santa Cruz, 05/2017-now
NVMCKPT is a persistent heap management system that we are currently designing and implementing as a runtime library. Its checkpointing semantics and implementation can enable higher performance and are more NVM-friendly than log-based durable transactions. NVMCKPT differs from previous work in two ways. First, previous NVM programming models are bound by transactional semantics. As previous work has pointed out, the need to enforce strict intra- and inter-transaction write ordering incurs significant overhead. We propose new checkpoint semantics under which write ordering can be easily relaxed for better performance. NVMCKPT can gain further performance from its optional asynchronous checkpointing, in which checkpoints are taken at a configurable interval at a consistent program state. Second, durable transactions are commonly built on logging. However, logging often requires copying old or new data in the critical path, which seriously degrades application performance. We use copy-on-write (CoW) checkpointing to minimize data copying in the critical path and reduce overall NVM writes. During checkpointing, the persistent heap is marked read-only, and updates to persistent data structures are written to a new location (remapped). To support remapping, NVMCKPT presents the application with a virtual persistent heap and maintains a mapping between the virtual persistent heap and the physical one. We propose to use sub-page validity to minimize the CoW/mapping overhead for small/large object updates.
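As a rough illustration of the remapping idea described above, here is a minimal Python sketch of a virtual heap whose pages are copied on the first write after a checkpoint. It is not NVMCKPT's code: the class `CowHeapSketch`, its methods, and the tiny page size are all hypothetical, the "persistent" pages are ordinary in-memory byte arrays, and sub-page validity and reclamation of old pages are left out.

```python
class CowHeapSketch:
    """Toy virtual heap with copy-on-write checkpoints (illustrative only)."""

    def __init__(self, num_pages, page_size=4):
        self.page_size = page_size
        # Physical pages live in a growing list; the index is the physical page number.
        self.physical = [bytearray(page_size) for _ in range(num_pages)]
        # Virtual page number -> physical page number.
        self.mapping = list(range(num_pages))
        # Mapping captured by the last checkpoint (None before the first one).
        self.checkpoint_mapping = None
        # Virtual pages that must not be updated in place.
        self.read_only = set()
        self.pages_copied = 0

    def checkpoint(self):
        # Remember the current mapping and mark every page read-only.
        self.checkpoint_mapping = list(self.mapping)
        self.read_only = set(range(len(self.mapping)))

    def write(self, vpage, offset, data):
        if offset < 0 or offset + len(data) > self.page_size:
            raise ValueError("write does not fit inside one page")
        if vpage in self.read_only:
            # First write since the checkpoint: copy the page and remap it.
            old_page = self.physical[self.mapping[vpage]]
            self.physical.append(bytearray(old_page))
            self.mapping[vpage] = len(self.physical) - 1
            self.read_only.discard(vpage)
            self.pages_copied += 1
        page = self.physical[self.mapping[vpage]]
        page[offset:offset + len(data)] = data

    def read(self, vpage):
        return bytes(self.physical[self.mapping[vpage]])

    def recover(self):
        # Simulated crash recovery: fall back to the checkpointed mapping.
        if self.checkpoint_mapping is None:
            raise RuntimeError("no checkpoint has been taken")
        self.mapping = list(self.checkpoint_mapping)
        self.read_only = set(range(len(self.mapping)))


heap = CowHeapSketch(num_pages=2)
heap.write(0, 0, b"ab")   # in-place update, no checkpoint exists yet
heap.checkpoint()
heap.write(0, 0, b"zz")   # triggers one page copy and a remap
heap.write(0, 2, b"yy")   # same page again, no further copy
heap.recover()            # page 0 is back to its checkpointed contents
```

In this sketch only the first write to a page after a checkpoint pays for a copy; later writes to the same page go straight to the remapped location, and the checkpointed pages are never modified.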
SSD Friendly Caching for Data Center Workloads. ICT, Beijing, 05/2015-03/2016
SSD caching faces two challenges: 1) SSDs have limited write endurance, which requires reducing the amount of data written to the SSD, and 2) data-center workloads exhibit diverse I/O access patterns, which requires identifying the patterns that are friendly to SSD caching. We propose S-RAC, an SSD cache manager that uses block re-adding and ghost cache adaptation to retain SSD-friendly blocks in the SSD. The evaluation shows that S-RAC reduces the SSD write amount while improving or maintaining the cache hit ratio. The S-RAC prototype is implemented in the Linux kernel on top of dm-cache.
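To give a feel for how a ghost cache can cut SSD writes, here is a small Python sketch of a generic ghost-cache admission filter: a block is written to the cache only on its second miss within a recent window of block IDs. This is a common technique and only an assumption about the general flavor of the approach; it is not the S-RAC algorithm, and the class `GhostAdmissionCache` and its counters are hypothetical.

```python
from collections import OrderedDict


class GhostAdmissionCache:
    """LRU cache that admits a block only if its ID is in the ghost list."""

    def __init__(self, cache_size, ghost_size):
        self.cache_size = cache_size
        self.ghost_size = ghost_size
        self.cache = OrderedDict()  # cached block IDs, oldest first
        self.ghost = OrderedDict()  # recently missed block IDs (no data), oldest first
        self.ssd_writes = 0
        self.hits = 0

    def access(self, block):
        """Return True on a cache hit, False on a miss."""
        if block in self.cache:
            self.cache.move_to_end(block)
            self.hits += 1
            return True
        if block in self.ghost:
            # Second miss within the ghost window: worth one SSD write.
            del self.ghost[block]
            if len(self.cache) >= self.cache_size:
                self.cache.popitem(last=False)  # evict the least recently used block
            self.cache[block] = None
            self.ssd_writes += 1
        else:
            # First miss: remember the ID only, write nothing to the SSD.
            self.ghost[block] = None
            if len(self.ghost) > self.ghost_size:
                self.ghost.popitem(last=False)
        return False


cache = GhostAdmissionCache(cache_size=2, ghost_size=4)
for block in [1, 2, 3, 4, 1, 1]:
    cache.access(block)
# Blocks 2, 3 and 4 were seen once and never written to the SSD;
# block 1 was admitted on its second miss and hit on its third access.
```

The design choice here is to trade a possible extra miss per block for fewer writes: blocks touched only once never reach the SSD.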
Cache Oriented SSD Management. ICT, Beijing, 08/2014-06/2015
This work is driven by the observation that a unified view of garbage collection and cache eviction in SSD caches can reduce overall write amplification, thus boosting application performance and extending SSD lifetime. COSMO integrates cache and flash management. It explicitly controls page placement on SSDs, clustering data in a way that facilitates future garbage collection.
Process Checkpointing on Systems with Persistent Memory. SSRC, Santa Cruz, 03/2017-06/2017
Building applications directly on top of persistent main memory (PM) is promising, but system failures might leave in-memory data structures in an inconsistent state. In this work, we investigate process checkpointing techniques, which have been widely used for process migration and high-performance computing. Checkpointing is promising for PM-attached systems because it can address crash consistency transparently. We proposed PM-aware checkpointing, which leverages the kernel's CoW semantics to facilitate checkpointing in a PM-attached system. A prototype of PM-aware checkpointing is implemented in the FreeBSD kernel.
Understanding the Latency of File Systems on NVM. SSRC, Santa Cruz, 01/2017-03/2017
In this work, we focus on understanding the latency of file systems on byte-addressable NVM (BNVM). Previous performance studies mainly compare the overall performance of file systems (legacy as well as BNVM-aware) on BNVM under different configurations and workloads. In our work, we instead aim to understand each source of overhead in the data access path. We believe such a detailed understanding will facilitate the future design and optimization of BNVM-aware file systems. Our work first identifies potential overheads for different file system operations (open, read, write, etc.). The data access path is then broken down into parts to isolate those overheads. Two file systems, ext3 and PMFS, are instrumented. Our preliminary results confirm the necessity of a lightweight software layer for low-latency NVMs.
Memory Trace Analyzer. 07/2017-09/2017
The analyzer was originally developed to experimentally model the performance of NVM checkpointing and traditional logging. It contains a trace generator that can generate memory accesses with customizable spatial/temporal locality, read/write ratio, etc. It lets you implement a set of "instrumentors", in which you can simulate the system's behavior under a given memory access. Then, with all the collected information (e.g., CPU cache miss ratio, how many cache lines are dirtied in a transaction), we can estimate the performance of a given system under various memory access patterns.
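The generator/instrumentor split described above might look roughly like the following Python sketch. It is not the analyzer's actual code: `generate_trace`, `DirtyLineCounter`, `run`, the 8-byte access granularity, and the 64-byte cache line are all assumptions made for illustration, and spatial locality is reduced to a single probability that the next access is sequential.

```python
import random

CACHE_LINE = 64  # assumed cache line size in bytes
WORD = 8         # assumed access granularity in bytes


def generate_trace(n, working_set_bytes, locality, write_ratio, seed=0):
    """Return a list of (op, address) pairs.

    `locality` is the probability that an access follows the previous one
    sequentially; otherwise the address is drawn uniformly at random.
    `write_ratio` is the probability that an access is a write.
    """
    rng = random.Random(seed)
    addr = 0
    trace = []
    for _ in range(n):
        if rng.random() < locality:
            addr = (addr + WORD) % working_set_bytes
        else:
            addr = rng.randrange(0, working_set_bytes, WORD)
        op = "W" if rng.random() < write_ratio else "R"
        trace.append((op, addr))
    return trace


class DirtyLineCounter:
    """Instrumentor that counts reads, writes, and distinct dirtied cache lines."""

    def __init__(self):
        self.reads = 0
        self.writes = 0
        self.dirty_lines = set()

    def on_access(self, op, addr):
        if op == "W":
            self.writes += 1
            self.dirty_lines.add(addr // CACHE_LINE)
        else:
            self.reads += 1


def run(trace, instrumentors):
    # Feed every access to every instrumentor.
    for op, addr in trace:
        for instrumentor in instrumentors:
            instrumentor.on_access(op, addr)


counter = DirtyLineCounter()
run(generate_trace(1000, working_set_bytes=4096, locality=0.9, write_ratio=0.5), [counter])
print(counter.reads, counter.writes, len(counter.dirty_lines))
```

Counts such as the number of dirtied lines per interval could then be fed into a cost model for checkpointing or logging; that model is outside the scope of this sketch.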
URL Source Detection, Tracking and Statistics Module. 09/2013-02/2014
It contains: i) a web crawler that collects webpages from given domains; ii) a naive Bayes classifier trained to identify "hub" webpages; iii) a monitor that tracks the update frequency of the "hub" webpages.
Yuanjiang Ni, Ji Jiang, Dejun Jiang, Xiaosong Ma, Jin Xiong, and Yuangang Wang. "S-RAC: SSD Friendly Caching for Data Center Workloads", SYSTOR '16, Haifa, Israel, June 2016.