Checking Crash Consistency on Storage Systems


Presented by: Feng Qin (The Ohio State University)

Date: Tuesday 12th June 2018 - 09:30

Venue: School of Electronic and Information Engineering Meeting Room 503

Event Type: Lectures


Crash consistency is an important property for modern storage systems. Unfortunately, it is difficult to preserve across the complex storage stack. At the top of the stack, applications enable users to perform various functions and accomplish different tasks. The middle layers, including databases, file systems, and block layers, introduce all kinds of optimizations such as caching and journaling. At the bottom of the stack, new components such as solid-state drives (SSDs) are often ignored or under-studied under adverse conditions. This complexity makes it unprecedentedly challenging to preserve the data consistency of storage systems after crashes.

In this talk, I will mainly present our recent work on checking the crash consistency of databases and applications. In particular, our framework for torturing databases includes carefully crafted workloads to exercise the ACID guarantees, a record/replay subsystem to allow the controlled injection of simulated power faults, a ranking algorithm to prioritize where to inject faults based on our experience, and a multi-layer tracer to diagnose root causes. Unlike databases, applications provide various functions to users, so checking them requires non-trivial manual effort to specify checking scripts and workloads. To address this key challenge, our proposed approach, C3, automatically generates the testing oracle and checking scripts, making the entire validation process as easy as pressing a single button.