Data Integrity and Availability:
The Challenge of Scale for Modern Storage Systems
Guest Editor's Introduction • Sundara Nagarajan • May 2012
Recent reports from Amazon Web Services (AWS) indicate that the company's S3 storage service will soon hold more than a trillion objects and handle a million requests per second. Clearly, we are living through an era of transformation in storage system architectures designed to deliver continuously scalable service. Consumers and enterprise users want most of their data stored in the most economical manner, with a small part stored for rapid access as needed. Yet even if a storage solution is available free of cost, users are uncompromising on a key property: access to their data on demand, without fail. This invariant raises several challenges for storage system designers. Faults in storage systems can cause latent errors that remain undetected for long periods, until an access uncovers them as failures. Hardware and software defects can cause faults that significantly degrade reliability and availability, leading to data loss or delays in accessing data.
For this issue of Computing Now, I collected a set of articles that highlight the design trade-offs and choices for modern-day storage system architectures with regard to consistency, data integrity, and availability. This theme explores real-world problems and solutions associated with contemporary storage systems.
Storage System Components
A storage system is characterized by a stack of hardware and software components, including:
- Different storage devices, such as solid-state devices, hard disks, removable media, and tape;
- Data-protection mechanisms, such as RAID or erasure codes, at the device level;
- Block and file semantics for storage organization;
- Data-management entities, such as snapshots and clones;
- Space provisioning;
- Network access protocols, and so on.
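To make the device-level data-protection layer concrete, here is a minimal, illustrative sketch of RAID-style XOR parity in Python. It is not a real RAID implementation; the block names and sizes are invented for the example. The idea it shows is the one named in the list above: a parity block computed across N data blocks lets the system rebuild any single lost block from the survivors.

```python
# Illustrative sketch of RAID-style XOR parity (not a real RAID
# implementation): the parity of N data blocks lets us rebuild
# any one lost block from the remaining blocks plus the parity.

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

# Three hypothetical "devices" hold data blocks; a fourth holds their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing device 1: XOR the survivors with parity to rebuild it.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

Erasure codes generalize this scheme so that a system can tolerate the loss of several devices at once, at the cost of more parity computation and storage.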
These elements are combined with storage-efficiency techniques such as deduplication and compression to deliver the optimum access speed and cost per unit of capacity. Modern storage systems deploy different kinds of storage devices with different cost, performance, and reliability characteristics. These systems are essentially networks of components interconnected via high-speed links and working together to deliver storage services. Time to data (latency) and data loss are two important quality-of-service parameters that every storage system must drive toward zero.
Hardware storage devices have evolved rapidly in terms of the amount of data they can store. However, their reliability has not progressed at the same rate. Techniques such as RAID provide the first line of defense at the storage subsystem level, but are these techniques scalable to terabytes of capacity and beyond? Augmenting device capabilities and low-level error-correction techniques, sophisticated file system designs offer data organization, space provisioning, and data management functionality. The primary goal of this layer is to meet the applications' performance and cost-efficiency needs. Such designs use metadata to store and retrieve blocks of data rapidly, and they achieve consistency by ensuring that the metadata and data reflect the latest updates. As the amount of data stored in a file system grows to petabytes and exabytes, the consistency characteristics a system can maintain vary with its file system design.
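One common way file systems keep metadata and data consistent across crashes is write-ahead journaling. The toy sketch below (all names are invented for illustration; real journaling file systems are far more elaborate) shows the core idea: log the intent of an update before applying it in place, so that a crash mid-update can be repaired by replaying committed journal entries.

```python
# Toy write-ahead journaling sketch (illustrative only): metadata
# updates are appended to a journal before being applied in place,
# so a crash can be recovered by replaying committed entries.

class JournaledStore:
    def __init__(self):
        self.journal = []   # append-only intent log
        self.metadata = {}  # the "on-disk" metadata

    def update(self, key, value):
        self.journal.append(("commit", key, value))  # 1. log the intent
        self.metadata[key] = value                   # 2. apply in place

    def recover(self):
        """Replay the journal so metadata reflects all committed updates."""
        for op, key, value in self.journal:
            if op == "commit":
                self.metadata[key] = value

store = JournaledStore()
store.update("inode/7", "block-42")

# Simulate a crash that loses the in-place metadata but not the journal.
store.metadata = {}
store.recover()
assert store.metadata["inode/7"] == "block-42"
```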
Traditional solutions to hard crashes, malfunctioning hardware, software defects, and other such threats to data integrity are no longer sufficient. The growing demand for storage capacity in the IT infrastructure poses new technical challenges. To effectively and efficiently fulfill this demand, academic and industrial researchers must find innovative solutions that can scale to petabytes and beyond.
At the Symposium on Principles of Distributed Computing in 2000, Eric Brewer spoke about what came to be known as the CAP theorem: a conjecture that distributed systems can't simultaneously guarantee consistency, availability, and partition tolerance. Because contemporary storage system architectures are inherently large-scale distributed systems, the theorem has attracted considerable attention from storage system designers. In the first article in this month's theme, "CAP Twelve Years Later: How the 'Rules' Have Changed," Brewer reviews the state of the art and offers guidance for designing large-scale systems that balance consistency and availability.
In "Loris — A Dependable, Modular File-Based Storage Stack," Raja Appuswamy, David van Moolenbroek, and Andrew Tanenbaum evaluate the traditional storage stack for reliability, heterogeneity, and flexibility. They then propose a redesign in which most of the storage stack deals with finer-grained failure domains. As a result, it can potentially handle threats to data integrity (data corruption, system failures, and device failures) more effectively than the traditional stack.
Abhishek Rajimwale and colleagues present several intriguing aspects of handling faulty components in modern systems in "Coerced Cache Eviction and Discreet Mode Journaling: Dealing with Misbehaving Disks." This article highlights some of the pioneering work in data integrity and consistency design issues that the team at the University of Wisconsin-Madison Computer Sciences Department has been doing.
Storage efficiency is an important design consideration for modern storage systems. Deduplication reduces the space required for storage by eliminating redundant chunks of data, whereas fault tolerance relies on increased redundancy to handle faults in a system. The evident contradiction between the two is the subject of Eric Rozier and his colleagues' very interesting article, "Modeling the Fault-tolerance Consequences of De-duplication."
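The tension Rozier and his colleagues study can be seen in even a minimal deduplication sketch. Below is a hedged, hypothetical example of fixed-size-chunk, hash-based deduplication (real systems typically use content-defined chunking and additional safeguards; all names here are invented): each unique chunk is stored exactly once, so redundancy disappears, and with it some of the natural fault tolerance that duplicate copies provide.

```python
# Hypothetical sketch of fixed-size-chunk, hash-based deduplication.
# Each unique chunk is stored once, keyed by its hash; files become
# "recipes" of chunk keys. Note the fault-tolerance consequence: one
# corrupted chunk damages every file whose recipe references it.
import hashlib

CHUNK_SIZE = 4  # unrealistically small, for illustration

def dedup_store(data, store):
    """Split data into chunks; keep one copy of each unique chunk."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)  # redundant chunks share one copy
        recipe.append(key)
    return recipe

store = {}
recipe = dedup_store(b"AAAABBBBAAAA", store)  # "AAAA" appears twice

# Three chunks referenced, but only two unique copies kept.
assert len(recipe) == 3 and len(store) == 2

# Reassembling from the recipe reproduces the original data.
assert b"".join(store[k] for k in recipe) == b"AAAABBBBAAAA"
```

Because the duplicate "AAAA" chunks collapse into a single stored copy, losing that one copy now destroys data in two places at once, which is precisely the trade-off between storage efficiency and fault tolerance that the article models.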
"Semantic-Aware Metadata Organization Paradigm in Next-Generation File Systems," by Yu Hua and colleagues, is a novel work on highly scalable storage systems. The article describes the design of a large-scale file system that could significantly reduce performance bottlenecks due to metadata transactions.
As performance demands on storage systems keep rising, designers must be able to concurrently optimize hardware and software components. Darren Kerbyson and colleagues discuss a codesign methodology in "Codesign Challenges for Exascale Systems: Performance, Power, and Reliability."
In addition to the articles in this month's theme, we include video interviews with two thought leaders from the storage domain. Tanya Shastri spoke with Steve Kleiman, senior vice president and chief scientist at NetApp, on the industry's vision of things to come as we face mounting demands on data integrity and availability at increasing scale and diversity of devices. Prof. Remzi Arpaci-Dusseau, of the University of Wisconsin-Madison, has been studying the data-integrity characteristics of storage devices and file systems. In a video interview on our theme, he shares his thoughts on the state of the art in data integrity research. We thank them both for sharing their insights.
Data Integrity and Availability in Storage Systems: An Interview with Steve Kleiman
An Interview with Remzi H. Arpaci-Dusseau
Sundara Nagarajan ("SN") is a technical director at NetApp and a visiting professor at International Institute of Information Technology, Bangalore, India. He is also CN's regional liaison to IEEE Computer Society activities in India. Contact him at s dot nagarajan at computer dot org.