Issue No. 11 - Nov. (2014 vol. 63)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2013.143
Guangyan Zhang, Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Weimin Zheng, Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Keqin Li, Dept. of Comput. Sci., State Univ. of New York, New Paltz, NY, USA
In RAID-5, data and parity blocks are distributed across all disks in a round-robin fashion. Previous approaches to RAID-5 scaling preserve this round-robin distribution and therefore require all the data to be migrated. In this paper, we rethink the RAID-5 data layout and propose a new approach to RAID-5 scaling called MiPiL. First, MiPiL minimizes data migration while maintaining a uniform data distribution, not only for regular data but also for parity data. It moves the minimum number of data blocks from old disks to new disks needed to regain a uniform data distribution. Second, MiPiL optimizes online data migration with piggyback parity updates and lazy metadata updates. Piggyback parity updates during data migration reduce the number of additional XOR computations and disk I/Os. Lazy metadata updates minimize the number of metadata writes without compromising data reliability. We implement MiPiL in Linux Kernel 2.6.32.9 and evaluate its performance by replaying three real-system traces. The results demonstrate that MiPiL consistently outperforms the existing “moving-everything” approach by 74.07-77.57% in redistribution time and by 25.78-70.50% in user response time. The experiments also illustrate that under the WebSearch2 and Financial1 workloads, the performance of the RAID-5 scaled using MiPiL is almost identical to that of the round-robin RAID-5.
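The migration argument in the abstract can be illustrated with a small calculation. The sketch below is not MiPiL's algorithm. It assumes a simplified layout in which block b sits on disk b mod (number of disks) and parity rotation is ignored, and its function names are hypothetical. It compares the fraction of blocks that land on a different disk when a round-robin layout is preserved after adding disks with the lower bound for any uniform layout: once n disks are added to m, the new disks must together hold n/(m+n) of the data, and all of it has to come from the old disks.

```python
from fractions import Fraction
from math import gcd


def round_robin_moved_fraction(old_disks: int, added_disks: int) -> Fraction:
    """Fraction of blocks that change disk when round-robin is preserved.

    Simplifying assumption: block b lives on disk b % disk_count and
    parity rotation is ignored. Blocks that stay on the same disk may
    still change offset, so this undercounts the real migration cost.
    """
    new_disks = old_disks + added_disks
    # The placement pattern repeats every lcm(old_disks, new_disks) blocks.
    period = old_disks * new_disks // gcd(old_disks, new_disks)
    moved = sum(1 for b in range(period) if b % old_disks != b % new_disks)
    return Fraction(moved, period)


def minimal_moved_fraction(old_disks: int, added_disks: int) -> Fraction:
    """Lower bound on migrated data for any uniform post-scaling layout."""
    return Fraction(added_disks, old_disks + added_disks)


if __name__ == "__main__":
    for m, n in [(4, 1), (3, 2)]:
        print(m, n, round_robin_moved_fraction(m, n), minimal_moved_fraction(m, n))
```

Under these assumptions, scaling from 4 to 5 disks moves 4/5 of the blocks to a different disk if round-robin is preserved, whereas a uniform distribution only requires 1/5 of the data to move.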
Metadata, Disk arrays, Data models, Parity check codes
Guangyan Zhang, Weimin Zheng and Keqin Li, "Rethinking RAID-5 Data Layout for Better Scalability," in IEEE Transactions on Computers, vol. 63, no. 11, pp. 2816-2828, 2014.