EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures
February 1995 (vol. 44 no. 2)
pp. 192-202

Abstract: We present a novel method, which we call EVENODD, for tolerating up to two disk failures in RAID architectures. EVENODD adds only two redundant disks and uses only simple exclusive-OR computations. This redundancy is optimal, in the sense that two failed disks cannot be recovered with fewer than two redundant disks. A major advantage of EVENODD is that it requires only parity hardware, which is typically present in standard RAID-5 controllers. Hence, EVENODD can be implemented on standard RAID-5 controllers without any hardware changes. The most commonly used scheme that employs optimal redundant storage (i.e., two extra disks) is based on Reed–Solomon (RS) error-correcting codes. That scheme requires computation over finite fields and results in a more complex implementation. For example, we show that the complexity of implementing EVENODD in a disk array with 15 disks is about 50% of that required by the RS scheme.

The new scheme is not limited to RAID architectures: it can be used in any system requiring large symbols and relatively short codes, for instance, in multitrack magnetic recording. To this end, we also present a decoding algorithm for one column (track) in error.

Index Terms: RAID architectures, erasure-correcting codes, Reed–Solomon codes, disk arrays.
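
The abstract does not spell out the parity equations, so the following is only a minimal sketch of the encoding step as EVENODD is commonly described: for a prime p, the data occupy a (p-1) x p array of symbols (one column per data disk), an imaginary all-zero row p-1 is assumed, and two parity columns are computed with XOR alone. The function name `evenodd_encode`, the list-of-lists layout, and the use of Python integers as symbols are assumptions made for illustration, not taken from the paper.

```python
from functools import reduce
from operator import xor


def evenodd_encode(data, p):
    """Return (horizontal, diagonal) parity columns for a (p-1) x p data array.

    data[i][j] is the symbol in row i of data disk j, given as an int.
    p is assumed to be prime; this sketch does not check primality.
    """
    rows = p - 1
    if len(data) != rows or any(len(row) != p for row in data):
        raise ValueError("data must have p-1 rows and p columns")

    def cell(i, j):
        # Row p-1 is imaginary and treated as all zeros.
        return 0 if i == p - 1 else data[i][j]

    # First redundant disk: plain row parity, as in RAID-5.
    horizontal = [reduce(xor, data[l]) for l in range(rows)]

    # Adjuster S: parity of the diagonal passing through the imaginary row,
    # i.e. cells (p-1-t, t) for t = 1 .. p-1.
    s = 0
    for t in range(1, p):
        s ^= cell(p - 1 - t, t)

    # Second redundant disk: parity of each wrapped diagonal, XORed with S.
    diagonal = []
    for l in range(rows):
        d = s
        for t in range(p):
            d ^= cell((l - t) % p, t)
        diagonal.append(d)

    return horizontal, diagonal
```

For instance, with p = 3 and a single nonzero bit in the top-left cell, `evenodd_encode([[1, 0, 0], [0, 0, 0]], 3)` returns `([1, 0], [1, 0])`. Recovery from two erased columns is not shown here; it would involve only XOR operations over the surviving columns, consistent with the abstract's claim that parity hardware suffices.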


Citation:
Mario Blaum, Jim Brady, Jehoshua Bruck, Jai Menon, "EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures," IEEE Transactions on Computers, vol. 44, no. 2, pp. 192-202, Feb. 1995, doi:10.1109/12.364531