2006 15th IEEE International Conference on High Performance Distributed Computing
Fault Tolerance of Tornado Codes for Archival Storage
Paris, June 19-June 23
ISBN: 1-4244-0307-3
This paper examines a class of low density parity check (LDPC) erasure codes called Tornado codes for applications in archival storage systems. The fault tolerance of Tornado code graphs is analyzed, and it is shown that worst-case failure scenarios in small (96-node) graphs can be identified and mitigated by using simulations to find and eliminate critical node sets that can cause Tornado codes to fail even when almost all blocks are present. The resulting graph construction procedure is then used to construct a 96-device Tornado code storage system that has capacity overhead equivalent to RAID 10 and tolerates any 4 device failures. This system is demonstrated to be superior to other parity-based RAID systems. Finally, the paper describes how a geographically distributed data stewarding system can be enhanced by using cooperatively selected Tornado code graphs to obtain fault tolerance exceeding that of its constituent storage sites or site replication strategies.
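The idea of searching a code graph for critical node sets can be illustrated with a minimal sketch. It is not the authors' simulator: the 8-node toy graph, the function names `peel` and `critical_sets`, and the exhaustive search over erasure sets are all assumptions made for illustration. Each parity node is taken to be the XOR of its source nodes, so a lost node can be rebuilt whenever it is the only missing member of some parity group.

```python
from itertools import combinations


def peel(checks, erased):
    """Iteratively recover erased nodes; return the nodes still missing.

    checks maps each parity node to the list of nodes it is the XOR of.
    A parity group (the parity node plus its sources) can rebuild a node
    only when that node is the group's single missing member.
    """
    missing = set(erased)
    progress = True
    while progress and missing:
        progress = False
        for parity, sources in checks.items():
            lost = [v for v in (parity, *sources) if v in missing]
            if len(lost) == 1:
                missing.discard(lost[0])
                progress = True
    return missing


def critical_sets(checks, nodes, data, k):
    """Return every k-node erasure set that leaves some data node lost."""
    return [
        combo
        for combo in combinations(nodes, k)
        if peel(checks, combo) & data
    ]


# Hypothetical toy graph: data nodes 0-3, parity nodes 4-7.
CHECKS = {4: [0, 1], 5: [1, 2], 6: [2, 3], 7: [0, 3]}
NODES = range(8)
DATA = {0, 1, 2, 3}

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, critical_sets(CHECKS, NODES, DATA, k))
```

In this toy graph no set of one or two erasures loses data, while four sets of three erasures do; each consists of a data node together with both of its parity nodes. A graph construction procedure of the kind the abstract describes would adjust or reject graphs whose smallest critical sets are too small. An exhaustive search like this one is only practical for small graphs and small k: for 96 nodes and k = 4 there are 3,321,960 candidate sets.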
Index Terms:
distributed data stewarding system, fault tolerance, Tornado code graph, low density parity check, LDPC erasure code, archival storage system, parity-based RAID system
Citation:
M. Woitaszek, H.M. Tufo, "Fault Tolerance of Tornado Codes for Archival Storage," hpdc, pp. 83-92, 2006 15th IEEE International Conference on High Performance Distributed Computing, 2006