Parallel Algorithms / Architecture Synthesis, AIZU International Symposium on (1995)

Aizu-Wakamatsu, Fukushima, Japan

Mar. 15, 1995 to Mar. 17, 1995

ISBN: 0-8186-7038-X

pp: 206

C.P. Chiu, Dept. of Comput. Sci., Chung-Hua Polytech. Inst., Hsin Chu, Taiwan

Kun-Ming Yu, Dept. of Comput. Sci., Chung-Hua Polytech. Inst., Hsin Chu, Taiwan

Tong-Ying Juang, Dept. of Comput. Sci., Chung-Hua Polytech. Inst., Hsin Chu, Taiwan

ABSTRACT

Recovering from processor failures is an important problem in the design and development of reliable systems. We present a concurrent rollback algorithm for extended hypercube networks that recovers from crash failures with small message and time complexities. An extended hypercube network is a hierarchical, low-diameter, recursive structure. By appending only O(1) additional information to each message, recovery uses fewer than O(N log N) message exchanges and O(log^2 N) elapsed time, where N is the number of processors in the extended hypercube network. The algorithm can be used to recover from the failure of an arbitrary number of processors.

INDEX TERMS

communication complexity; computational complexity; system recovery; hypercube networks; fault tolerant computing; parallel algorithms; crash recovery; extended hypercube networks; processor failures; reliable systems; concurrent rollback algorithm; crash failures; small message complexity; small time complexity; hierarchical low diameter recursive structure; message exchanges

CITATION

C.P. Chiu, Kun-Ming Yu, Tong-Ying Juang, "Concurrent rollback for crash recovery in extended hypercube networks", *Parallel Algorithms / Architecture Synthesis, AIZU International Symposium on*, pp. 206, 1995, doi:10.1109/AISPAS.1995.401336