Parallel Algorithms / Architecture Synthesis, AIZU International Symposium on (1995)
Aizu-Wakamatsu, Fukushima, Japan
Mar. 15, 1995 to Mar. 17, 1995
Tong-Ying Juang, Dept. of Comput. Sci., Chung-Hua Polytech. Inst., Hsin Chu, Taiwan
C.P. Chiu, Dept. of Comput. Sci., Chung-Hua Polytech. Inst., Hsin Chu, Taiwan
Kun-Ming Yu, Dept. of Comput. Sci., Chung-Hua Polytech. Inst., Hsin Chu, Taiwan
Recovering from processor failures is an important problem in the design and development of reliable systems. We present a concurrent rollback algorithm for extended hypercube networks that recovers from crash failures with small message and time complexities. An extended hypercube network is a hierarchical, low-diameter, recursive structure. By appending only O(1) additional information to each message, recovery uses fewer than O(N log N) message exchanges and O(log^2 N) time, where N is the number of processors in the extended hypercube network. The algorithms can be used to recover from the failure of an arbitrary number of processors.
communication complexity; computational complexity; system recovery; hypercube networks; fault tolerant computing; parallel algorithms; crash recovery; extended hypercube networks; processor failures; reliable systems; concurrent rollback algorithm; crash failures; small message complexity; small time complexity; hierarchical low diameter recursive structure; message exchanges
C. Chiu, K. Yu and T. Juang, "Concurrent rollback for crash recovery in extended hypercube networks," Parallel Algorithms / Architecture Synthesis, AIZU International Symposium on (PAS), Aizu-Wakamatsu, Fukushima, Japan, 1995, pp. 206.