11th Pacific Rim International Symposium on Dependable Computing (PRDC'05)
Research on Architecture and Design Principles of COTS Components Based Generic Fault-Tolerant Computer
Changsha, Hunan, China
December 12-14, 2005
ISBN: 0-7695-2492-3
Yuan Youguang, Wuhan Digital Engineering Institute, Luoyu Road 718#, Wuhan, PRC
Zhao Xiaoyong, Center of Software Engineering, Three Gorges University, Yichang, PRC
This paper proposes and implements a novel fault-tolerant architecture based on COTS components. To make the internal states of COTS components observable, to perform fault-tolerance functions concurrently with normal functions, and to control the behavior of each COTS component, the authors have devised an intelligent hardware module dedicated to fault-tolerance processing, which can significantly offload the application processors. The architecture exploits every inherent fault-detection mechanism and adopts a layered fault-protection mechanism to raise fault-tolerance coverage. It is efficient, flexible, scalable, and transparent with respect to fault tolerance. It is Byzantine fault safe and also supports online repair. The authors also discuss some tradeoffs that arise when designing a fault-tolerant computer based on COTS components.
Citation:
Ou Zhonghong, Yuan Youguang, Zhao Xiaoyong, "Research on Architecture and Design Principles of COTS Components Based Generic Fault-Tolerant Computer," 11th Pacific Rim International Symposium on Dependable Computing (PRDC'05), pp. 227-234, 2005