2007 International Conference on Parallel Processing Workshops (ICPPW 2007)
A Similar Resource Auto-Discovery Based Adaptive Fault-tolerance Method for Embedded Distributed System
Xi'an, China
September 10-September 14
ISBN: 0-7695-2934-8
Kailong Zhang, Northwestern Polytechnical University, China; Shaanxi Key Embedded System Technology Laboratory, China
Ke Liang, Northwestern Polytechnical University, China; Shaanxi Key Embedded System Technology Laboratory, China
Xingshe Zhou, Northwestern Polytechnical University, China; Shaanxi Key Embedded System Technology Laboratory, China
Kaibo Wang, Northwestern Polytechnical University, China; Shaanxi Key Embedded System Technology Laboratory, China
Xiao Wu, Northwestern Polytechnical University, China; Shaanxi Key Embedded System Technology Laboratory, China
Zhiyi Yang, Northwestern Polytechnical University, China; Shaanxi Key Embedded System Technology Laboratory, China
Because Embedded Distributed Systems (EDS) are resource-constrained yet demand high reliability, new fault-tolerance techniques that differ from traditional hardware-redundancy approaches need to be studied. In this article, a fault-tolerance method based on similar resources, together with related technologies, is proposed and discussed. First, mathematical models of several key elements, such as computing nodes, similar nodes, and tasks, are constructed. Then, similarity computation methods and evaluation criteria are presented from two different views: tasks and resources. Building on this theory, several methods are described in turn: similar nodes auto-discovery (SNAD) and its optimized variant (oSNAD), redundant tasks auto-deployment, and reconfiguration policies for faulty tasks and nodes. Simulation results show that these approaches and schemes can improve the adaptive fault-tolerance abilities of complicated embedded distributed systems.
Citation:
Kailong Zhang, Ke Liang, Xingshe Zhou, Kaibo Wang, Xiao Wu, Zhiyi Yang, "A Similar Resource Auto-Discovery Based Adaptive Fault-tolerance Method for Embedded Distributed System," icppw, pp.21, 2007 International Conference on Parallel Processing Workshops (ICPPW 2007), 2007