2001 International Conference on Parallel Processing Workshops (ICPPW'01)
Fault Tolerance in the WebCom Metacomputer
Valencia, Spain
September 03-September 07
ISBN: 0-7695-1260-7
Abstract: This paper addresses fault tolerance in the WebCom metacomputer. WebCom's computation platform is dynamically reconfigurable and volunteer-based. Since its constituent machines may join and leave unpredictably, fault survival and efficient fault recovery are of paramount importance. A fault tolerance mechanism is outlined that relies on a fast and efficient processor replacement procedure. It is shown that the characteristics of this procedure, together with the hierarchical and referentially transparent nature of WebCom executions, can be used to limit the effect of a fault to its immediate neighbourhood.
Index Terms:
Fault Tolerance, Condensed Graphs, Metacomputing, Distributed Computing, WebCom
Citation:
John P. Morrison, James J. Kennedy, David A. Power, "Fault Tolerance in the WebCom Metacomputer," icppw, pp.0245, 2001 International Conference on Parallel Processing Workshops (ICPPW'01), 2001