3rd Euromicro Workshop on Parallel and Distributed Processing
Algorithm-based fault-tolerant programming in scientific computation on multiprocessors
San Remo, Italy
January 25-January 27
ISBN: 0-8186-7031-2
J. Altmann, IMMD III, Erlangen-Nurnberg Univ., Germany
A. Bohm, IMMD III, Erlangen-Nurnberg Univ., Germany
Efficient parallel algorithms proposed for many fundamental problems in scientific computation are sensitive to processor failures. Because of its low cost, algorithm-based fault tolerance is an attractive way to introduce fault tolerance into existing multiprocessors. To facilitate fault-tolerant programming in scientific computation, we have modified and extended an existing parallel run-time environment. This paper primarily examines how known error processing techniques can be tuned to the algorithm-based approach. We also study design issues for the implementation and the execution time overhead of a fault-tolerant application in our run-time environment. In contrast to many other environments for parallel fault-tolerant programming, which use the master/slave programming model, our environment makes it possible to add fault tolerance to existing parallel applications in scientific computation.
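The abstract does not say which algorithm-based scheme the authors implemented. As a general illustration of the concept only, the sketch below uses the classic checksum encoding for matrix multiplication: checksums are carried through the computation so that a single corrupted result entry can be located afterwards. All function names here are hypothetical, and the code runs sequentially; it is not taken from the paper's run-time environment.

```python
def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def with_column_checksum(a):
    """Append one extra row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Append to each row of b the sum of that row's entries."""
    return [row + [sum(row)] for row in b]

def find_error(c_full, tol=1e-9):
    """Return (row, col) of a single corrupted data entry, or None.

    c_full is a product matrix whose last row and last column are
    checksums. Assumption: at most one data entry is faulty.
    """
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n)
                if abs(sum(c_full[i][:m]) - c_full[i][m]) > tol]
    bad_cols = [j for j in range(m)
                if abs(sum(c_full[i][j] for i in range(n)) - c_full[n][j]) > tol]
    if bad_rows and bad_cols:
        return bad_rows[0], bad_cols[0]
    return None

# Illustrative data, not from the paper.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
```

Because the checksum row and column are computed by the same multiplication as the data, a mismatch in exactly one row sum and one column sum pinpoints the faulty entry, which can then be recomputed from the row checksum. In a multiprocessor setting the checks would run on a different processor than the one that produced the data, which is outside the scope of this sketch.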
Index Terms:
parallel algorithms; multiprocessing systems; software fault tolerance; parallel programming; programming environments; algorithm-based fault-tolerant programming; scientific computation; multiprocessors; parallel run-time environment; error processing techniques; execution time overhead; master/slave programming model
Citation:
J. Altmann, A. Bohm, "Algorithm-based fault-tolerant programming in scientific computation on multiprocessors," 3rd Euromicro Workshop on Parallel and Distributed Processing (PDP), p. 374, 1995