Distributed Reconfiguration Strategies for Fault-Tolerant Multiprocessor Systems
August 1982 (vol. 31 no. 8)
pp. 771-784
E.M. Clarke, Center for Research in Computing Technology, Harvard University
In this paper, we investigate strategies for dynamically reconfiguring shared memory multiprocessor systems that are subject to common memory faults and unpredictable processor deaths. These strategies aim at determining a communication page, i.e., a page of common memory that can be used by a group of processors for storing crucial common resources such as global locks for synchronization and global data structures for voting algorithms. To ensure system reliability, the reconfiguration strategies must be distributed so that each processor independently arrives at exactly the same choice. This type of reconfiguration strategy is currently used in the STAGE operating system on the PLURIBUS multiprocessor [5]. We analyze the weak points of the PLURIBUS algorithm and examine alternative strategies satisfying optimization criteria such as maximization of the number of processors and the number of common memory pages in the reconfigured system. We also present a general distributed algorithm which enables the processors in such a system to exchange the local information that is needed to reach a consensus on system reconfiguration.
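The requirement that each processor independently arrive at exactly the same choice can be illustrated with a small sketch. This is not the PLURIBUS algorithm or any strategy from the paper. It assumes, hypothetically, that the processors have already exchanged their local views (which common memory pages each one can reach) and that every processor then applies the same deterministic rule to the same data: pick the page reachable by the most processors and break ties by the lowest page number. The function name, the data layout, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def choose_communication_page(
    views: Dict[int, Set[int]],
) -> Optional[Tuple[int, Set[int]]]:
    """Pick a communication page from exchanged local views.

    `views` maps a processor id to the set of page numbers that the
    processor reports as reachable (hypothetical layout, not the paper's).
    Returns (page, processors that can use it), or None if no page is
    reachable by anyone.
    """
    # Invert the views: for each page, collect the processors that reach it.
    reach: Dict[int, Set[int]] = {}
    for proc, pages in views.items():
        for page in pages:
            reach.setdefault(page, set()).add(proc)

    if not reach:
        return None

    # Deterministic rule (an assumption): maximize the number of processors,
    # then prefer the lowest page number. The result depends only on the
    # contents of `views`, not on iteration order, so every processor that
    # holds the same views computes the same answer.
    best = min(reach, key=lambda page: (-len(reach[page]), page))
    return best, reach[best]


# Example: four processors, three pages; pages 2 and 3 tie on processor count.
views = {0: {1, 2}, 1: {2, 3}, 2: {2, 3}, 3: {3}}
result = choose_communication_page(views)
```

Because the rule is a pure function of the exchanged views, agreement reduces to making sure every surviving processor holds the same views, which is the role the abstract assigns to the information-exchange algorithm.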
Index Terms:
reconfiguration strategies, communication page, fault-tolerance, multiprocessor systems
Citation:
E.M. Clarke, C.N. Nikolaou, "Distributed Reconfiguration Strategies for Fault-Tolerant Multiprocessor Systems," IEEE Transactions on Computers, vol. 31, no. 8, pp. 771-784, Aug. 1982, doi:10.1109/TC.1982.1676083