Issue No. 12 - December (2004 vol. 30)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TSE.2004.106
Anand Tripathi, IEEE
This paper presents an abstraction called the guardian for exception handling in distributed and concurrent systems that use coordinated exception handling. This model addresses two fundamental problems with distributed exception handling in a group of asynchronous processes. The first is to perform recovery when multiple exceptions are concurrently signaled. The second is to determine the correct context in which a process should execute its exception handling actions. Several schemes have been proposed in the past to address these problems. These schemes structure a distributed program as atomic actions, based on conversations or transactions, and resolve multiple concurrent exceptions into a single one. The guardian in a distributed program represents the abstraction of a global exception handler, which encapsulates rules for handling concurrent exceptions and directing each process to the semantically correct context for executing its recovery actions. Its programming primitives and the underlying distributed execution model are presented here. In contrast to the existing approaches, this model is more basic and can be used to implement or enhance the existing schemes. Using several examples, we illustrate the capabilities of this model. Finally, its advantages and limitations are discussed in contrast to existing approaches.
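The two problems named in the abstract, resolving concurrently signaled exceptions and choosing the context in which each process recovers, can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the abstract does not list the paper's primitives, so `Guardian`, `Participant`, `signal`, `resolve`, the resolution rule, and the exception and context names are all hypothetical and are not the authors' API. Real participants would be asynchronous, distributed processes rather than in-memory objects.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """A simulated process with a stack of named handling contexts."""
    name: str
    contexts: list = field(default_factory=list)  # innermost context is last

    def enter(self, context):
        self.contexts.append(context)


class Guardian:
    """Hypothetical global handler (not the paper's API).

    It collects concurrently signaled exceptions, applies one resolution
    rule, and directs each participant to a target context.
    """

    def __init__(self, rule):
        self.rule = rule          # maps a frozenset of exception names to one name
        self.participants = []
        self.pending = set()

    def add(self, participant):
        self.participants.append(participant)

    def signal(self, exception):
        self.pending.add(exception)

    def resolve(self, target_context):
        """Resolve pending exceptions, then unwind participants to target_context."""
        resolved = self.rule(frozenset(self.pending))
        self.pending.clear()
        directives = {}
        for p in self.participants:
            # Participants that never entered the target context are skipped.
            if target_context in p.contexts:
                # Discard contexts nested inside the target context.
                while p.contexts[-1] != target_context:
                    p.contexts.pop()
                directives[p.name] = (resolved, target_context)
        return directives


def example_rule(exceptions):
    """Assumed rule: one exception passes through, several become GroupFailure."""
    if len(exceptions) == 1:
        return next(iter(exceptions))
    return "GroupFailure"


def demo():
    guardian = Guardian(example_rule)
    a, b = Participant("A"), Participant("B")
    for p in (a, b):
        guardian.add(p)
        p.enter("session")
    a.enter("transfer")            # A is nested one level deeper than B
    guardian.signal("LinkDown")    # two exceptions signaled before resolution
    guardian.signal("DiskFull")
    return guardian.resolve("session"), a.contexts, b.contexts
```

In this sketch both participants receive the same resolved exception and the same target context, even though one of them was executing in a more deeply nested context when the exceptions were signaled.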
Concurrent programming, distributed programming, fault tolerance.
R. Miller and A. Tripathi, "The Guardian Model and Primitives for Exception Handling in Distributed Systems," in IEEE Transactions on Software Engineering, vol. 30, no. 12, pp. 1008-1022, 2004.