Issue No. 04 - April (1986 vol. 12)
N. Natarajan, Department of Computer Science, The Pennsylvania State University, University Park, PA 16802
A distributed system is an interconnected network of computing elements or nodes, each of which has its own storage. A distributed program is a collection of processes that execute asynchronously, possibly on different nodes of a distributed system, and communicate with each other to realize a common goal. In such an environment, a group of processes may sometimes become involved in a communication deadlock. This is a situation in which each member process of the group is waiting for some member to communicate with it, but no member is attempting communication with it. In this paper, we present an algorithm for detecting such communication deadlocks. The algorithm is distributed, i.e., processes detect deadlocks during the course of their communication, without the aid of a central controller. The detection scheme does not presume any a priori structure among processes, and detection is made “on the fly” without freezing normal activities. The scheme does not require any storage whose size is determined by the size of the network, and hence is also suitable for an environment where processes are created dynamically.
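The definition of a communication deadlock given above can be made concrete with a small sketch. This is not the paper's algorithm, which is distributed and needs no global view. It is a centralized check of the definition against a hypothetical global snapshot, assuming no messages are in transit. The names `WaitState`, `is_communication_deadlock`, and `deadlocked_processes` are invented for illustration.

```python
from typing import Dict, Optional, Set

# Hypothetical snapshot (an assumption, not from the paper): each process maps
# to the set of peers it is blocked waiting on, or None if it is running and
# may therefore still attempt communication.
WaitState = Dict[str, Optional[Set[str]]]


def is_communication_deadlock(group: Set[str], waits: WaitState) -> bool:
    """Check the abstract's definition for one group of processes."""
    if not group:
        return False
    for p in group:
        waiting_on = waits.get(p)
        # None (running) or an empty wait set: p is not waiting on a member.
        if not waiting_on:
            return False
        # Every peer that could wake p must itself be in the blocked group.
        if not waiting_on <= group:
            return False
    return True


def deadlocked_processes(waits: WaitState) -> Set[str]:
    """Largest set of processes that satisfies the definition (may be empty)."""
    candidates = {p for p, w in waits.items() if w}
    changed = True
    while changed:
        changed = False
        for p in list(candidates):
            # p can be woken by a process outside the candidate set.
            if not waits[p] <= candidates:
                candidates.discard(p)
                changed = True
    return candidates


snapshot: WaitState = {
    "A": {"B"}, "B": {"C"}, "C": {"A"},  # a cycle of blocked processes
    "D": {"A"},                          # blocked on a member of the cycle
    "E": None,                           # running
    "F": {"E"},                          # blocked, but E may still send to it
}
print(is_communication_deadlock({"A", "B", "C"}, snapshot))
print(sorted(deadlocked_processes(snapshot)))
```

In this snapshot, `F` is not deadlocked because `E` is still running and may communicate with it, whereas `D` is: the only process that could wake it is itself permanently blocked. A check of this kind needs storage proportional to the number of processes, which is exactly the cost the abstract says the distributed scheme avoids.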
System recovery, Synchronization, Detectors, Kernel, Detection algorithms, Educational institutions, distributed system, Communication, computing agents, deadlock, distributed program
N. Natarajan, "A distributed scheme for detecting communication deadlocks", IEEE Transactions on Software Engineering, vol. 12, no. 4, pp. 531-537, April 1986, doi:10.1109/TSE.1986.6312900