Establishment of Isolated Failure Immune Real-Time Channels in HARTS
February 1995 (vol. 6 no. 2)
pp. 113-119

Abstract—Fault-tolerant, real-time communication in distributed systems is very important yet difficult to achieve. Traditional protocols like TCP/IP achieve reliable communication through acknowledgment and retransmission schemes, which gain reliability at the cost of performance. In this paper, we discuss how both the timeliness and the fault tolerance of communication can be achieved by using the concept of a real-time channel [1] and exploiting the inherent spatial redundancy of a given network topology. Specifically, we show how isolated failure immune real-time channels can be established in wrapped hexagonal mesh networks, thus ensuring timely delivery of messages in the presence of network component failures as long as the failures are isolated. This kind of fault tolerance cannot be achieved with other commonly known topologies like rings, rectangular meshes, and hypercubes. The proposed approach is to be implemented in an experimental distributed real-time system, called HARTS [2], whose construction is underway.

Index Terms—Distributed computing systems, fault-tolerant real-time communications, wrapped hexagonal mesh, isolated failure immune networks, real-time channels.

[1] D. Ferrari and D. C. Verma, “A scheme for real-time channel establishment in wide-area networks,” IEEE J. Select. Areas Commun., vol. SAC-8, pp. 368–379, Apr. 1990.
[2] K. G. Shin, “HARTS: A distributed real-time architecture,” IEEE Comput., vol. 24, pp. 25–35, May 1991.
[3] Q. Zheng and K. G. Shin, “On the ability of establishing real-time channels in point-to-point packet-switched networks,” IEEE Trans. Commun., vol. 42, pp. 1096–1105, Mar. 1994.
[4] Q. Zheng, “Real-time fault-tolerant communication in computer networks,” PhD thesis, Univ. of Michigan, 1993. PostScript version of this thesis is available via anonymous ftp from in directory people/zheng.
[5] Q. Zheng and K. G. Shin, “Fault-tolerant real-time communication in distributed computing systems,” in Proc. 22nd Annual Int. Symp. Fault-tolerant Comput., pp. 86–93, 1992.
[6] M.-S. Chen, K. G. Shin, and D. D. Kandlur, “Addressing, routing and broadcasting in hexagonal mesh multiprocessors,” IEEE Trans. Comput., vol. 39, no. 1, pp. 10–18, Jan. 1990.
[7] A. M. Farley, “Networks immune to isolated failures,” Networks, vol. 11, pp. 255–268, 1981.
[8] D. D. Kandlur and K. G. Shin, “A communication subsystem for HARTS: An experimental distributed real-time system,” submitted for publication.
[9] D. D. Kandlur, K. G. Shin, and D. Ferrari, “Real-time communication in multi-hop networks,” in Proc. Int. Conf. Distrib. Comput. Syst., pp. 300–307, May 1991.
[10] K. Shin, D. Kandlur, D. Kiskis, P. Dodd, H. Rosenberg, and A. Indiresan, “A distributed real-time operating system,” IEEE Software, pp. 58–68, Sept. 1992.

Qin Zheng, Kang G. Shin, "Establishment of Isolated Failure Immune Real-Time Channels in HARTS," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 2, pp. 113-119, Feb. 1995, doi:10.1109/71.342122