Information Dissemination in Distributed Systems with Faulty Units
June 1994 (vol. 43 no. 6)
pp. 698-710

Consider a network consisting of units connected by links in which some units could be faulty. Suppose each unit has a message that must be transmitted to all other (fault-free) units. We present an algorithm for doing this in a network operating in a fully distributed manner that requires at most 3n log n + O(n) message transmissions by fault-free units. Among other things, our result can be used to devise an algorithm for distributed system level diagnosis which is more efficient than the best currently known algorithm for this purpose.

Index Terms:
distributed algorithms; message passing; fault tolerant computing; reliability; failure analysis; reliability theory; communication complexity; information dissemination; distributed systems; faulty units; fault-free units; distributed system level diagnosis; election; spanning tree; distributed algorithm.
A. Bagchi, S.L. Hakimi, "Information Dissemination in Distributed Systems with Faulty Units," IEEE Transactions on Computers, vol. 43, no. 6, pp. 698-710, June 1994, doi:10.1109/12.286303