Issue No. 05 - May (2007 vol. 18)
Darin England, Department of Computer Science and Engineering, University of Minnesota, Twin Cities, Minneapolis, MN
Bharadwaj Veeravalli, Department of Electrical and Computer Engineering, National University of Singapore, Singapore
Jon B. Weissman, Department of Computer Science and Engineering, University of Minnesota, Twin Cities, Minneapolis, MN
Large-scale distributed applications are subject to frequent disruptions due to resource contention and failure. Such disruptions are inherently unpredictable, so robustness is a desirable property for the distributed operating environment. In this work, we describe and evaluate a robust topology for applications that operate on a spanning tree overlay network. Unlike previous work that is adaptive or reactive in nature, we take a proactive approach to robustness. The topology itself is able to simultaneously withstand disturbances and exhibit good performance. We present both centralized and distributed algorithms to construct the topology, and then demonstrate its effectiveness through analysis and simulation of two classes of distributed applications: data collection in sensor networks and data dissemination in divisible load scheduling. The results show that our robust spanning trees achieve a desirable trade-off between two opposing metrics where traditional forms of spanning trees do not. In particular, the trees generated by our algorithms exhibit both resilience to data loss and low power consumption for sensor networks. When used as the overlay network for divisible load scheduling, they display both robustness to link congestion and low values for the makespan of the schedule.
Robustness, Network topology, Large-scale systems, Distributed algorithms, Algorithm design and analysis, Analytical models, Power generation, Resilience, Energy consumption, Displays
D. England, B. Veeravalli and J. B. Weissman, "A Robust Spanning Tree Topology for Data Collection and Dissemination in Distributed Environments," in IEEE Transactions on Parallel & Distributed Systems, vol. 18, no. 5, pp. 608-620, 2007.