Issue No. 08 - August (2011 vol. 22)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2010.201
Jinoh Kim, University of Minnesota, Minneapolis
Abhishek Chandra, University of Minnesota, Minneapolis
Jon B. Weissman, University of Minnesota, Minneapolis
Distributed computing applications are increasingly utilizing distributed data sources. However, the unpredictable cost of data access in large-scale computing infrastructures can lead to severe performance bottlenecks. Predictable data access is therefore essential to accommodate the growing set of large-scale, data-intensive computing applications. In this regard, accurate estimation of network performance is crucial to meeting the performance goals of such applications. Passive estimation based on past measurements is attractive because its overhead is relatively small compared to explicit probing. In this paper, we take a passive approach to network performance estimation. Our approach differs from existing passive techniques, which rely either on past direct measurements between pairs of nodes or on topological similarities. Instead, we exploit secondhand measurements collected by other nodes, without any topological restrictions. We present Overlay Passive Estimation of Network performance (OPEN), a scalable framework that provides end-to-end network performance estimation based on secondhand measurements, and discuss how OPEN achieves cost-effective estimation in a large-scale infrastructure. Our extensive experimental results show that OPEN estimation is applicable to replica selection and resource selection, both commonly used in distributed computing.
Network performance estimation, secondhand estimation, data-intensive computing, replica selection, resource selection.
J. Kim, A. Chandra and J. B. Weissman, "Passive Network Performance Estimation for Large-Scale, Data-Intensive Computing," in IEEE Transactions on Parallel & Distributed Systems, vol. 22, no. 8, pp. 1365-1373, 2011.