Pages: pp. 97-98
The last decade has seen a dramatic increase in the deployment of heterogeneous distributed computing platforms, in particular those consisting of heterogeneous clusters and of multiple heterogeneous collections of clusters aggregated over wide-area networks into grids. The software infrastructures and mechanisms to deploy such platforms have been well studied and implementations are already used in production, so that heterogeneous platforms represent a significant, and growing, fraction of the computational power delivered by parallel platforms today. In spite of these successes, many research challenges remain, including those pertaining to distributed algorithms and scheduling algorithms, which are critical for ensuring that these platforms are used effectively. In this context, the goal of this special section on "Algorithm Design and Scheduling Techniques (Realistic Platform Models) for Heterogeneous Clusters" is to gather papers that further our understanding of the impact of platform heterogeneity on the design and evaluation of such algorithms.
In the paper entitled "Allocating Non-Real-Time and Soft Real-Time Jobs in Multiclusters," Ligang He, Stephen A. Jarvis, Daniel P. Spooner, Hong Jiang, Donna N. Dillenberger, and Graham R. Nudd introduce two workload allocation strategies for large-scale heterogeneous platforms. The first strategy achieves an optimized mean response time for jobs having no real-time requirements. The second strategy obtains an optimized mean miss rate for jobs having soft real-time requirements (i.e., a fraction of jobs are permitted to miss the real-time constraints). Both strategies take into account average system behaviors (such as the mean arrival rate of jobs) to calculate the workload proportions for individual clusters, and update the workload allocation on the fly when the change in the mean arrival rate reaches a certain threshold. The allocation schemes are combined with two job dispatching strategies (weighted random and weighted round-robin) to generate new job scheduling algorithms for multicluster environments.
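To make the two dispatching strategies concrete, the following is a minimal sketch, not the authors' implementation. It assumes the per-cluster workload proportions have already been computed and are supplied as integer weights; the cluster names, the weights, and both function names are hypothetical. The round-robin variant uses the well-known "smooth" weighted round-robin rule, which is one of several ways to realize weighted round-robin.

```python
import random
from collections import Counter


def weighted_random_dispatch(weights, n_jobs, rng):
    """Send each job to a cluster drawn at random in proportion to its weight."""
    clusters = list(weights)
    return rng.choices(clusters, weights=[weights[c] for c in clusters], k=n_jobs)


def weighted_round_robin_dispatch(weights, n_jobs):
    """Deterministic smooth weighted round-robin over integer weights.

    Each step adds every cluster's weight to its running score, picks the
    cluster with the highest score, and subtracts the total weight from it.
    """
    total = sum(weights.values())
    score = {c: 0 for c in weights}
    order = []
    for _ in range(n_jobs):
        for c in weights:
            score[c] += weights[c]
        chosen = max(score, key=score.get)
        score[chosen] -= total
        order.append(chosen)
    return order


# Hypothetical proportions: cluster A should receive half of the jobs.
weights = {"A": 5, "B": 3, "C": 2}
random_order = weighted_random_dispatch(weights, 10, random.Random(0))
rr_order = weighted_round_robin_dispatch(weights, 10)
rr_counts = Counter(rr_order)
```

Over a window of ten jobs, the round-robin variant matches the proportions exactly, whereas the random variant matches them only in expectation.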
In their paper "On the Distribution of Sequential Jobs in Random Brokering for Heterogeneous Computational Grids," Vandy Berten, Joel Goossens, and Emmanuel Jeannot study resource brokering for scheduling sequential jobs onto a grid platform that consists of heterogeneous sets of homogeneous processors, such as a set of clusters. Resources in each cluster are managed by a local scheduler that maintains a job queue. The paper studies a centralized "metascheduler" that uses a randomized strategy to share available resources among competing jobs. This research considers two cases depending on whether the platform is heavily loaded or lightly loaded. For each case, it obtains both analytical and experimental characterizations of the queue lengths at each local scheduler, CPU utilization, and average job slowdowns. Furthermore, the paper presents a discussion of the system's behavior when it transitions between a heavily loaded state and a lightly loaded one. All presented theoretical results are corroborated by simulations and provide a thorough description of randomized resource brokering.
The research in "Multiple Job Scheduling in a Connection-Limited Data Parallel System" presents a new method for scheduling jobs in a distributed system where the critical resource is the bandwidth to access the stored data. The authors, Alessandro Amoroso and Keith Marzullo, describe an approach that supports the master-worker scheme and can be applied to data parallel computation. They consider a typical wide-area data grid that comprises a set of sites, where each site has one or more local area networks. The platform model used is based on the Nile data grid. The paper uses a set of synthetic jobs to compare three schedulers: Greedy, Maxflow, and Hybrid. The authors tested their new approach under various circumstances and measured its performance by means of several metrics. The new Hybrid scheduler is never worse than either of the other two schedulers, and in 20 percent of the simulated runs, its results were at least 20 percent better.
The paper entitled "Capacity-Aware Multicast Algorithms on Heterogeneous Overlay Networks," coauthored by Zhan Zhang, Shigang Chen, Yibei Ling, and Randy Chow, addresses the problem of multicast for group communication among a distributed, dynamic set of heterogeneous nodes. The paper proposes two capacity-aware overlay multicast services that focus on host heterogeneity, any-source multicast, dynamic membership, and scalability. Capacity is modeled as the maximum number of direct children to which a node is willing to forward multicast messages. The target applications considered are multisource environments, such as distributed games, teleconferencing, and virtual classrooms. The authors extend Chord and Koorde to be capacity-aware, embed implicit degree-varying multicast trees on top of the overlay network, and develop multicast routines that automatically follow the trees to disseminate multicast messages. They analyze the expected performance of the proposed multicasting schemes and perform simulations. The simulations show that the two methods achieve their best performance under different conditions, depending on membership change frequencies and node capacities.
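The capacity model can be illustrated with a small sketch. This is not the Chord- or Koorde-based construction of the paper. It is a plain breadth-first tree builder, written under the sole assumption that each node accepts at most its stated capacity of direct children. The node names, the capacities, and the function name are hypothetical.

```python
from collections import deque


def build_capacity_tree(source, members, capacity):
    """Build a multicast tree in which no node exceeds its child capacity.

    Members are attached breadth-first: the source fills its child slots
    first, then each newly attached node fills its own slots in turn.
    Returns (children, unreached), where unreached lists the members that
    could not be attached because every slot was already used.
    """
    children = {node: [] for node in [source, *members]}
    pending = deque(m for m in members if m != source)
    frontier = deque([source])
    while pending and frontier:
        parent = frontier.popleft()
        for _ in range(capacity[parent]):
            if not pending:
                break
            child = pending.popleft()
            children[parent].append(child)
            frontier.append(child)
    return children, list(pending)


# Hypothetical group: "s" is the multicast source; "b" is a high-capacity host.
capacity = {"s": 2, "a": 1, "b": 3, "c": 1, "d": 1, "e": 1}
tree, unreached = build_capacity_tree("s", ["a", "b", "c", "d", "e"], capacity)
```

In this sketch, a higher-capacity host such as "b" ends up with more direct children than "a", which is the sense in which the resulting tree is degree-varying.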
We are very grateful to all who have helped in bringing about this special section. Twenty-one papers were submitted, of which only four were accepted. We thank the authors of all submitted papers for their interest in this special section, as well as the reviewers for their insightful comments and recommendations. We wish to acknowledge the excellent job of Ms. Suzanne Werner and Ms. Jennifer Carruth for helping manage the entire submission, review, and publication process. Finally, we thank Dr. Pen-Chung Yew, the former Editor-in-Chief of IEEE Transactions on Parallel and Distributed Systems, who originally envisioned a special issue on algorithms and scheduling, and who proposed the idea at the TPDS editorial meeting held during IPDPS 2004 in Santa Fe, New Mexico.