
This Article  
 
Share  
Bibliographic References  
Add to:  
Digg Furl Spurl Blink Simpy Del.icio.us Y!MyWeb  
Search  
 
ASCII Text  x  
Yang Yang, Krijn van der Raadt, Henri Casanova, "Multiround Algorithms for Scheduling Divisible Loads," IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 11, pp. 10921102, November, 2005.  
BibTex  x  
@article{ 10.1109/TPDS.2005.139, author = {Yang Yang and Krijn van der Raadt and Henri Casanova}, title = {Multiround Algorithms for Scheduling Divisible Loads}, journal ={IEEE Transactions on Parallel and Distributed Systems}, volume = {16}, number = {11}, issn = {10459219}, year = {2005}, pages = {10921102}, doi = {http://doi.ieeecomputersociety.org/10.1109/TPDS.2005.139}, publisher = {IEEE Computer Society}, address = {Los Alamitos, CA, USA}, }  
RefWorks Procite/RefMan/Endnote  x  
TY  JOUR JO  IEEE Transactions on Parallel and Distributed Systems TI  Multiround Algorithms for Scheduling Divisible Loads IS  11 SN  10459219 SP1092 EP1102 EPD  10921102 A1  Yang Yang, A1  Krijn van der Raadt, A1  Henri Casanova, PY  2005 KW  Parallel processing KW  scheduling KW  divisible loads KW  multiround algorithms. VL  16 JA  IEEE Transactions on Parallel and Distributed Systems ER   
Abstract—Divisible load applications occur in many fields of science and engineering and can be easily parallelized in a masterworker fashion, but pose several scheduling challenges. While a number of approaches have been proposed that allocate load to workers in a single round, using multiple rounds improves overlap of computation with communication. Unfortunately, multiround algorithms are difficult to analyze and have thus received only limited attention. In this paper, we answer three open questions in the multiround divisible load scheduling area: 1) how to account for latencies, 2) how to account for heterogeneous platforms, and 3) how many rounds should be used. To answer 1), we derive the first closedform optimal schedule for a homogeneous platform with both computation and communication latencies, for a given number of rounds. To answer 2) and 3), we present a novel algorithm, UMR. We evaluate UMR in a variety of realistic scenarios.
[1] E.G. Coffman, M.R. Garey, and D.S. Johnson, Bin Packing Approximation Algorithms: A Survey. Boston: PWS Publishing Co., 1996.
[2] M. Drozdowski and P. Wolniewicz, “Experiments with Scheduling Divisible Tasks in Clusters of Workstations,” Proc. Int'l Conf. Parallel and Distributed Computing (Europar), pp. 311319, 2000.
[3] V. Bharadwaj and S. Ranganath, “Theoretical and Experimental Study on Large Size Image Processing Applications Using Divisible Load Paradigm on Distributed Bus Networks,” Image and Vision Computing, vol. 20, nos. 1314, pp. 9171034, 2002.
[4] C.K. Lee and M. Hamdia, “Parallel Image Processing Applications on a Network of Workstations,” Parallel Computing, vol. 21, pp. 137160, 1995.
[5] G. Miller, D.G. Payne, T.N. Phung, H. Siegel, and R. Williams, “Parallel Processing of Spaceborne Imaging Radar Data,” Proc. Int'l Conf. High Performance Computing and Comm. (SC '95), 1995.
[6] Y.J. Chiang, R. Farias, C.T. Silva, and B. Wei, “A Unified Infrastructure for Parallel OutofCore Isosurface Extraction and Volume Rendering of Unstructured Grids,” Proc. IEEE Symp. Parallel and LargeData Visualization and Graphics, pp. 5966, 2001.
[7] W. Bethel, B. Tierney, J. lee, D. Gunter, and S. Lau, “Using Highspeed WANs and Network Data Caches to Enable Remote and Distributed Visualization,” Proc. Int'l Conf. High Performance Computing and Comm. (SC '00), 2000.
[8] A. Garcia and H.W. Shen, “Parallel Volume Rendering: An Interleaved Parallel Volume Renderer with PCClusters,” Proc. Fourth Eurographics Workshop Parallel Graphics and Visualization, pp. 5159, 2002.
[9] Visible Human Project, 2003, http://www.nlm.nih.gov/research/visiblevisible_human.html .
[10] V. Bharadwaj, D. Ghose, V. Mani, and T.G. Robertazzi, Scheduling Divisible Loads in Parallel and Distributed Systems. IEEE CS Press, 1996.
[11] Cluster Computing, special issue on divisible load scheduling, vol. 6, no. 1, 2003.
[12] Grid2: Blueprint for a New Computing Infrastructure, I. Foster and C. Kesselman, eds., second ed. San Francisco: Morgan Kaufmann Publishers, 2003.
[13] V. Bharadwaj, D. Ghose, and V. Mani, “Optimal Sequencing and Arrangement in SingleLevel Tree Networks with Communication Delays,” IEEE Trans. Parallel and Distributed Systems, vol. 5, no. 9, pp. 968976 Sept. 1994.
[14] V. Bharadwaj, D. Ghose, and V. Mani, “MultiInstallment Load Distribution in Tree Networks with Delays,” IEEE Trans. Aerospace and Electronc Systems, vol. 31, no. 2, pp. 555567, 1995.
[15] J. Blazewicz and M. Drozdowski, “Distributed Processing of Divisible Jobs with Communication Startup Costs,” Discrete Applied Math., vol. 76, pp. 2141, 1997.
[16] V. Bharadwaj, X. Li, and C.C. Ko, “On the Influence of StartUp Costs in Scheduling Divisible Loads on Bus Networks,” IEEE Trans. Aerospace and Electronic Systems, vol. 11, no. 12, pp. 12881305, 2000.
[17] O. Beaumont, A. Legrand, and Y. Robert, “Optimal Algorithms for Scheduling Divisible Workloads on Heterogeneous Systems,” Technical Report 200236, Ecole Normale Superieure de Lyon, Oct. 2002.
[18] S. Bataineh, T.Y. Hsiung, and T.G. Robertazzi, “Closed Form Solutions for Bus and Tree Networks of Processors Load Sharing a Divisible Job,” IEEE Trans. Computers, vol. 43, no. 10, Oct. 1994.
[19] K. Li, “Scheduling Divisible Tasks on Heterogeneous Linear Arrays with Applications to Layered Networks,” Proc. Third Int'l Workshop Parallel and Distributed Scientific and Eng. Computing with Applications, 2002.
[20] M. Drozdowski and W. Glazek, “Scheduling Divisible Loads in a ThreeDimensional Mesh of Processors,” Parallel Computing, vol. 25, no. 4, 1999.
[21] K. Li, “Parallel Processing of Divisible Loads on Partitionable Static Interconnection Networks,” Cluster Computing, vol. 6, no. 1, pp. 4755, 2003.
[22] D. Ghose and V. Mani, “Distributed Computation with Communication Delays: Asymptotic Performance Analysis,” J. Parallel and Distributed Computing, vol. 23, no. 3, 1994.
[23] J. Blazewicz, “Performance Limits of a TwoDimensional Network of Load Sharing Processors,” Foundations of Computing and Decision Sciences, vol. 21, no. 1, pp. 315, 1996.
[24] H.J. Kim, G.I. Jee, and J.G. Lee, “Optimal Load Distribution for Tree Network Processors,” IEEE Trans. Aerospace and Electronic Systems, vol. 32, no. 2, pp. 607611, 1996.
[25] A.L. Rosenberg, “Sharing Partitionable Workloads in Heterogeneous NOWs: Greedier Is Not Better,” Proc. Third IEEE Int'l Conf. Cluster Computing (CLUSTER '01), pp. 124131, 2001.
[26] M. Drozdowski and P. Wolniewicz, “Divisible Load Scheduling in Systems with Limited Memory,” Cluster Computing, vol. 6, no. 1, pp. 1929, 2003.
[27] S.F. Hummel, “Factoring: A Method for Scheduling Parallel Loops,” Comm. ACM, vol. 35, no. 8, pp. 90101, Aug. 1992.
[28] T. Hagerup, “Allocating Independent Tasks to Parallel Processors: An Experimental Study,” J. Parallel and Distributed Computing, vol. 47, pp. 185197, 1997.
[29] Y. Yang and H. Casanova, “RUMR: Robust Scheduling for Divisible Workloads,” Proc. 12th IEEE Symp. HighPerformance Distributed Computing (HPDC12), pp. 114125, June 2003.
[30] D. Altilar and Y. Paker, “An Optimal Scheduling Algorithm for Parallel Video Processing,” Proc. IEEE Int'l Conf. Multimedia Computing and Systems, pp. 245248, 1998.
[31] O. Beaumont, L. Carter, J. Ferrante, A. Legrand, and Y. Robert, “BandwidthCentric Allocation of Independent Tasks on Heterogeneous Platforms,” Proc. Int'l Parallel and Distributed Processing Symp. (IPDPS), June 2002.
[32] O. Beaumont, A. Legrand, and Y. Robert, “The MasterSlave Paradigm with Heterogeneous Processors,” Technical Report RR200113, Ecole Normale Superieure de Lyon, Mar. 2001.
[33] HMMER Webpage, http:/hmmer.wustl.edu/, 2003.
[34] A. Watt, 3D Computer Graphics, chapter 13, third ed. AddisonWesley, 2000.
[35] K. Shen, L.A. Rowe, and E.J. Delp, “A Parallel Implementation of an MPEG1 Encoder: Faster than RealTime!,” Proc. SPIE Conf. Digital Video Compression: Algorithms and Technologies, pp. 407418, Feb. 1995.
[36] F.J. GonzalezCastano, R. AsoreyCacheda, R.P. MartinezAlvarez, F. ComesanaSeijo, and J. ValesAlonso, “DVD Transcoding via Linux Metacomputing,” Linux J., vol. 116, p. 8, 2003.
[37] N. Amano, J. Gama, and F. Silva, “Exploiting Parallelism in Decision Tree Induction,” Proc. ECML/PKDD Workshop Parallel and Distributed Computing for Machine Learning, pp. 1322, Sept. 2003.
[38] S. Goil and A. Choudhary, “High Performance Multidimensional Analysis of Large Data Sets,” Proc. First ACM Int'l Workshop Data Warehousing and OLAP, pp. 3439, 1998.
[39] D. Skillicorn, “Strategies for Parallel Data Mining,” IEEE Concurrency, vol. 7, no. 4, pp. 2635, 1999.
[40] Mencoder Media Player, http:/www.mplayerhq.hu, 2004.
[41] VFleet Webpage, http://www.psc.edu/PackagesVFleet_Home, 2003.
[42] M. Tamura, T. an Oguchi, and M. Kitsuregawa, “Parallel Database Processing on a 100 Node PC Cluster: Cases for Decision Support Query Processing and Data Mining,” Proc. Int'l Conf. High Performance Computing and Comm., pp. 116, Nov. 1997.
[43] Y. Yang and H. Casanova, “Extensions to the MultiInstallment Algorithm: Affine Costs and Output Data Transfers,” Technical Report CS20030754, Dept. of Computer Science and Eng., Univ. of California, San Diego, July 2003.
[44] G. Chun, H. Dail, H. Casanova, and A. Snavely, “Benchmark Probes for Grid Assessment,” Proc. HighPerformance Grid Computing Workshop, Apr. 2004.
[45] R.L. Graham, D.E. Knuth, and O. Patashnik, Concrete Mathematics. Wiley, 1994.
[46] Y. Yang and H. Casanova, “MultiRound Algorithm for Scheduling Divisible Workload Applications: Analysis and Experimental Evaluation,” Technical Report CS20020721, Dept. of Computer Science and Eng., Univ. of California, San Diego, 2002.
[47] D. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. Belmont, Mass.: Athena Scientific, 1996.
[48] A. Legrand, L. Marchal, and H. Casanova, “Scheduling Distributed Applications: The SimGrid Simulation Framework,” Proc. Third IEEE Int'l Symp. Cluster Computing and the Grid (CCGrid '03), May 2003.
[49] Y. Yang and H. Casanova, “UMR: A MultiRound Algorithm for Scheduling Divisible Workloads,” Proc. Int'l Parallel and Distributed Processing Symp. (IPDPS 2003), Apr. 2003.
[50] K. van der Raadt, Y. Yang, and H. Casanova, “APSTDV: Divisible Load Scheduling and Deployment on the Grid,” Technical Report CS20040785, Dept. of Computer Science and Eng., Univ. of California, San Diego, Apr. 2004.
[51] H. Casanova, “Modeling LargeScale Platforms for the Analysis and the Simulation of Scheduling Strategies,” Proc. Sixth Workshop Advances in Parallel and Distributed Computational Models, Apr. 2004.