Adaptive Computing on the Grid Using AppLeS
April 2003 (vol. 14 no. 4)
pp. 369-382

Abstract—Ensembles of distributed, heterogeneous resources, also known as Computational Grids, have emerged as critical platforms for high-performance and resource-intensive applications. Such platforms provide the potential for applications to aggregate enormous bandwidth, computational power, memory, secondary storage, and other resources during a single execution. However, achieving this performance potential in dynamic, heterogeneous environments is challenging. Recent experience with distributed applications indicates that adaptivity is fundamental to achieving application performance in dynamic grid environments. The AppLeS (Application Level Scheduling) project provides a methodology, application software, and software environments for adaptively scheduling and deploying applications in heterogeneous, multiuser grid environments. In this article, we discuss the AppLeS project and outline our findings.
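The abstract's central claim is that application-level, adaptive scheduling is what makes grid performance achievable: the scheduler re-evaluates dynamic resource forecasts and places each unit of work accordingly. As a minimal illustration of that idea (not the AppLeS implementation itself), the sketch below greedily dispatches independent tasks to whichever host a performance forecast predicts will finish them earliest; the host names, forecast speeds, and task sizes are invented for illustration, and the forecasts stand in for what a Network Weather Service-style predictor would supply.

```python
# Hypothetical sketch of application-level adaptive scheduling:
# before dispatching each task, rank hosts by a per-host performance
# forecast and send the task to the host with the earliest predicted
# completion time. All names and numbers here are illustrative.

def schedule(tasks, forecasts):
    """Greedily map task sizes onto hosts using forecast speeds.

    tasks     -- list of task sizes (arbitrary work units)
    forecasts -- dict host -> predicted speed (work units per second)
    Returns a dict host -> list of assigned task indices.
    """
    finish = {host: 0.0 for host in forecasts}   # predicted time host frees up
    plan = {host: [] for host in forecasts}
    for i, size in enumerate(tasks):
        # pick the host with the earliest predicted completion for this task
        best = min(forecasts, key=lambda h: finish[h] + size / forecasts[h])
        finish[best] += size / forecasts[best]
        plan[best].append(i)
    return plan

if __name__ == "__main__":
    # two fast hosts and one slow one (illustrative forecasts)
    plan = schedule([10, 10, 10, 30], {"a": 2.0, "b": 2.0, "c": 0.5})
    print(plan)  # the slow host "c" receives no work
```

In a real grid setting the forecasts would be refreshed between dispatches, so the mapping adapts as load on shared, multiuser resources changes; with the static forecasts above, the greedy rule simply balances predicted finish times across the two fast hosts.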

[1] The Grid: Blueprint for a New Computing Infrastructure. I. Foster and C. Kesselman, eds., San Francisco, Calif.: Morgan Kaufmann Publishers, 1999.
[2] F. Berman, R. Wolski, S. Figueira, J. Schopf, and G. Shao, “Application-Level Scheduling on Distributed Heterogeneous Networks,” Proc. Supercomputing, 1996.
[3] K. Czajkowski et al., “Grid Information Services for Distributed Resource Sharing,” Proc. 10th IEEE Int'l Symp. High Performance Distributed Computing (HPDC-10), IEEE Press, 2001, pp. 181-194.
[4] H. Dail, H. Casanova, and F. Berman, “A Decoupled Scheduling Approach for the GrADS Environment,” Proc. Supercomputing '02, Nov. 2002.
[5] I. Foster and C. Kesselman, “Globus: A Metacomputing Infrastructure Toolkit,” Int'l J. Supercomputer Applications, vol. 11, no. 2, pp. 115-128, 1997.
[6] A. Grimshaw, A. Ferrari, F.C. Knabe, and M. Humphrey, “Wide-Area Computing: Resource Sharing on a Large Scale,” Computer, vol. 32, no. 5, pp. 29-37, May 1999.
[7] H. Casanova and J. Dongarra, “NetSolve: A Network Server for Solving Computational Science Problems,” Int'l J. Supercomputer Applications and High Performance Computing, vol. 11, no. 3, pp. 212-223, 1997.
[8] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine—A Users' Guide and Tutorial for Networked Parallel Computing. The MIT Press, 1994.
[9] M. Snir, S. Otto, S. Huss-Lederman, D. Walker, and J. Dongarra, MPI: The Complete Reference. MIT Press, 1995.
[10] N. Spring and R. Wolski, “Application Level Scheduling of Gene Sequence Comparison on Metacomputers,” Proc. 12th ACM Int'l Conf. Supercomputing, July 1998.
[11] A. Su, F. Berman, R. Wolski, and M. Mills Strout, “Using AppLeS to Schedule Simple SARA on the Computational Grid,” Int'l J. High Performance Computing Applications, vol. 13, no. 3, pp. 253-262, 1999.
[12] S. Smallen, W. Cirne, J. Frey, F. Berman, R. Wolski, M.H. Su, C. Kesselman, S. Young, and M. Ellisman, “Combining Workstations and Supercomputers to Support Grid Applications: The Parallel Tomography Experience,” Proc. Ninth Heterogeneous Computing Workshop, pp. 241-252, May 2000.
[13] W. Cirne and F. Berman, "Adaptive Selection of Partition Size for Supercomputer Requests," Job Scheduling Strategies for Parallel Processing, LNCS 1911, Springer-Verlag, 2000, pp. 187-207.
[14] H. Dail, G. Obertelli, F. Berman, R. Wolski, and A. Grimshaw, “Application-Aware Scheduling of a Magnetohydrodynamics Application in the Legion Metasystem,” Proc. Ninth Heterogeneous Computing Workshop (HCW '00), May 2000.
[15] H. Casanova, G. Obertelli, F. Berman, and R. Wolski, “The AppLeS Parameter Sweep Template: User-Level Middleware for the Grid,” Proc. Supercomputing 2000 (SC '00), Nov. 2000.
[16] J. Schopf and F. Berman, “Stochastic Scheduling,” Proc. Supercomputing 1999, 1999.
[17] J. Schopf and F. Berman, “Using Stochastic Information to Predict Application Behavior on Contended Resources,” Int'l J. Foundations of Computer Science, vol. 12, no. 3, pp. 341-363, 2001.
[18] S. Smallen, H. Casanova, and F. Berman, “Applying Scheduling and Tuning to On-Line Parallel Tomography,” Proc. ACM/IEEE Supercomputing 2001, 2001.
[19] M. Faerman, A. Su, R. Wolski, and F. Berman, “Adaptive Performance Prediction for Distributed Data-Intensive Applications,” Proc. Supercomputing '99, Nov. 1999.
[20] Caltech's Synthetic Aperture Radar Atlas (SARA) project, http://www.cacr.caltech.edu/~roy/sara/index.html.
[21] Caltech's Digital Sky project, http://www.cacr.caltech.edu/digisky, 2002.
[22] Microsoft's TerraServer project, http://terraserver.homeadvisor.msn.com, 2002.
[23] R. Wolski, N.T. Spring, and J. Hayes, “The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing,” J. Future Generation Computing Systems, 1999.
[24] R. Wolski, “Dynamically Forecasting Network Performance Using the Network Weather Service,” J. Cluster Computing, vol. 1, no. 1, pp. 119-132, 1998.
[25] A.S. Grimshaw, E.A. West, and W.R. Pearson, “No Pain and Gain!—Experiences with Mentat on Biological Applications,” Concurrency: Practice and Experience, vol. 5, no. 4, July 1993.
[26] T. Hagerup, “Allocating Independent Tasks to Parallel Processors: An Experimental Study,” J. Parallel and Distributed Computing, vol. 47, pp. 185-197, 1997.
[27] S. Flynn Hummel, J. Schmidt, R.N. Uma, and J. Wein, “Load-Sharing in Heterogeneous Systems via Weighted Factoring,” Proc. Eighth Ann. ACM Symp. Parallel Algorithms and Architectures, pp. 318-328, June 1996.
[28] T. Hagerup, “Allocating Independent Tasks to Parallel Processors: An Experimental Study,” J. Parallel and Distributed Computing, vol. 47, pp. 185-197, 1997.
[29] J.R. Stiles, T.M. Bartol, E.E. Salpeter, and M.M. Salpeter, “Monte Carlo Simulation of Neuromuscular Transmitter Release Using MCell, a General Simulator of Cellular Physiological Processes,” Computational Neuroscience, pp. 279-284, 1998.
[30] J.R. Stiles, D. Van Helden, T.M. Bartol, E.E. Salpeter, and M.M. Salpeter, “Miniature End-Plate Current Rise Times &lt;100 Microseconds from Improved Dual Recordings Can Be Modeled with Passive Acetylcholine Diffusion from a Synaptic Vesicle,” Proc. Nat'l Academy of Sciences U.S.A., vol. 93, pp. 5745-5752, 1996.
[31] O.H. Ibarra and C.E. Kim, “Heuristic Algorithms for Scheduling Independent Tasks on Nonidentical Processors,” J. ACM, vol. 24, no. 2, pp. 280-289, Apr. 1977.
[32] T.D. Braun, H.J. Siegel, N. Beck, L.L. Bölöni, M. Maheswaran, A.I. Reuther, J.P. Robertson, M.D. Theys, B. Yao, D. Hensgen, and R.F. Freund, “A Comparison Study of Static Mapping Heuristics for a Class of Meta-Tasks on Heterogeneous Computing Systems,” Proc. Eighth IEEE Workshop on Heterogeneous Computing Systems (HCW '99), pp. 15-29, Apr. 1999.
[33] M. Maheswaran and H.J. Siegel, “A Dynamic Matching and Scheduling Algorithm for Heterogeneous Computing Systems,” Proc. Seventh Heterogeneous Computing Workshop, 1998.
[34] H. Casanova, A. Legrand, D. Zagorodnov, and F. Berman, “Heuristics for Scheduling Parameter Sweep Applications in Grid Environments,” Proc. Ninth Heterogeneous Computing Workshop (HCW '00), pp. 349-363, May 2000.
[35] APST homepage, http://grail.sdsc.edu/projects/apst, 2002.
[36] G. Shao, “Adaptive Scheduling of Master/Worker Applications on Distributed Computational Resources,” Ph.D. thesis, Univ. of California, San Diego, May 2001.
[37] W. Cirne, “Using Moldability to Improve the Performance of Supercomputer Jobs,” Ph.D. thesis, Univ. of California, San Diego, Nov. 2000.
[38] S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman, “Basic Local Alignment Search Tool,” J. Molecular Biology, vol. 215, pp. 403-410, 1990.
[39] S. Altschul, T. Madden, A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. Lipman, “Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs,” Nucleic Acids Research, vol. 25, pp. 3389-3402, 1997.
[40] R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge Univ. Press, 1998.
[41] J. Basney, M. Livny, and P. Mazzanti, “Harnessing the Capacity of Computational Grids for High Energy Physics,” Proc. Conf. Computing in High Energy and Nuclear Physics, 2000.
[42] A. Majumdar, “Parallel Performance Study of Monte-Carlo Photon Transport Code on Shared-, Distributed-, and Distributed-Shared-Memory Architectures,” Proc. 14th Parallel and Distributed Processing Symp., IPDPS '00, pp. 93-99, May 2000.
[43] H. Casanova, “Simgrid: A Toolkit for the Simulation of Application Scheduling,” Proc. IEEE Int'l Symp. Cluster Computing and the Grid (CCGrid '01), pp. 430-437, May 2001.
[44] A. Takefusa, S. Matsuoka, H. Nakada, K. Aida, and U. Nagashima, “Overview of a Performance Evaluation System for Global Computing Scheduling Algorithms,” Proc. Eighth IEEE Int'l Symp. High Performance Distributed Computing (HPDC), pp. 97-104, Aug. 1999.
[45] NPACI Scalable Visualization Tools webpage, http://vistools.npaci.edu, 2002.
[46] H. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I. Shingyalov, and P. Bourne, “The Protein Data Bank,” Nucleic Acids Research, vol. 28, no. 1, pp. 235-242, 2000.
[47] A. Natrajan, M. Crowley, N. Wilkins-Diehr, M. Humphrey, A. Fox, and A. Grimshaw, “Studying Protein Folding on the Grid: Experiences Using CHARM on NPACI Resources under Legion,” Proc. 10th IEEE Int'l Symp. High Performance Distributed Computing (HPDC-10), 2001.
[48] SETI@home project, http://setiathome.ssl.berkeley.edu, 2002.
[49] M. Thomas, S. Mock, J. Boisseau, M. Dahan, K. Mueller, and D. Sutton, “The GridPort Toolkit Architecture for Building Grid Portals,” Proc. 10th IEEE Int'l Symp. High Performance Distributed Computing (HPDC-10), Aug. 2001.
[50] M. Yarrow, K. McCann, R. Biswas, and R. Van der Wijngaart, “An Advanced User Interface Approach for Complex Parameter Study Process Specification on the Information Power Grid,” Proc. GRID 2000, Dec. 2000.
[51] D. Abramson, J. Giddy, and L. Kotler, "High-Performance Parametric Modeling with Nimrod-G: Killer Application for the Global Grid?" Proc. Int'l Parallel and Distributed Processing Symp., IEEE CS Press, Los Alamitos, Calif., 2000.
[52] K. Czajkowski et al., “Grid Information Services for Distributed Resource Sharing,” Proc. 10th IEEE Int'l Symp. High Performance Distributed Computing (HPDC-10), IEEE Press, 2001, pp. 181-194.
[53] K. Czajkowski et al., "A Resource Management Architecture for Metacomputing Systems," Proc. 4th Workshop Job Scheduling Strategies for Parallel Processing. LNCS 1459, Springer, 1998, pp. 62-82.
[54] I. Foster et al., "A Security Architecture for Computational Grids," Proc. 5th ACM Conf. Computer and Communications Security, ACM Press, New York, 1998.
[55] The Portable Batch System webpage, http://www.openpbs.com, 2002.
[56] IBM LoadLeveler User's Guide. IBM Corp., 1993.
[57] M. Litzkow, M. Livny, and M.W. Mutka, “Condor—A Hunter of Idle Workstations,” Proc. Eighth Int'l Conf. Distributed Computing Systems, Jun. 1988.
[58] I. Foster, C. Kesselman, J. Tedesco, and S. Tuecke, “GASS: A Data Movement and Access Service for Wide Area Computing Systems,” Proc. Sixth Workshop I/O in Parallel and Distributed Systems, May 1999.
[59] W. Allcock, J. Bester, J. Bresnahan, A. Chervenak, L. Liming, and S. Tuecke, “GridFTP: Protocol Extension to FTP for the Grid,” Grid Forum Internet-Draft, Mar. 2001.
[60] The Storage Resource Broker, http://www.npaci.edu/DICE/SRB, 2002.
[61] H. Casanova, T. Bartol, J. Stiles, and F. Berman, “Distributing MCell Simulations on the Grid,” Int'l J. High Performance Computing Applications, vol. 14, no. 3, pp. 243-257, 2001.
[62] H. Casanova and F. Berman, Parameter Sweeps on the Grid with APST, chapter 33. Wiley Publishers, Inc., 2002.
[63] Sun Microsystems Grid Engine, http://www.sun.com/gridware/, 2002.
[64] Entropia Inc., http://www.entropia.com.
[65] United Devices Inc., http://www.ud.com, 2002.
[66] J. Linderoth, S. Kilkarni, J.-P. Goux, and M. Yoder, “An Enabling Framework for Master-Worker Applications on the Computational Grid,” Proc. Ninth IEEE Symp. High Performance Distributed Computing, pp. 43-50, Aug. 2000.
[67] E. Heymann, M. Senar, E. Luque, and M. Livny, “Adaptive Scheduling for Master-Worker Applications on the Computational Grid,” Proc. IEEE/ACM Int'l Workshop Grid Computing (GRID 2000), Dec. 2000.
[68] G. Shao, R. Wolski, and F. Berman, “Master/Slave Computing on the Grid,” Proc. Ninth Heterogeneous Computing Workshop, pp. 3-16, May 2000.
[69] P. Tang and P.-C. Yew, “Processor Self-Scheduling for Multiple Nested Parallel Loops,” Proc. 1986 Int'l Conf. Parallel Processing, pp. 528-535, Aug. 1986.
[70] C.P. Kruskal and A. Weiss, “Allocating Independent Subtasks on Parallel Processors,” IEEE Trans. Software Eng., vol. 11, no. 10, pp. 1001-1016, Oct. 1985.
[71] C.D. Polychronopoulos and D.J. Kuck, “Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers,” IEEE Trans. Computers, vol. 36, no. 12, pp. 1425-1439, Dec. 1987.
[72] T.H. Tzen and L.M. Ni, "Trapezoid Self-Scheduling: A Practical Scheduling Scheme for Parallel Compilers," IEEE Trans. Parallel and Distributed Systems, vol. 4, pp. 87-98, Jan. 1993.
[73] S. Flynn Hummel, J. Schmidt, R.N. Uma, and J. Wein, “Load-Sharing in Heterogeneous Systems via Weighted Factoring,” Proc. Eighth Ann. ACM Symp. Parallel Algorithms and Architectures, pp. 318-328, June 1996.
[74] D.H. Bailey, E. Barszcz, J.T. Barton, D.S. Browning, R.L. Carter, L. Dagum, R.A. Fatoohi, P.O. Frederickson, T.A. Lasinski, R.S. Schreiber, H.D. Simon, V. Venkatakrishnan, and S.K. Weeratunga, “The NAS Parallel Benchmarks,” Int'l J. Supercomputer Applications, vol. 5, no. 3, pp. 63-73, 1991.
[75] “Persistence of Vision Raytracer,” Persistence of Vision Development Team, 1999.
[76] L.F. Ten Eyck, J. Mandell, V.A. Roberts, and M.E. Pique, “Surveying Molecular Interactions with DOT,” Proc. 1995 ACM/IEEE Supercomputing Conf., pp. 506-517, Dec. 1995.
[77] W. Cirne and F. Berman, “Using Moldability to Improve the Performance of Supercomputer Jobs,” J. Parallel and Distributed Computing, vol. 62, no. 10, pp. 1571-1601, 2002.
[78] W. Cirne and F. Berman, “When the Herd Is Smart: The Aggregate Behavior in the Selection of Job Request,” IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 2, pp. 181-192, Feb. 2003.
[79] “Extensible Argonne Scheduler System (EASY),” http://info.mcs.anl.gov/Projects/sp/scheduler/scheduler.html, 2002.
[80] “Maui Scheduler,” http://supercluster.org/maui/, 2002.
[81] “Load Sharing Facility (LSF),” http://wwwinfo.cern.ch/pdp/lsf/, 2002.
[82] W. Cirne and F. Berman, “A Model for Moldable Supercomputer Jobs,” Proc. IPDPS 2001—Int'l Parallel and Distributed Processing Symp., Apr. 2001.
[83] D.G. Feitelson, “Metrics for Parallel Job Scheduling and Their Convergence,” Job Scheduling Strategies for Parallel Processing, vol. 2221, pp. 188-206, 2001.
[84] J. Gehring and A. Reinefeld, “MARS—A Framework for Minimising the Job Execution Time in a Metacomputing Environment,” Future Generation Computer Systems, vol. 12, pp. 87-99, 1996.
[85] H. Topcuoglu, S. Hariri, W. Furmanski, J. Valente, I. Ra, D. Kim, Y. Kim, X. Bing, and B. Ye, “The Software Architecture of a Virtual Distributed Computing Environment,” Proc. Sixth IEEE High-Performance and Distributed Computing Conf. (HPDC '97), pp. 40-49, 1997.
[86] M. Sirbu and D. Marinescu, “A Scheduling Expert Advisor for Heterogeneous Environments,” Proc. Heterogeneous Computing Workshop (HCW '97), pp. 74-87, 1997.
[87] J. Budenske, R. Ramanujan, and H.J. Siegel, “On-Line Use of Off-Line Derived Mappings for Iterative Automatic Target Recognition Tasks and a Particular Class of Hardware,” Proc. Heterogeneous Computing Workshop (HCW '97), pp. 96-110, 1997.
[88] P. Au, J. Darlington, M. Ghanem, Y. Guo, H. To, and J. Yang, “Co-Ordinating Heterogeneous Parallel Computation,” Proc. Euro-Par '96, pp. 601-614, 1996.
[89] J.N. Cotrim Arabe, A. Beguelin, B. Lowekamp, E. Seligman, M. Starkey, and P. Stephan, “Dome: Parallel Programming in a Distributed Computing Environment,” Proc. IEEE Symp. Parallel and Distributed Processing, pp. 218–224, 1996.
[90] J. Gehring, “Dynamic Program Description as a Basis for Runtime Optimisation,” Proc. Third Int'l Euro-Par Conf., pp. 958-965, Aug. 1997.
[91] J. Schopf, “Performance Prediction and Scheduling for Parallel Applications on Multiuser Clusters,” Ph.D. thesis, Univ. of California, San Diego, Dec. 1998.
[92] GrADS project homepage, http://nhse2.cs.rice.edu/grads, 2002.
[93] Grid Research and Innovation Laboratory homepage, http://grail.sdsc.edu, 2002.
[94] Virtual Instrument project, http://gcl.ucsd.edu/vi_itr/, 2002.
[95] M. Faerman, A. Birnbaum, H. Casanova, and F. Berman, “Resource Allocation for Steerable Parallel Parameter Searches,” Proc. Grid Computing Workshop, Nov. 2002.
[96] D. Kondo, H. Casanova, E. Wing, and F. Berman, “Models and Scheduling Guidelines for Global Computing Applications,” Proc. Int'l Parallel and Distributed Processing Symp. (IPDPS '02), 2002.
[97] I. Foster, C. Kesselman, and S. Tuecke, “The Anatomy of the Grid: Enabling Scalable Virtual Organizations,” Int'l J. Supercomputer Applications, 2001.

Index Terms:
Scheduling, parallel and distributed computing, heterogeneous computing, grid computing.
Citation:
Francine Berman, Richard Wolski, Henri Casanova, Walfredo Cirne, Holly Dail, Marcio Faerman, Silvia Figueira, Jim Hayes, Graziano Obertelli, Jennifer Schopf, Gary Shao, Shava Smallen, Neil Spring, Alan Su, Dmitrii Zagorodnov, "Adaptive Computing on the Grid Using AppLeS," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 4, pp. 369-382, April 2003, doi:10.1109/TPDS.2003.1195409