Concurrent Scheduling: Efficient Heuristics for Online Large-Scale Data Transfers in Distributed Real-Time Environments
November 2006 (vol. 17 no. 11)
pp. 1348-1359

Abstract—The static staging heuristics proposed in the literature for the data items associated with real-time distributed applications transfer only one data item in each communication step in order to optimize a specific cost function. In this paper, we first propose the Extended Partial Path (EPP) algorithm, which follows the same method. We show analytically that, in terms of the number of satisfied requests, EPP performs at least as well as the previously introduced Partial Path Heuristic (PPH). It does so by excluding from scheduling the data items that cannot be satisfied by PPH and by scheduling the satisfiable data items along their extended paths. In contrast to EPP and the other data staging heuristics proposed so far, we then develop the concurrent scheduling (CS) heuristic, which allows more than one data item to be transferred simultaneously in an organized fashion, thereby improving the overall performance of the staging system. At the heart of the CS heuristic are EPP and a local priority assignment method devised to resolve conflicts between data items at intermediate nodes. Extensive simulation results further confirm the superiority of the CS heuristic over PPH.

[1] M.D. Theys, M. Tan, N. Beck, H.J. Siegel, and M. Jurczyk, “A Mathematical Model and Scheduling Heuristic for Satisfying Prioritized Data Requests in an Oversubscribed Communication Network,” IEEE Trans. Parallel and Distributed Systems, vol. 11, no. 9, pp. 969-988, Nov. 2000.
[2] J. Lee, B. Tierney, and W. Johnston, “Data Intensive Distributed Computing: A Medical Application Example,” Proc. High Performance Computing and Networking Conf. (HPCN '99), Apr. 1999.
[3] B. Allcock, I. Foster, V. Nefedova, A. Chervenak, E. Deelman, C. Kesselman, J. Lee, A. Sim, A. Shoshani, B. Drach, and D. Williams, “High Performance Remote Access to Climate Simulation Data: A Challenge Problem for Data Grid Technologies,” Proc. 2001 ACM/IEEE Supercomputing Conf. (SC '01), Nov. 2001.
[4] D. McKeown, G.E. Bulwinkle, and S. Cochran, “Research in the Automated Analysis of Remotely Sensed Imagery,” Proc. DARPA Image Understanding Workshop, pp. 99-132, 1996.
[5] S. Shukla and D. Agrawal, “Scheduling Pipelined Communication in Distributed Memory Multiprocessors for Real-Time Applications,” Proc. Int'l Symp. Computer Architecture, pp. 222-231, 1991.
[6] B. Tierney, W. Johnston, and J. Lee, “A Cache-Based Data Intensive Distributed Computing Architecture for Grid Applications,” CERN School of Computing, Sept. 2000.
[7] T.H. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to Algorithms. MIT Press, 1990.
[8] Q. Wang, N.L. Passos, and E.H.-M. Sha, “Optimal Data Scheduling for Uniform Multidimensional Applications,” IEEE Trans. Computers, vol. 45, no. 12, pp. 1439-1444, Dec. 1996.
[9] Y. Tian, E.H.-M. Sha, C. Chantrapornchai, and P.M. Kogge, “Optimizing Data Scheduling on Processor-in-Memory Arrays,” Proc. Int'l Parallel Processing Symp./Symp. Parallel and Distributed Processing (IPPS/SPDP), pp. 57-61, 1998.
[10] A. Goel, M. Henzinger, S. Plotkin, and E. Tardos, “Scheduling Data Transfers in a Network and the Set Scheduling Problem,” Proc. Ann. ACM Symp. Theory of Computing, pp. 189-197, 1999.
[11] K. Ranganathan and I. Foster, “Decoupling Computation and Data Scheduling in Distributed Data-Intensive Applications,” Proc. 11th IEEE Int'l Symp. High Performance Distributed Computing (HPDC '02), pp. 352-365, July 2002.
[12] I.R. Philp and J.W.S. Liu, “End-to-End Scheduling in Real-Time Packet-Switched Networks,” Proc. Int'l Conf. Network Protocols (ICNP '96), pp. 23-31, Oct.-Nov. 1996.
[13] V. Sivaraman and F. Chiussi, “Providing End-to-End Statistical Delay Guarantees with Earliest Deadline First Scheduling and Per-Hop Traffic Shaping,” Proc. IEEE INFOCOM, vol. 2, pp. 631-640, Mar. 2000.
[14] T. Holmberg and J.M. Karlsson, “Scheduling Deadline Driven Packet Flows in HiperAccess,” Proc. Eighth IEEE Int'l Symp. Computers and Comm., pp. 108-118, 2003.
[15] M. Eltayeb, “Efficient Data Scheduling for Real-Time Large-Scale Data-Intensive Distributed Applications,” PhD dissertation, Ohio State Univ., 2004.
[16] K. Nam, S. Lee, and J. Kim, “Path Selection for Real-Time Communication in Wormhole Networks,” Int'l J. High Speed Computing, vol. 10, no. 4, 1999.

Index Terms:
Data staging, data scheduling, real-time, distributed computing and networking.
Mohammed S. Eltayeb, Atakan Dogan, Füsun Özgüner, "Concurrent Scheduling: Efficient Heuristics for Online Large-Scale Data Transfers in Distributed Real-Time Environments," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 11, pp. 1348-1359, Nov. 2006, doi:10.1109/TPDS.2006.150