Performance Analysis of Scheduling and Replication Algorithms on Grid Datafarm Architecture for High-Energy Physics Applications
High-Performance Distributed Computing, International Symposium on (2003)
June 22, 2003 to June 24, 2003
Atsuko Takefusa, Ochanomizu University
Osamu Tatebe, AIST
Satoshi Matsuoka, Tokyo Institute of Technology and National Institute of Informatics
Youhei Morita, High Energy Accelerator Research Organization
Data Grid is a Grid environment for ubiquitous access and analysis of large-scale data. Because Data Grid is in the early stages of development, the performance of its petabyte-scale models in a realistic data processing setting has not been well investigated. By enhancing our Bricks Grid simulator to accommodate Data Grid scenarios, we investigate and compare the performance of different Data Grid models. These are categorized mainly as either central or tier models; they employ various scheduling and replication strategies under realistic assumptions of job processing for CERN LHC experiments on the Grid Datafarm system. Our results show that the central model is efficient, but that the tier model, with its greater resources and its speculative class of background replication policies, is quite effective and achieves higher performance, even though each tier is smaller than the central model.
O. Tatebe, S. Matsuoka, Y. Morita and A. Takefusa, "Performance Analysis of Scheduling and Replication Algorithms on Grid Datafarm Architecture for High-Energy Physics Applications," High-Performance Distributed Computing, International Symposium on (HPDC), Seattle, Washington, 2003, pp. 34.