High-Performance Distributed Computing, International Symposium on (2001)
San Francisco, California
Aug. 7, 2001 to Aug. 9, 2001
Heinz Stockinger, CERN
Asad Samar, California Institute of Technology
Koen Holtman, California Institute of Technology
Bill Allcock, Argonne National Laboratory
Ian Foster, Argonne National Laboratory
Brian Tierney, CERN
Abstract: Data replication is a key issue in a Data Grid and can be managed in different ways and at different levels of granularity: for example, at the file level or object level. In the High Energy Physics community, Data Grids are being developed to support the distributed analysis of experimental data. We have produced a prototype data replication tool, the Grid Data Management Pilot (GDMP), that is in production use in one physics experiment, with middleware provided by the Globus Toolkit used for authentication, data movement, and other purposes. We present here a new, enhanced GDMP architecture and prototype implementation that uses Globus Data Grid tools for efficient file replication. We also explain how this architecture can address object replication issues in an object-oriented database management system. File transfer over wide-area networks requires specific performance tuning to achieve optimal data transfer rates. We present performance results obtained with GridFTP, an enhanced version of FTP, and discuss tuning parameters.
B. Allcock, H. Stockinger, B. Tierney, A. Samar, K. Holtman and I. Foster, "File and Object Replication in Data Grids," High-Performance Distributed Computing, International Symposium on (HPDC), San Francisco, California, 2001, pp. 0076.