2006 15th IEEE International Conference on High Performance Distributed Computing
Filecules in High-Energy Physics: Characteristics and Impact on Resource Management
Paris
June 19-June 23
ISBN: 1-4244-0307-3
A. Iamnitchi, Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL
S. Doraimani, Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL
Grid computing has reached the stage where deployments are mature and many collaborations run in production mode. Mature grid deployments offer the opportunity to revisit, and perhaps update, traditional beliefs about workload models, which in turn leads to the re-evaluation of traditional resource management techniques. This paper analyzes usage patterns in a typical grid community, a large-scale data-intensive scientific collaboration in high-energy physics. We focus mainly on data usage, since data is the major resource for this class of applications. Our observations led us to propose a new abstraction for resource management in scientific data analysis applications: we define a filecule as a group of files that is always used together. We show that filecules exist and present their characteristics. The existence of filecules suggests a new granularity for data management, which, if incorporated in design, can significantly outperform the traditional solutions for data caching, replication and placement based on single-file granularity. We reason about the impact of filecules on resource management and show compelling evidence for using this abstraction when designing data management services.
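The filecule definition above (a group of files that is always used together) can be illustrated with a minimal sketch. It assumes a hypothetical trace format, a mapping from job identifier to the list of files that job requested, and it reads "always used together" as "requested by exactly the same set of jobs". The function name `find_filecules` and the sample trace are illustrative only and are not taken from the paper.

```python
from collections import defaultdict


def find_filecules(job_files):
    """Group files that are requested by exactly the same set of jobs.

    job_files: dict mapping a job id to the list of files it requested
    (an assumed trace format, not the paper's).
    Returns a sorted list of filecules, each a sorted list of file names.
    """
    # Step 1: for every file, collect the set of jobs that requested it.
    jobs_per_file = defaultdict(set)
    for job, files in job_files.items():
        for name in files:
            jobs_per_file[name].add(job)

    # Step 2: files with identical job sets are never used apart,
    # so they form one group under the assumed interpretation.
    groups = defaultdict(set)
    for name, jobs in jobs_per_file.items():
        groups[frozenset(jobs)].add(name)

    return sorted(sorted(group) for group in groups.values())


# Hypothetical trace: three jobs over four files.
trace = {
    "job1": ["a", "b", "c"],
    "job2": ["a", "b"],
    "job3": ["c", "d"],
}
filecules = find_filecules(trace)
print(filecules)
```

In this toy trace, files a and b appear in the same jobs and end up in one group, while c and d each form a single-file group. A cache or replication service working at this granularity would fetch or place a and b as one unit.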
Index Terms:
data management service, high-energy physics, resource management, grid computing, large-scale data-intensive scientific analysis application, data caching, data replication, single-file granularity
Citation:
A. Iamnitchi, S. Doraimani, G. Garzoglio, "Filecules in High-Energy Physics: Characteristics and Impact on Resource Management," 2006 15th IEEE International Conference on High Performance Distributed Computing (HPDC), pp. 69-80, 2006