2006 International Conference on Parallel Processing (ICPP'06)
Using Space and Attribute Partitioned Partial Replicas for Data Subsetting and Aggregation Queries
Columbus, Ohio
August 14-18, 2006
ISBN: 0-7695-2636-5
Li Weng, The Ohio State University, USA
Umit Catalyurek, The Ohio State University, USA
Tahsin Kurc, The Ohio State University, USA
Gagan Agrawal, The Ohio State University, USA
Joel Saltz, The Ohio State University, USA
Partial replication is one type of optimization to speed up execution of queries submitted to large datasets. In partial replication, a portion of the dataset is extracted, re-organized, and re-distributed across the storage system. In this paper we investigate methods for efficient execution of queries when replicas of a dataset exist; we assume the replicas have already been created and do not target the replica creation problem. We propose a cost model and algorithm for combined use of space partitioned and attribute partitioned replicas for executing data subsetting range queries. We extend the cost model and propose a greedy algorithm to address range queries with aggregation operations. The extended replica selection algorithm allows uneven partitioning of replicas across storage nodes. Different replicas can be partitioned across different subsets of storage nodes. We have implemented these techniques as part of an automatic data virtualization system and have evaluated the benefits of our techniques using this system. We demonstrate the efficacy of the algorithms on parallel machines using queries on datasets from oil reservoir simulation studies and satellite data processing applications.
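The abstract does not give the cost model or the selection algorithm, so the following is only an illustrative sketch of the general idea of greedy replica selection for a range query, not the authors' method. It assumes a one-dimensional space, a hypothetical `Replica` record that holds a spatial range and a subset of attributes, and a made-up per-cell read cost. At each step it picks the replica with the best ratio of newly covered (cell, attribute) pairs to read cost, which is the standard greedy heuristic for weighted set cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical description of one partial replica (not from the paper)."""
    name: str
    lo: int                # inclusive start of the spatial range it stores
    hi: int                # exclusive end of the spatial range it stores
    attrs: frozenset       # attributes stored in this replica
    cost_per_cell: float   # assumed read cost per cell, must be > 0


def pairs(lo, hi, attrs):
    """All (cell, attribute) pairs in the range [lo, hi) for the given attributes."""
    return {(x, a) for x in range(lo, hi) for a in attrs}


def greedy_select(replicas, q_lo, q_hi, q_attrs):
    """Return replica names, in pick order, that together answer the query."""
    uncovered = pairs(q_lo, q_hi, q_attrs)
    plan = []
    while uncovered:
        best, best_ratio, best_cover = None, 0.0, set()
        for r in replicas:
            # Clip the replica to the query range.
            lo, hi = max(r.lo, q_lo), min(r.hi, q_hi)
            if lo >= hi:
                continue
            cover = uncovered & pairs(lo, hi, r.attrs & q_attrs)
            if not cover:
                continue
            # Simplification: reading the clipped range costs the same
            # no matter how much of it is already covered.
            ratio = len(cover) / ((hi - lo) * r.cost_per_cell)
            if ratio > best_ratio:
                best, best_ratio, best_cover = r, ratio, cover
        if best is None:
            raise ValueError("query cannot be answered from the given replicas")
        plan.append(best.name)
        uncovered -= best_cover
    return plan


replicas = [
    Replica("original", 0, 100, frozenset({"p", "t", "v"}), 3.0),
    Replica("attr_part", 0, 100, frozenset({"p", "t"}), 2.0),
    Replica("space_part", 0, 20, frozenset({"p", "t", "v"}), 1.0),
]
plan = greedy_select(replicas, 10, 40, frozenset({"p", "t"}))
print(plan)
```

In this toy setup the space partitioned replica is chosen first for the cells it holds, and the attribute partitioned replica fills in the remainder, so the original dataset is never read. A realistic cost model would also account for how replicas are distributed across storage nodes, which this sketch ignores.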
Citation:
Li Weng, Umit Catalyurek, Tahsin Kurc, Gagan Agrawal, Joel Saltz, "Using Space and Attribute Partitioned Partial Replicas for Data Subsetting and Aggregation Queries," 2006 International Conference on Parallel Processing (ICPP'06), pp. 271-280, 2006.