15th International Conference on Scientific and Statistical Database Management
Declustering Large Multidimensional Data Sets for Range Queries over Heterogeneous Disks
Cambridge, Massachusetts, USA
July 9-11, 2003
ISBN: 0-7695-1964-4
Jonghyun Lee, University of Illinois
Marianne Winslett, University of Illinois
Xiaosong Ma, University of Illinois
Shengke Yu, University of Illinois
Declustering is a technique to distribute data sets over multiple disks so that future retrievals can be well balanced over the disks and be performed in parallel. Although clusters often have heterogeneous disks, most declustering work has focused only on homogeneous environments. In this work, we investigate the declustering problem for a heterogeneous disk environment using virtual servers, and propose novel approaches for deciding the number of virtual servers and the mapping between virtual servers and physical disks. Our experimental results show that by combining our algorithm for choosing the number of virtual servers with a greedy algorithm for mapping virtual servers to disks, users can expect range query retrieval performance within 4% of the optimum achievable in practice on average, in all configurations studied. Compared to an intuitively natural approach to the problem, this represents an improvement of 8-31% in average fetch ratio, as well as a 26-38% reduction in the standard deviation of performance for small queries.
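The abstract does not spell out the greedy mapping, so the following is only an illustrative sketch of what a greedy assignment of virtual servers to heterogeneous disks might look like, not the authors' algorithm. It assumes that every virtual server holds an equal share of the data, that each disk is described by a single bandwidth figure, and that the goal is to minimize the slowest disk's retrieval time. The names `greedy_map` and `fetch_time` are hypothetical.

```python
def greedy_map(bandwidths, num_virtual_servers):
    """Assign equal-sized virtual servers to disks one at a time.

    Assumption (not from the paper): each virtual server goes to the disk
    whose retrieval time would be smallest after receiving it, where
    retrieval time is (virtual servers on the disk) / (disk bandwidth).
    """
    counts = [0] * len(bandwidths)   # virtual servers placed on each disk
    mapping = []                     # mapping[v] = disk index of virtual server v
    for _ in range(num_virtual_servers):
        best = min(
            range(len(bandwidths)),
            key=lambda d: (counts[d] + 1) / bandwidths[d],
        )
        counts[best] += 1
        mapping.append(best)
    return mapping, counts


def fetch_time(counts, bandwidths):
    """Relative time for a full parallel read: the slowest disk dominates."""
    return max(c / b for c, b in zip(counts, bandwidths))


if __name__ == "__main__":
    # Hypothetical disks with relative bandwidths 10, 20, and 30.
    bandwidths = [10, 20, 30]
    mapping, counts = greedy_map(bandwidths, 6)
    print(counts)                          # virtual servers per disk
    print(fetch_time(counts, bandwidths))  # relative parallel read time
```

With six virtual servers and bandwidths in the ratio 1:2:3, this sketch places one, two, and three virtual servers on the respective disks, so every disk finishes at the same time. With fewer virtual servers the shares can no longer match the bandwidth ratio exactly, which hints at why the choice of the number of virtual servers matters.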
Citation:
Jonghyun Lee, Marianne Winslett, Xiaosong Ma, Shengke Yu, "Declustering Large Multidimensional Data Sets for Range Queries over Heterogeneous Disks," ssdbm, pp.212, 15th International Conference on Scientific and Statistical Database Management, 2003