Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03)
Improving Access to Multi-dimensional Self-describing Scientific Datasets
Tokyo, Japan
May 12-15, 2003
ISBN: 0-7695-1919-9
Beomseok Nam, University of Maryland
Alan Sussman, University of Maryland
Applications that query very large multi-dimensional datasets are becoming more common. Many self-describing scientific data file formats have also emerged, which have structural metadata to help navigate the multi-dimensional arrays that are stored in the files. The files may also contain application-specific semantic metadata. In this paper, we discuss efficient methods for performing searches for subsets of multi-dimensional data objects, using semantic information to build multi-dimensional indexes, and grouping data items into properly sized chunks to maximize disk I/O bandwidth. This work is the first step in the design and implementation of a generic indexing library that will work with various high-dimension scientific data file formats containing semantic information about the stored data. To validate the approach, we have implemented indexing structures for NASA remote sensing data stored in the HDF format with a specific schema (HDF-EOS), and show the performance improvements that are gained from indexing the datasets, compared to using the existing HDF library for accessing the data.
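The general idea of chunking data items and indexing the chunks by their semantic coordinates can be illustrated with a minimal sketch. This is not the paper's implementation: the names `ChunkEntry`, `build_index`, and `range_query` are hypothetical, the data is held in memory rather than read from HDF-EOS files, and a linear scan over chunk bounding boxes stands in for a real multi-dimensional index such as a tree structure.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# A box is one (low, high) interval per dimension, e.g. (lat, lon, time).
Box = Tuple[Tuple[float, float], ...]


@dataclass
class ChunkEntry:
    bounds: Box   # minimum bounding box of the chunk's coordinates
    offset: int   # hypothetical position of the chunk's first item
    length: int   # number of items in the chunk


def bounding_box(points: Sequence[Sequence[float]]) -> Box:
    """Compute the per-dimension (min, max) of a non-empty set of points."""
    dims = len(points[0])
    return tuple(
        (min(p[d] for p in points), max(p[d] for p in points))
        for d in range(dims)
    )


def build_index(points: Sequence[Sequence[float]], chunk_size: int) -> List[ChunkEntry]:
    """Group consecutive items into fixed-size chunks and record each chunk's box."""
    entries = []
    for start in range(0, len(points), chunk_size):
        chunk = points[start:start + chunk_size]
        entries.append(ChunkEntry(bounding_box(chunk), start, len(chunk)))
    return entries


def overlaps(a: Box, b: Box) -> bool:
    """Two boxes overlap when their intervals intersect in every dimension."""
    return all(lo1 <= hi2 and lo2 <= hi1 for (lo1, hi1), (lo2, hi2) in zip(a, b))


def range_query(index: List[ChunkEntry], query: Box) -> List[ChunkEntry]:
    """Return the chunks whose bounding box intersects the query box (linear scan)."""
    return [entry for entry in index if overlaps(entry.bounds, query)]


# Ten synthetic 3-D points, grouped four per chunk.
points = [(float(i), float(i) * 2, float(i)) for i in range(10)]
index = build_index(points, chunk_size=4)
hits = range_query(index, ((5.0, 6.0), (0.0, 100.0), (0.0, 100.0)))
print([h.offset for h in hits])
```

Only the chunks returned by the query would need to be read from disk; the chunk size is the knob that trades index size against the amount of unneeded data fetched per read.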
Citation:
Beomseok Nam, Alan Sussman, "Improving Access to Multi-dimensional Self-describing Scientific Datasets," ccgrid, pp.172, Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03), 2003