IEEE Computer Society Bioinformatics Conference (CSB'03)
CoMRI: A Compressed Multi-Resolution Index Structure for Sequence Similarity Queries
Stanford, California
August 11-August 14
ISBN: 0-7695-2000-6
Hong Sun, Ohio State University
Ozgur Ozturk, Ohio State University
Hakan Ferhatosmanoglu, Ohio State University
In this paper, we present CoMRI (Compressed Multi-Resolution Index), our system for fast sequence similarity search in DNA sequence databases. We employ the Virtual Bounding Rectangle (VBR) concept to build a compressed, grid-style index structure. An advantage of the grid format over trees is that subsequence location information is given by the position of the corresponding VBR in the VBR list. Because VBRs are compact, the index structure fits easily into a reasonable amount of memory. Together with a new optimized multi-resolution search algorithm, this improves query speed significantly. Extensive performance evaluations on human chromosome sequence data show that VBRs reduce index storage size by 80%-93% compared to MBRs (Minimum Bounding Rectangles), and that the new search algorithm prunes almost all unnecessary VBRs, which keeps disk I/O and CPU costs low. In our experiments, CoMRI is at least 100 times faster than MRS, another recently introduced grid index structure.
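The grid idea in the abstract, where a rectangle's position in the list implies the location of the subsequences it covers, can be illustrated with a minimal Python sketch. This is not the CoMRI implementation, and it does not reproduce the VBR compression or the multi-resolution search, neither of which is specified here. It assumes, for illustration only, that each window is summarized by its A/C/G/T frequency vector, that plain bounding rectangles cover fixed-size runs of consecutive windows, and that a rectangle is pruned using the fact that one edit operation changes the L1 norm of a frequency vector by at most 2. All names (`build_box_list`, `surviving_boxes`, `box_size`, `max_edits`) are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"


def freq_vector(window):
    """Count occurrences of each nucleotide in the window."""
    counts = Counter(window)
    return tuple(counts.get(c, 0) for c in ALPHABET)


def build_box_list(sequence, w, box_size):
    """Bounding rectangles over runs of `box_size` consecutive windows of length `w`.

    No offsets are stored: the k-th rectangle covers the windows starting at
    offsets k * box_size .. k * box_size + box_size - 1 (assumed layout).
    """
    vectors = [freq_vector(sequence[i:i + w])
               for i in range(len(sequence) - w + 1)]
    boxes = []
    for start in range(0, len(vectors), box_size):
        group = vectors[start:start + box_size]
        lo = tuple(min(v[d] for v in group) for d in range(len(ALPHABET)))
        hi = tuple(max(v[d] for v in group) for d in range(len(ALPHABET)))
        boxes.append((lo, hi))
    return boxes


def min_l1_distance(point, box):
    """Smallest L1 distance from a point to any point inside the rectangle."""
    lo, hi = box
    return sum(max(l - p, 0, p - h) for p, l, h in zip(point, lo, hi))


def surviving_boxes(boxes, query, box_size, max_edits):
    """Return (box index, first window offset) for rectangles that cannot be pruned.

    One insertion, deletion, or substitution changes the frequency vector by
    at most 2 in L1 norm, so a rectangle farther than 2 * max_edits from the
    query vector cannot contain a window within max_edits edits of the query.
    """
    q = freq_vector(query)
    return [(k, k * box_size)
            for k, box in enumerate(boxes)
            if min_l1_distance(q, box) <= 2 * max_edits]


if __name__ == "__main__":
    sequence = "ACGTACGTTTTTTTTT"
    query = "ACGT"
    boxes = build_box_list(sequence, w=len(query), box_size=4)
    print(surviving_boxes(boxes, query, box_size=4, max_edits=1))
```

Only the windows inside the surviving rectangles would then need an exact comparison; their offsets follow from the list index alone, which is the property the abstract attributes to the grid format.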
Citation:
Hong Sun, Ozgur Ozturk, Hakan Ferhatosmanoglu, "CoMRI: A Compressed Multi-Resolution Index Structure for Sequence Similarity Queries," csb, pp.553, IEEE Computer Society Bioinformatics Conference (CSB'03), 2003