loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Data Compression Conference (dcc 2008)
Compressed Index for Dictionary Matching
March 25-March 27
ISBN: 978-0-7695-3121-2
The past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|H_k(T) + o(|T| log s) bits, where H_k(T) denotes the kth-order empirical entropy of T, and s is the size of the alphabet. In this paper, we study compressed representation of another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection D of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nH_k(D) + o(n log s))-bit index with only a small sacrifice in search time.
Index Terms:
Indexing, Dictionary Matching, Compression, Pattern Matching, Entropy
Citation:
Wing-Kai Hon, Tak-Wah Lam, Rahul Shah, Siu-Lung Tam, Jeffrey Scott Vitter, "Compressed Index for Dictionary Matching," dcc, pp.23-32, Data Compression Conference (dcc 2008), 2008
Usage of this product signifies your acceptance of the Terms of Use.