loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Data Compression Conference (DCC '99)
Linear Global Detectors of Redundant and Rare Substrings
Snowbird, Utah
March 29-March 31
ISBN: 0-7695-0096-X
Alberto Apostolico, Purdue University and University of Padova
Stefano Lonardi, Purdue University and University of Padova
Mary Ellen Bock, Purdue University
We show here that under several accepted measures of deviation from expected frequency, the candidates over- or underrepresented words are restricted to the O(n) words that end at internal nodes of a compact suffix tree, as opposed to the ?(n 2 ) possible substrings. This surprising fact is a consequence of properties in the form that if a word that ends in the middle of an arc is, say, overrepresented, then its extension to the nearest node of the tree is even more so. Based on this, we design global linear detectors of favored and unfavored words for our probabilistic framework, and display the results of some preliminary that apply our constructions to the analysis of genomic sequences.
Citation:
Alberto Apostolico, Stefano Lonardi, Mary Ellen Bock, "Linear Global Detectors of Redundant and Rare Substrings," dcc, pp.168, Data Compression Conference (DCC '99), 1999
Usage of this product signifies your acceptance of the Terms of Use.