loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
XII International Conference of the Chilean Computer Science Society (SCCC'02)
Improved Antidictionary Based Compression
Copiap?, Atacama, CHILE
November 06-November 08
ISBN: 0-7695-1867-2
Maxime Crochemore, King?s College
Gonzalo Navarro, University of Chile

The compression of binary texts using antidictionaries is a novel technique based on the fact that some substrings (called "antifactors") never appear in the text. Let sb be an antifactor, where b is its last bit. Every time s appears in the text we know that the next bit is b and hence omit its representation. Since building the set of all antifactors is space consuming at compression time, it is customary to limit the maximum length of antifactors considered up to a constant k. Larger k yields better compression of the text but requires more space at compression time.

In this paper we introduce the notion of almost antifactors, which are strings that rarely appear in the text. More formally, almost antifactors are strings that, if we consider them as antifactors and separately code their occurrences as exceptions, the compression ratio improves. We show that almost antifactors permit improving compression with a limited amount of main memory to compress. Our experiments show that they obtain the same compression of the classical algorithm using only 30% - 55% of its memory space.

Citation:
Maxime Crochemore, Gonzalo Navarro, "Improved Antidictionary Based Compression," sccc, pp.7, XII International Conference of the Chilean Computer Science Society (SCCC'02), 2002
Usage of this product signifies your acceptance of the Terms of Use.