Issue No.02 - March-April (1997 vol.9)
pp: 302-313
ABSTRACT
For compression of text databases, semi-static word-based methods provide good performance in terms of both speed and disk space, but two problems arise. First, the memory requirements for the compression model during decoding can be unacceptably high. Second, the need to handle document insertions means that the collection must be periodically recompressed if compression efficiency is to be maintained on dynamic collections. Here we show that with careful management the impact of both of these drawbacks can be kept small. Experiments with a word-based model and over 500 Mb of text show that excellent compression rates can be retained even in the presence of severe memory limitations on the decoder, and after significant expansion in the amount of stored text.
INDEX TERMS
Document databases, text compression, dynamic databases, word-based compression, Huffman coding.
CITATION
Alistair Moffat, Justin Zobel, Neil Sharman, "Text Compression for Dynamic Document Databases", IEEE Transactions on Knowledge & Data Engineering, vol. 9, no. 2, pp. 302-313, March-April 1997, doi:10.1109/69.591454