Data Compression Conference (DCC 2008), March 25-27, 2008. ISBN: 978-0-7695-3121-2
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/DCC.2008.78
In this paper we present several pre-processing techniques developed to help general-purpose compressors achieve better results in (lossless) text compression. The ability to create dictionaries "online" and to store them within the compressed file has proved attractive, resulting in a significant compression improvement. Moreover, this technique has the advantage of being language-independent for languages whose vocabulary is built upon the use of prefixes and suffixes. In our experiments, we achieved an improvement of almost 3% over existing techniques on a large (100 MB) file.
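The abstract does not give the details of the dictionary scheme, so what follows is only a minimal sketch of the general idea, not the authors' method: a word dictionary is built from the input itself, stored ahead of the transformed text, and the result is handed to a general-purpose compressor. The control characters SEP and END, the thresholds min_count and min_len, the function names, and the choice of zlib as the compressor are all assumptions made for illustration.

```python
import re
import zlib
from collections import Counter

SEP = "\x01"  # hypothetical delimiter: separates dictionary entries, brackets indices
END = "\x02"  # hypothetical marker: ends the stored dictionary

_WORD = re.compile(r"[A-Za-z]+")
_REF = re.compile(SEP + r"(\d+)" + SEP)


def encode(text: str, min_count: int = 2, min_len: int = 4) -> str:
    """Replace repeated words by indices into a dictionary stored up front."""
    if SEP in text or END in text:
        raise ValueError("input contains a reserved control character")
    counts = Counter(_WORD.findall(text))
    # The dictionary comes from the input itself ("online"), most frequent first.
    words = [w for w, c in counts.most_common()
             if c >= min_count and len(w) >= min_len]
    index = {w: i for i, w in enumerate(words)}

    def replace(match):
        word = match.group(0)
        return f"{SEP}{index[word]}{SEP}" if word in index else word

    # Output layout (an assumption): dictionary, END marker, transformed text.
    return SEP.join(words) + END + _WORD.sub(replace, text)


def decode(data: str) -> str:
    """Invert encode() using the dictionary stored at the start of the data."""
    header, body = data.split(END, 1)
    words = header.split(SEP) if header else []
    return _REF.sub(lambda m: words[int(m.group(1))], body)


def compress(text: str) -> bytes:
    # zlib stands in for "a general-purpose compressor" here.
    return zlib.compress(encode(text).encode("utf-8"))


def decompress(blob: bytes) -> str:
    return decode(zlib.decompress(blob).decode("utf-8"))
```

Because the dictionary travels with the data, decompress needs no external word list. Whether such a transform actually improves the compressed size depends on the input and on the compressor, and this sketch makes no claim on that point.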
Index Terms:
text pre-processing, lossless compression, capital conversion, dictionary
Citation:
Luís Batista, Luís A. Alexandre, "Text Pre-processing for Lossless Compression," dcc, p. 506, Data Compression Conference (dcc 2008), 2008