
Bernhard Balkenhol, Stefan Kurtz, “Universal Data Compression Based on the Burrows-Wheeler Transformation: Theory and Practice,” IEEE Transactions on Computers, vol. 49, no. 10, pp. 1043-1053, October 2000. IEEE Computer Society, Los Alamitos, CA, USA. ISSN 0018-9340. doi: 10.1109/12.888040.
Keywords: Lossless data compression, Burrows-Wheeler Transformation, context trees, suffix trees.
Abstract: A very interesting recent development in data compression is the Burrows-Wheeler Transformation [1]. The idea is to permute the input sequence in such a way that characters with a similar context are grouped together. We provide a thorough analysis of the Burrows-Wheeler Transformation from an information-theoretic point of view. Based on this analysis, the main part of the paper systematically considers techniques to efficiently implement a practical data compression program based on the transformation. We show that our program achieves a better compression rate than other programs that have similar requirements in space and time.
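The permutation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it sorts all rotations directly, which takes quadratic space, whereas the paper's keywords point to suffix trees for an efficient construction. The function names `bwt` and `inverse_bwt` and the sentinel convention are assumptions made for this illustration only.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    Assumption for this sketch: `sentinel` is a single character that does
    not occur in `text`; it marks the end of the input so that the
    transform can be inverted.
    """
    if len(sentinel) != 1 or sentinel in text:
        raise ValueError("sentinel must be one character not present in text")
    s = text + sentinel
    # Build every rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # Characters that precede similar contexts end up next to each other.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "\0") -> str:
    """Invert the transform by repeatedly prepending and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepending the last column and sorting rebuilds the rotation
        # table one column at a time.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The rotation that ends with the sentinel is the original input.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana", sentinel="$")
    print(transformed)                              # annb$aa
    print(inverse_bwt(transformed, sentinel="$"))   # banana
```

In the output for "banana", the two n characters that precede "a" contexts are adjacent, which is the grouping effect that later coding stages exploit.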
[1] M. Burrows and D. Wheeler, “A Block-Sorting Lossless Data Compression Algorithm,” Research Report 124, Digital Systems Research Center, 1994. http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-124.html.
[2] F. Willems, Y. Shtarkov, and T. Tjalkens, “The Context-Tree Weighting Method: Basic Properties,” IEEE Trans. Information Theory, vol. 41, pp. 653-664, 1995.
[3] Y. Shtarkov, “Universal Sequential Coding of Single Messages,” Problems Information Transmission, vol. 23, no. 3, pp. 3-17, 1987.
[4] Y. Shtarkov, T. Tjalkens, and F. Willems, “Multialphabet Coding of Memoryless Sources,” Problems Information Transmission, vol. 31, no. 2, pp. 20-35, 1995.
[5] T.C. Bell, J.G. Cleary, and I.H. Witten, Text Compression. Englewood Cliffs, N.J.: Prentice Hall, 1990.
[6] R. Arnold and T. Bell, “A Corpus for the Evaluation of Lossless Compression Algorithms,” Proc. Data Compression Conf., pp. 201-210, Mar. 1997.
[7] J. Gailly, “The gzip Program, Version 1.2.4,” 1993. ftp://prep.ai.mit.edu/pub/gnu/gzip-1.2.4.tar.gz.
[8] B. Balkenhol and S. Kurtz, “Universal Data Compression Based on the Burrows and Wheeler Transformation: Theory and Practice,” technical report, Sonderforschungsbereich: Diskrete Strukturen in der Mathematik, Universität Bielefeld, 98-069, 1998. http://www.mathematik.uni-bielefeld.de/sfb343/preprints/.
[9] J.G. Cleary, R.M. Neal, and I.H. Witten, “Arithmetic Coding for Data Compression,” Comm. ACM, vol. 30, no. 6, pp. 520-540, June 1987.
[10] R. Krichevsky and V. Trofimov, “The Performance of Universal Encoding,” IEEE Trans. Information Theory, vol. 27, pp. 199-207, 1981.
[11] J. Cleary, W. Teahan, and I. Witten, “Unbounded Length Contexts for PPM,” Proc. IEEE Data Compression Conf., pp. 52-61, 1995.
[12] P. Weiner, “Linear Pattern Matching Algorithms,” Proc. 14th IEEE Ann. Symp. Switching and Automata Theory, pp. 1-11, 1973.
[13] E.M. McCreight, “A Space Economical Suffix Tree Construction Algorithm,” J. ACM, vol. 23, no. 2, pp. 262-272, 1976.
[14] E. Ukkonen, “On-Line Construction of Suffix Trees,” Algorithmica, vol. 14, no. 3, 1995.
[15] M. Farach, “Optimal Suffix Tree Construction with Large Alphabets,” Proc. 38th Ann. Symp. Foundations of Computer Science, FOCS 97, 1997.
[16] U. Manber and E. Myers, “Suffix Arrays: A New Method for On-Line String Searches,” SIAM J. Computing, vol. 22, no. 5, pp. 935-948, 1993.
[17] K. Sadakane, “A Fast Algorithm for Making Suffix Arrays and for Burrows-Wheeler Transformation,” Proc. IEEE Data Compression Conf., pp. 129-138, 1998.
[18] S. Kurtz, “Reducing the Space Requirement of Suffix Trees,” Software: Practice and Experience, vol. 29, no. 13, pp. 1,149-1,171, 1999.
[19] R. Giegerich and S. Kurtz, “From Ukkonen to McCreight and Weiner: A Unifying View of Linear-Time Suffix Tree Construction,” Algorithmica, vol. 19, pp. 331-353, 1997.
[20] R. Giegerich and S. Kurtz, “A Comparison of Imperative and Purely Functional Suffix Tree Constructions,” Science of Computer Programming, vol. 25, nos. 2-3, pp. 187-218, 1995.
[21] R. Irving, “Suffix Binary Search Trees,” research report, Dept. of Computer Science, Univ. of Glasgow, 1996. http://www.dcs.gla.ac.uk/rwi/paperssbst.ps .
[22] M. Crochemore and R. Vérin, “Direct Construction of Compact Acyclic Word Graphs,” Proc. Ann. Symp. Combinatorial Pattern Matching (CPM '97), pp. 116-129, 1997.
[23] N. Larsson, “The Context Trees of Block Sorting Compression,” Proc. IEEE Data Compression Conf., pp. 189-198, 1998.
[24] B. Ryabko, “Data Compression by Means of a Book Stack,” Problems Information Transmission, vol. 16, no. 4, pp. 16-21, 1980.
[25] R. Ahlswede, T. Han, and K. Kobayashi, “Universal Coding of Integers and Unbounded Search Trees,” IEEE Trans. Information Theory, vol. 43, no. 2, pp. 669-682, 1997.
[26] Q. Stout, “Improved Prefix Encodings of Natural Numbers,” IEEE Trans. Information Theory, vol. 26, pp. 607-609, 1980.
[27] J. Rissanen, “A Universal Prior for Integers and Estimation by Minimum Description Length,” Annals of Statistics, vol. 11, pp. 416-431, 1983.
[28] V. Levenshtein, “On the Redundancy and Delay of Decodable Coding of Natural Numbers,” Problems in Cybernetics, vol. 20, pp. 173-179, 1968 (in Russian).
[29] P. Elias, “Universal Codeword Sets and Representation of Integers,” IEEE Trans. Information Theory, vol. 21, pp. 194-203, 1975.
[30] P. Fenwick, “Block Sorting Text Compression - Final Report,” Technical Report 130, Dept. of Computer Science, Univ. of Auckland, 1996. http://www.cs.auckland.ac.nz/peterf/ftplink TechRep130.ps.
[31] G. Cormack and R. Horspool, “Data Compression Using Dynamic Markov Modelling,” Computer J., vol. 30, pp. 541-550, 1987.
[32] A. Moffat, “Implementing the PPM Data Compression Scheme,” IEEE Trans. Comm., vol. 28, no. 11, pp. 1,917-1,921, 1990.
[33] J. Seward, “The bzip2 Program, vers. 0.1pl2,” 1997. http://www.muraroa.demon.co.uk.
[34] M. Schindler, “The szip Homepage,” 1998. http://www.compressconsult.com/szip/.
[35] J. Ziv and A. Lempel, “Compression of Individual Sequences via Variable-Rate Coding,” IEEE Trans. Information Theory, vol. 24, no. 5, pp. 530-536, 1978.
[36] J. Ziv and A. Lempel, “A Universal Algorithm for Sequential Data Compression,” IEEE Trans. Information Theory, vol. 23, no. 3, pp. 337-343, 1977.