Fourth International Conference Document Analysis and Recognition (ICDAR'97)
Dynamic word based text compression
Ulm, GERMANY
August 18-August 20
ISBN: 0-8186-7898-4
K.S. Ng, Dept. of Electron. Eng., City Univ. of Hong Kong, Kowloon, Hong Kong
L.M. Cheng, Dept. of Electron. Eng., City Univ. of Hong Kong, Kowloon, Hong Kong
C.H. Wong, Dept. of Electron. Eng., City Univ. of Hong Kong, Kowloon, Hong Kong
We propose a dynamic text compression technique with a back searching algorithm and a new storage protocol. The codes produced by the encoder are divided into three types: copy, literal, and hybrid codes. Multiple dictionaries are used, each with a linked sub-dictionary. Part of each dictionary holds predefined words, i.e. the most frequent words, and the remaining entries depend on the message. A hashing function developed by Pearson (1990) is adopted for two purposes: to initialize the dictionary, and to search quickly for a particular word. With this scheme, the spaces between words do not need to be encoded. On the decoding side, a space character is appended after each decoded word, so the redundancy of spaces is also compressed. The results show that the original message is not expanded even with a poor dictionary design.
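As an illustration only, the following Python sketch shows how a Pearson-style 8-bit hash could serve both purposes named in the abstract (placing predefined words in a dictionary and finding a word quickly), and how a decoder could restore spaces by appending one after each word. It is not the authors' implementation: the permutation table, the 256-bucket layout, and the names `TABLE`, `pearson_hash`, `build_dictionary`, `lookup`, and `decode_words` are all assumptions made for this example, and the copy, literal, and hybrid codes and the linked sub-dictionaries are not modeled.

```python
# Assumed permutation table: Pearson hashing works with any permutation of
# 0..255. This affine one is a stand-in, not the table from Pearson (1990).
TABLE = [(i * 167 + 13) % 256 for i in range(256)]


def pearson_hash(word: str) -> int:
    """Return an 8-bit Pearson hash of the word's UTF-8 bytes."""
    h = 0
    for byte in word.encode("utf-8"):
        h = TABLE[h ^ byte]
    return h


def build_dictionary(words):
    """Initialize a dictionary: each word goes in the bucket its hash selects."""
    buckets = [[] for _ in range(256)]
    for w in words:
        bucket = buckets[pearson_hash(w)]
        if w not in bucket:  # skip duplicates
            bucket.append(w)
    return buckets


def lookup(buckets, word):
    """Return (bucket index, position) if the word is stored, else None."""
    h = pearson_hash(word)
    bucket = buckets[h]
    if word in bucket:
        return h, bucket.index(word)
    return None


def decode_words(words):
    """Append one space after each decoded word, so spaces are never stored."""
    return "".join(w + " " for w in words)


if __name__ == "__main__":
    # Hypothetical predefined (frequent) words.
    buckets = build_dictionary(["the", "of", "and", "to"])
    print(lookup(buckets, "of"))      # a (bucket, position) pair
    print(lookup(buckets, "zebra"))   # None: not in the dictionary
    print(repr(decode_words(["dynamic", "text", "compression"])))
```

Because the hash selects a single bucket, a lookup scans only the words that collide in that bucket rather than the whole dictionary. Note that `decode_words` leaves a trailing space after the last word; how the original scheme treats the end of the message is not stated in the abstract.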
Index Terms:
data compression; dynamic word based text compression; back searching algorithm; storage protocol; dictionaries; encoding; copy codes; literal codes; hybrid codes; hashing function; decoding; space character; redundancy; message
Citation:
K.S. Ng, L.M. Cheng, C.H. Wong, "Dynamic word based text compression," icdar, pp.412, Fourth International Conference Document Analysis and Recognition (ICDAR'97), 1997