Data Compression Conference (DCC '04) Compressed Index for Dynamic Text Snowbird, Utah March 23-March 25 ISBN: 0-7695-2082-0
This paper investigates how to index a text which is subject to updates. The best solution in the literature is based on suffix tree using O(n log n) bits of storage, where n is the length of the text. It supports finding all occurrences of a pattern P in O(|P| + occ) time, where occ is the number of occurrences. Each text update consists of inserting or deleting a substring of length y and can be supported in O(y + \sqrt n) time. In this paper, we initiate the study of compressed index using only O(n log |\Sigma|) bits of space, where \Sigma denotes the alphabet. Our solution supports finding all occurrences of a pattern P in O(|P| log2 n(log\epsilon n + log |\Sigma|) + occlog1+\epsilon n) time, while insertion or deletion of a substring of length y can be done in O((y + \sqrt n) log2+\epsilon n) amortized time, where 0 \lt \epsilon \le 1. The core part of our data structure is based on the recent work on Compressed Suffix Trees (CST) and Compressed Suffix Arrays (CSA).
Citation:
Wing-Kai Hon, Tak-Wah Lam, Kunihiko Sadakane, Wing-Kin Sung, Siu-Ming Yiu, "Compressed Index for Dynamic Text," dcc, pp.102, Data Compression Conference (DCC '04), 2004 Usage of this product signifies your acceptance of the Terms of Use. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||