String Processing and Information Retrieval, International Symposium on (1999)
Sept. 21, 1999 to Sept. 24, 1999
Eljas Soisalon-Soininen, Helsinki University of Technology
Peter Widmayer, Institut für Theoretische Informatik
An important feature of a document database system is that documents can be retrieved by searching for words from their contents. In a full-text index, each word of the stored documents can be used as a search key. Inserting a new document into the database automatically triggers a transaction that inserts the words, together with their occurrence information, into the index. In this paper, we present solutions to problems that arise when full-text indexing is applied to constantly changing document data, such as WWW pages. We present and analyze an algorithm for full-text indexing with the following properties: concurrent searches are possible and efficient, and the algorithm can be designed such that several indexing processes can be performed concurrently. Moreover, the algorithm allows efficient recovery of the index after failures that occur while the index is being modified. This is important for large indices because, if the system is not prepared for failures, the index may need to be reconstructed from the original documents.
E. Soisalon-Soininen and P. Widmayer, "Concurrency and Recovery in Full-Text Indexing," String Processing and Information Retrieval, International Symposium on (SPIRE), Cancun, Mexico, 1999, p. 192.