Issue No.03 - June (1995 vol.7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/69.390248
<p><it>Abstract</it>—Signature files have been studied extensively as an access method for textual databases. Many approaches have been proposed for searching signature files efficiently. However, different methods make different assumptions and use different performance measures, making it difficult to compare their performance. In this paper, we study three basic methods proposed in the literature, namely, the <it>indexed descriptor file</it>, the <it>two-level superimposed coding scheme</it>, and the <it>partitioned signature file approach</it>. The contribution of this paper is twofold. First, we present a uniform analytical performance model so that the methods can be compared fairly and consistently. The analysis shows that the two-level superimposed coding scheme, if stored in a transposed file, has the best performance. Second, we extend the two-level superimposed coding method into a <it>multilevel superimposed coding</it> method. We obtain the optimal number of levels for the multilevel method and show that, for databases of reasonable size, the optimal value is much larger than 2, the value assumed in the two-level method. The accuracy of the analytical formula is demonstrated by simulation.</p>
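For readers unfamiliar with the term, the following is a minimal single-level sketch of superimposed coding, the general idea on which the signature methods named in the abstract build. It is not the paper's two-level or multilevel scheme. The function names, the 64-bit signature width, the three bits set per word, and the use of SHA-256 as the hash are all illustrative assumptions, not values taken from the paper.

```python
import hashlib

def word_signature(word, width=64, bits_per_word=3):
    """Hash a word to a bit pattern with at most bits_per_word bits set.

    width and bits_per_word are illustrative defaults, not tuned values.
    """
    sig = 0
    for i in range(bits_per_word):
        # Derive the i-th bit position from a salted hash of the word.
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode("utf-8")).digest()
        pos = int.from_bytes(digest[:8], "big") % width
        sig |= 1 << pos
    return sig

def block_signature(words, width=64, bits_per_word=3):
    """Superimpose (bitwise OR) the word signatures of one text block."""
    sig = 0
    for w in words:
        sig |= word_signature(w, width, bits_per_word)
    return sig

def may_contain(block_sig, query_sig):
    """True if every bit of the query signature is set in the block signature.

    A True result can be a false match (false drop), so the block text
    still has to be checked; a False result is always correct.
    """
    return (block_sig & query_sig) == query_sig

block = block_signature(["signature", "files", "text", "retrieval"])
print(may_contain(block, word_signature("retrieval")))  # a word in the block always matches
```

Because a match only means the block might contain the word, the search cost depends heavily on how many signatures must be examined, which is the kind of cost the compared methods aim to reduce.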
Access methods, text retrieval, performance analysis, superimposed coding.
Dik Lun Lee, Young Man Kim, Gaurav Patel, "Efficient Signature File Methods for Text Retrieval", IEEE Transactions on Knowledge & Data Engineering, vol.7, no. 3, pp. 423-435, June 1995, doi:10.1109/69.390248