C.Y. Chen, H.F. Lin, C.C. Chang, and R.C.T. Lee, "Optimal Bucket Allocation Design of k-ary MKH Files for Partial Match Retrieval," IEEE Transactions on Knowledge and Data Engineering, vol. 9, no. 1, pp. 148-160, January-February 1997. IEEE Computer Society, Los Alamitos, CA, USA. ISSN 1041-4347. doi: http://doi.ieeecomputersociety.org/10.1109/69.567057

Index Terms: Multidisk file design, bucket allocation problem, multiple key hashing files, partial match queries, optimal performances.
Abstract: This paper first shows that the bucket allocation problem of an MKH (Multiple Key Hashing) file for partial match retrieval can be reduced to that of a smaller subfile, called the remainder of the file. We also point out that remainder-type MKH files are the hardest MKH files for which to design an optimal allocation scheme. We then concentrate on the allocation of one important remainder-type MKH file, namely the k-ary MKH file. We present several sufficient conditions, stated in terms of the number of available disks and the number of attributes, under which a k-ary MKH file has a perfectly optimal allocation among the disks for partial match queries. Building on these perfectly optimal allocations, we present a heuristic method, called the CH (Cyclic Hashing) method, that produces near-optimal allocations for general k-ary MKH files. Finally, an experimental comparison between the proposed method and an "ideal" perfectly optimal method shows that the CH method performs satisfactorily on general k-ary MKH files.
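To make the notion of a perfectly optimal allocation concrete, the following minimal sketch checks an allocation of a k-ary MKH file against every partial match query. It rests on assumptions not stated in the abstract: "perfectly optimal" is read as the usual strict-optimality criterion (no disk serves more than ceil(|R(q)|/m) of the |R(q)| buckets qualifying for query q, with m disks), and the allocation tested is the well-known Disk Modulo scheme, used here only as a stand-in. It is not the CH method, whose details the abstract does not give. All function names are hypothetical.

```python
from itertools import product
from math import ceil


def disk_modulo(bucket, m):
    """Stand-in allocation (Disk Modulo, NOT the paper's CH method):
    send a bucket, given as a tuple of hashed attribute values, to
    disk (sum of values) mod m."""
    return sum(bucket) % m


def max_disk_load(query, k, m, allocate):
    """Largest number of qualifying buckets any single disk must serve.

    query holds an int for each specified attribute and None for each
    unspecified one; every attribute ranges over 0..k-1 (k-ary file).
    """
    domains = [range(k) if v is None else (v,) for v in query]
    load = [0] * m
    for bucket in product(*domains):
        load[allocate(bucket, m)] += 1
    return max(load)


def is_perfectly_optimal(n, k, m, allocate):
    """Assumed criterion: for every partial match query q on n attributes,
    the busiest disk serves at most ceil(|R(q)| / m) buckets."""
    for query in product([None, *range(k)], repeat=n):
        unspecified = sum(v is None for v in query)
        bound = ceil(k ** unspecified / m)
        if max_disk_load(query, k, m, allocate) > bound:
            return False
    return True


if __name__ == "__main__":
    print(is_perfectly_optimal(2, 2, 2, disk_modulo))
    print(is_perfectly_optimal(3, 2, 4, disk_modulo))
```

Under these assumptions, the stand-in scheme meets the bound for two binary attributes on two disks but not for three binary attributes on four disks: the query that leaves all three attributes unspecified places three of its eight buckets on one disk, while the bound is two. Cases like this illustrate why sufficient conditions on the number of disks and attributes matter.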

