The Optimality of Allocation Methods for Bounded Disagreement Search Queries: The Possible and the Impossible
September 2006 (vol. 18 no. 9)
pp. 1194-1206
Data allocation on multiple I/O devices arises in many computing systems, both centralized and distributed. Data is partitioned across multiple I/O devices, and clients issue various types of queries to retrieve relevant information. In this paper, we derive necessary and sufficient conditions for a data allocation method to be optimal for two important types of queries: partial match and bounded disagreement search queries. We formally define these query types and derive the optimality conditions based on coding-theoretic arguments. Although these conditions are fairly strict, we show how to construct good allocation methods for realistic practical situations. Not only are the response times bounded by a small value, but the identification of the relevant answer set is also efficient.
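To make the two query types and the notion of optimality concrete, the following is a minimal sketch rather than the paper's construction. It assumes the usual declustering setting: a bucket is a tuple of attribute values, a partial match query fixes some attributes and leaves the others unspecified, a bounded disagreement search query asks for every bucket that differs from a given bucket in at most a given number of attributes, and an allocation is optimal for a query when no device holds more than ceil(q/M) of its q qualifying buckets across M devices. The allocation rule `disk_modulo` and all function names are illustrative stand-ins, not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import ceil


def disk_modulo(bucket, num_devices):
    """Illustrative allocation rule: sum of attribute values modulo the device count."""
    return sum(bucket) % num_devices


def partial_match(query, domain_sizes):
    """Buckets matching a partial match query; None marks an unspecified attribute."""
    ranges = [range(n) if q is None else [q] for q, n in zip(query, domain_sizes)]
    return list(product(*ranges))


def bounded_disagreement(center, bound, domain_sizes):
    """Buckets that differ from `center` in at most `bound` attributes (assumed definition)."""
    all_buckets = product(*(range(n) for n in domain_sizes))
    return [b for b in all_buckets
            if sum(x != y for x, y in zip(b, center)) <= bound]


def response_time(buckets, num_devices, allocate=disk_modulo):
    """Largest number of qualifying buckets that any single device must serve."""
    load = Counter(allocate(b, num_devices) for b in buckets)
    return max(load.values())


def is_optimal_for(buckets, num_devices, allocate=disk_modulo):
    """True when the response time meets the lower bound ceil(q / M)."""
    return response_time(buckets, num_devices, allocate) == ceil(len(buckets) / num_devices)


# Three binary attributes, two devices.
domains, devices = (2, 2, 2), 2
pm = partial_match((None, None, 0), domains)       # 4 qualifying buckets
bd = bounded_disagreement((0, 0, 0), 1, domains)   # 4 qualifying buckets
print(response_time(pm, devices), is_optimal_for(pm, devices))  # 2 True
print(response_time(bd, devices), is_optimal_for(bd, devices))  # 3 False
```

In this toy instance the same allocation spreads the partial match query evenly over both devices, but places three of the four buckets of the bounded disagreement query on one device, which suggests why the two query types call for separate optimality conditions.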

Index Terms:
Access methods, file organization, maintenance, organization/structure, information theory, retrieval models, Cartesian product files, multiple disk systems, coding theory.
Khaled A.S. Abdel-Ghaffar, Amr El Abbadi, "The Optimality of Allocation Methods for Bounded Disagreement Search Queries: The Possible and the Impossible," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 9, pp. 1194-1206, Sept. 2006, doi:10.1109/TKDE.2006.149