A Combined Method for Maintaining Large Indices in Multiprocessor Multidisk Environments
June 1994 (vol. 6 no. 3)
pp. 479-496

Consider the problem of maintaining large indices (or secondary memory indices) in a multiprocessor multidisk environment in which each processor has a dedicated secondary memory (one disk or more). The processors either reside in the same site and communicate via shared memory, or reside in different sites and communicate via a local broadcast network. The straightforward method (SFM) for maintaining such an index, which is commonly called declustering, is to partition the index records equally among the processors, each of which maintains its part of the index in a local B+-tree. In prior work (Inform. Processing Lett., vol. 34, pp. 313-321, May 1990), we have presented another method, called the "totally distributed B+-tree" (TDB) method, in which all processors together implement a "wide" B+-tree. There are settings in which the second method is better than the first method, and vice versa. In this paper, we present a new method, called the combined distribution method (CDM), that combines the ideas underlying SFM and TDB. In tightly coupled environments, CDM outperforms both SFM and TDB in almost all practical settings (in many settings by more than 30%). This is shown by an approximate analysis and verified by simulations. Note that CDM's approach can improve performance in database systems that use a RAID (redundant array of inexpensive disks).
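
As an illustration of the declustering idea behind SFM only, the following is a minimal sketch and not the authors' implementation. It assumes integer keys and a modulo placement rule, and it uses a sorted Python list as a stand-in for each processor's local B+-tree; the names `LocalIndex` and `DeclusteredIndex` are hypothetical. The abstract does not specify how records are assigned to processors, so the placement rule here is an assumption chosen to give equal-sized partitions.

```python
import bisect


class LocalIndex:
    """Stand-in for one processor's local B+-tree (assumption: a sorted list)."""

    def __init__(self):
        self.keys = []

    def insert(self, key):
        # Keep keys sorted and unique, as an index over distinct keys would.
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            self.keys.insert(i, key)

    def contains(self, key):
        i = bisect.bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key


class DeclusteredIndex:
    """Index records partitioned among processors, one local index each."""

    def __init__(self, num_processors):
        self.parts = [LocalIndex() for _ in range(num_processors)]

    def _owner(self, key):
        # Assumption: modulo placement of integer keys, not taken from the paper.
        return key % len(self.parts)

    def insert(self, key):
        self.parts[self._owner(key)].insert(key)

    def contains(self, key):
        # An exact-match lookup touches only the owning processor.
        return self.parts[self._owner(key)].contains(key)

    def range_query(self, lo, hi):
        # Any partition may hold keys in [lo, hi], so every one is consulted.
        result = []
        for part in self.parts:
            start = bisect.bisect_left(part.keys, lo)
            end = bisect.bisect_right(part.keys, hi)
            result.extend(part.keys[start:end])
        return sorted(result)


idx = DeclusteredIndex(num_processors=4)
for k in range(100):
    idx.insert(k)
```

Under this placement rule an exact-match search involves a single processor, while a range search involves all of them. The sketch says nothing about how TDB or CDM organize the index.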

[1] R. Baeza-Yates and P.-A. Larson, "Performance of B+-trees with partial expansions," IEEE Trans. Knowl. Data Eng., vol. 1, pp. 248-257, June 1989.
[2] D. Bitton et al., "Parallel algorithms for the execution of relational database operations," ACM Trans. Database Syst., vol. 8, no. 3, pp. 324-353, Sept. 1983.
[3] R. Bayer and E. McCreight, "Organization and maintenance of large ordered indexes," Acta Informatica, vol. 1, pp. 173-189, 1972.
[4] G. Copeland, W. Alexander, E. Boughter, and T. Keller, "Data placement in Bubba," in Proc. ACM SIGMOD, Chicago, IL, June 1-3, 1988, pp. 99-109.
[5] D. J. DeWitt, S. Ghandeharizadeh, D. A. Schneider, H. Hsiao, and R. Rasmussen, "The Gamma database machine project," IEEE Trans. Knowl. Data Eng., vol. 2, pp. 44-62, Mar. 1990.
[6] T. Johnson and D. Shasha, "Utilization of B-trees with inserts, deletes and modifies," in Proc. ACM-PODS Conf., 1989, pp. 235-246.
[7] L. Kleinrock, Queueing Systems, vol. 1. New York: Wiley, 1975.
[8] W. Litwin and D. B. Lomet, "A new method for fast data search with keys," IEEE Software, pp. 16-24, Mar. 1987.
[9] D. B. Lomet, "A simple bounded disorder file organization with good performance," ACM Trans. Database Syst., vol. 13, no. 4, pp. 525-551, Dec. 1988.
[10] J. Martin, VSAM: Access Method Services and Programming Techniques. Englewood Cliffs, NJ: Prentice-Hall, 1987.
[11] G. Matsliach and O. Shmueli, "Distributing a B+-tree in a loosely coupled environment," Inform. Processing Lett., vol. 34, pp. 313-321, May 1990.
[12] G. Matsliach and O. Shmueli, "Methods for distributing a B+-tree in a loosely coupled environment," Tech. Rep. 594, Technion--Israel Inst. of Technol., Dept. of Comput. Sci., 1989.
[13] G. Matsliach and O. Shmueli, "Maintaining bounded disorder files in multiprocessor multi-disk environments," in Proc. 3rd Int. Conf. Database Theory (ICDT), reprinted in S. Abiteboul and P. C. Kanellakis, Eds., Lecture Notes in Computer Science 470, May 1991.
[14] G. Matsliach and O. Shmueli, "An efficient method for distributing search structures," in Proc. Int. Conf. Parallel and Distrib. Inform. Syst. (PDIS), Dec. 1991.
[15] D. A. Patterson, G. Gibson, and R. H. Katz, "A case for redundant arrays of inexpensive disks (RAID)," in Proc. ACM SIGMOD, Chicago, IL, June 1-3, 1988, pp. 109-116.
[16] Y. Sagiv, "Concurrent operations on B+-trees with overtaking," J. Comput. Syst. Sci., vol. 33, no. 2, pp. 275-296, Oct. 1986.
[17] D. Shasha, "Concurrent algorithms for search structures," Tech. Rep. 12-84, Harvard Univ., Div. of Applied Sci., June 1984.
[18] P. D. Smith and G. M. Barnes, Files and Databases: An Introduction. Reading, MA: Addison-Wesley, 1987.
[19] C. Stanfill, "Parallel computing for information retrieval: Recent developments," Tech. Rep. DR88-1, Thinking Machines Corp., 1988.
[20] Teradata Corp., "DBC/1012 database computer system manual release 1.3," Tech. Rep. C10-0001-01, Teradata Corp., Los Angeles, CA, 1985.
[21] J. D. Ullman, Principles of Database Systems. Rockville, MD: Computer Science Press, 1982.
[22] A. C. Yao, "On random 2-3 trees," Acta Informatica, vol. 9, no. 2, pp. 159-170, 1978.

Index Terms:
multiprocessing systems; indexing; tree data structures; magnetic disc storage; performance evaluation; distributed databases; large index maintenance; multiprocessor multidisk environments; secondary memory indices; dedicated secondary memory; shared memory; local broadcast network; straightforward method; declustering; index record partitioning; local B+-tree; totally distributed B+-tree method; combined distribution method; tightly coupled environments; performance; approximate analysis; simulation; database systems; RAID; redundant array of inexpensive disks; data structures; distributed file systems; distributed indices
Citation:
G. Matsliach, O. Shmueli, "A Combined Method for Maintaining Large Indices in Multiprocessor Multidisk Environments," IEEE Transactions on Knowledge and Data Engineering, vol. 6, no. 3, pp. 479-496, June 1994, doi:10.1109/69.334867