Hashing Methods for Temporal Data
July/August 2002 (vol. 14 no. 4)
pp. 902-919

External dynamic hashing has been used in traditional database systems as a fast method for answering membership queries. Given a dynamic set S of objects, a membership query asks whether an object with identity k is in (the most current state of) S. This paper addresses the more general problem of temporal hashing. In this setting, changes to the dynamic set are timestamped and the membership query has a temporal predicate, as in: "Find whether the object with identity k was in set S at time t." We present an efficient solution to this problem that takes an ephemeral hashing scheme and makes it partially persistent. Our solution, termed partially persistent hashing, uses space linear in the total number of changes in the evolution of set S and has a small, $O(\log_B(n/B))$, query overhead. An experimental comparison of partially persistent hashing with various straightforward approaches (external linear hashing, the Multiversion B-Tree, and the R*-tree) shows that it provides the fastest membership query response time. Partially persistent hashing should be seen as an extension of traditional external dynamic hashing to a temporal environment. It is independent of the ephemeral dynamic hashing scheme used; while the paper concentrates on linear hashing, the methodology applies to other dynamic hashing schemes as well.
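
To make the temporal membership query concrete, the following is a minimal in-memory sketch in Python. It is not the partially persistent hashing scheme of the paper and says nothing about external-memory layout or the stated query bound; it only illustrates the semantics of "was k in S at time t". The class name `TemporalSet`, its methods, the half-open lifetime convention, and the assumption that changes arrive in nondecreasing time order are all hypothetical choices made for this illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class TemporalSet:
    """Toy model of a timestamped set (hypothetical; not the paper's structure)."""

    def __init__(self):
        # key -> list of [start, end] lifetimes in increasing time order.
        # end is None while the object is still in the set.
        self._lifetimes = defaultdict(list)

    def insert(self, k, t):
        """Record that k joined the set at time t."""
        spans = self._lifetimes[k]
        if spans and spans[-1][1] is None:
            raise ValueError(f"{k!r} is already in the set")
        spans.append([t, None])

    def delete(self, k, t):
        """Record that k left the set at time t."""
        spans = self._lifetimes.get(k)
        if not spans or spans[-1][1] is not None:
            raise ValueError(f"{k!r} is not in the set")
        spans[-1][1] = t

    def contains(self, k, t):
        """Temporal membership query: was k in the set at time t?"""
        spans = self._lifetimes.get(k, [])
        # Locate the last lifetime that started at or before t.
        i = bisect_right([start for start, _ in spans], t) - 1
        if i < 0:
            return False
        _, end = spans[i]
        # Lifetimes are half-open: [start, end).
        return end is None or t < end


s = TemporalSet()
s.insert("k", 10)
s.delete("k", 20)
s.insert("k", 30)
print(s.contains("k", 15), s.contains("k", 25), s.contains("k", 35))
```

In this sketch, a query at time 15 falls inside the first lifetime, a query at time 25 falls in the gap between the deletion and the reinsertion, and a query at time 35 falls inside the still-open second lifetime. Storing one lifetime per insert/delete pair keeps the space proportional to the number of changes, which mirrors the linear-space goal stated in the abstract, though only in this simplified main-memory setting.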


Index Terms:
Hashing, temporal databases, transaction time, access methods, data structures.
Citation:
George Kollios, Vassilis J. Tsotras, "Hashing Methods for Temporal Data," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 4, pp. 902-919, July-Aug. 2002, doi:10.1109/TKDE.2002.1019221