Scientific and Statistical Database Management, International Conference on (2006)
July 3, 2006 to July 5, 2006
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SSDBM.2006.29
Lukasz Golab, Univ. of Waterloo, Canada
Piyush Prahladka, Google, Inc., Bangalore, India
M. Tamer Ozsu, Univ. of Waterloo, Canada
Many applications store data items for a pre-determined, finite length of time. Examples include sliding windows over on-line data streams, where old data are dropped as the window slides forward. Previous research on management of data with finite lifetimes has emphasized on-line query processing in main memory. In this paper, we address the problem of indexing time-evolving data on disk for off-line analysis. In order to reduce the I/O costs of index updates, existing work partitions the data chronologically. This way, only the oldest partition is examined for expirations, only the youngest partition incurs insertions, and the remaining partitions "in the middle" are not accessed. However, this solution assumes that the order in which the data are inserted is the same as the order in which they expire, which holds only if every data item has the same lifetime. We motivate the need to break this assumption, demonstrate that the existing solutions no longer apply, and propose new index partitioning strategies that yield low update costs and fast access times.
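The chronological partitioning described in the abstract, and the reason it breaks down under variable lifetimes, can be illustrated with a minimal in-memory sketch. This is not the authors' index structure or any of their proposed strategies. The class `ChronoPartitionedIndex`, its `partition_span` parameter, and the method names are hypothetical, and the example assumes that items arrive in non-decreasing time order.

```python
from collections import deque


class ChronoPartitionedIndex:
    """Hypothetical sketch: items are grouped into partitions by insertion time."""

    def __init__(self, partition_span):
        self.partition_span = partition_span  # time units covered by one partition
        self.partitions = deque()             # (start_time, [(key, expiry), ...]), oldest first

    def insert(self, key, now, lifetime):
        # Assumes `now` never decreases, so only the youngest partition is written.
        start = (now // self.partition_span) * self.partition_span
        if not self.partitions or self.partitions[-1][0] != start:
            self.partitions.append((start, []))
        self.partitions[-1][1].append((key, now + lifetime))

    def partitions_with_expired(self, now):
        # Start times of every partition that holds at least one expired item.
        return [start for start, items in self.partitions
                if any(expiry <= now for _, expiry in items)]


# Fixed lifetime: insertion order equals expiration order.
fixed = ChronoPartitionedIndex(partition_span=10)
for t in range(30):
    fixed.insert(key=t, now=t, lifetime=30)
print(fixed.partitions_with_expired(now=35))     # [0]

# Variable lifetimes: short-lived items expire inside every partition.
variable = ChronoPartitionedIndex(partition_span=10)
for t in range(30):
    variable.insert(key=t, now=t, lifetime=30 if t % 2 == 0 else 5)
print(variable.partitions_with_expired(now=35))  # [0, 10, 20]
```

With a single lifetime, only the oldest partition contains expired items, so the middle partitions can be left untouched. Once lifetimes vary, expired items are scattered across all partitions, and an update has to visit each of them.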
Lukasz Golab, Piyush Prahladka, M. Tamer Ozsu, "Indexing Time-Evolving Data With Variable Lifetimes", Scientific and Statistical Database Management, International Conference on, pp. 265-274, 2006, doi:10.1109/SSDBM.2006.29