Efficient Algorithms for Large-Scale Temporal Aggregation
May/June 2003 (vol. 15 no. 3)
pp. 744-759

Abstract: The ability to model the time-varying nature of data is essential to many database applications, such as data warehousing and mining. However, the temporal aspects introduce unique characteristics and challenges for query processing and optimization. Among these challenges is computing temporal aggregates, which is complicated by the need to compute temporal grouping. In this paper, we introduce a variety of temporal aggregation algorithms that overcome major drawbacks of previous work. First, for small-scale aggregations, both the worst-case and average-case processing times have been improved significantly. Second, for large-scale aggregations, the proposed algorithms can handle a database that is substantially larger than the available memory. Third, the parallel algorithm, designed for a shared-nothing architecture, achieves scalable performance by delivering nearly linear scale-up and speed-up, even in the presence of data skew. The contributions made in this paper are particularly important because the growth in database size and response-time requirements has outpaced advances in processor and mass storage technology.
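To make "temporal grouping" concrete, the following is a minimal sketch of a temporal COUNT aggregate. It is not one of the paper's algorithms; it is a plain endpoint-sort sweep, shown only to illustrate what such a query has to produce. The function name `temporal_count` and the half-open `[start, end)` interval convention are assumptions made for this example.

```python
from collections import defaultdict


def temporal_count(tuples):
    """Return maximal intervals (start, end, count) over which the number of
    valid tuples is constant. Each input tuple is a half-open valid-time
    interval [start, end). Illustrative sketch only (hypothetical helper)."""
    delta = defaultdict(int)
    for start, end in tuples:
        if start >= end:
            raise ValueError("interval must satisfy start < end")
        delta[start] += 1   # a tuple becomes valid at its start time
        delta[end] -= 1     # and stops being valid at its end time

    result = []
    active = 0    # number of tuples valid on the interval [prev, t)
    prev = None
    for t in sorted(delta):
        if prev is not None and active > 0:
            last = result[-1] if result else None
            if last and last[1] == prev and last[2] == active:
                # Same count on adjacent intervals: extend the previous group.
                result[-1] = (last[0], t, active)
            else:
                result.append((prev, t, active))
        active += delta[t]
        prev = t
    return result


print(temporal_count([(1, 5), (3, 8), (5, 10)]))
# [(1, 3, 1), (3, 8, 2), (8, 10, 1)]
```

The grouping is what makes the problem harder than an ordinary aggregate: the result intervals are not known in advance but are induced by the start and end times of the tuples themselves. This sketch sorts all endpoints in memory, so it does not address the larger-than-memory or parallel settings the abstract describes.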

Index Terms:
Temporal databases, temporal aggregation, scalable query processing, data partitioning, balanced tree algorithm, merge-sort algorithm, temporal query processing, aggregate queries.
Bongki Moon, Ines Fernando Vega Lopez, Vijaykumar Immanuel, "Efficient Algorithms for Large-Scale Temporal Aggregation," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 3, pp. 744-759, May-June 2003, doi:10.1109/TKDE.2003.1198403