Issue No.04 - April (2008 vol.20)
pp: 433-448
ABSTRACT
Given a large spatio-temporal database of events, where each event consists of the fields event-ID, time, location, and event-type, mining spatio-temporal sequential patterns means identifying significant sequences of event types. Such patterns are crucial for investigating the spatial and temporal evolution of phenomena in many application domains. Recent literature has explored sequential patterns in transaction data and trajectory analysis of moving objects. However, these methods cannot be directly applied to mining sequential patterns from a large number of spatio-temporal events. Two major research challenges remain: (i) defining significance measures for spatio-temporal sequential patterns that avoid spurious patterns; and (ii) designing algorithms under significance measures that may not guarantee the downward closure property. In this paper, we propose a sequence index as the significance measure for spatio-temporal sequential patterns; it is meaningful because it can be interpreted using spatial statistics. We propose a novel algorithm, Slicing-STS-Miner, to tackle the algorithmic design challenge posed by the sequence index, which does not preserve the downward closure property. We compare the proposed algorithm with a simple algorithm, STS-Miner, that utilizes the weak monotone property of the sequence index. Performance evaluations using both synthetic and real-world datasets show that Slicing-STS-Miner is an order of magnitude faster than STS-Miner for large datasets.
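As a rough illustration of the input described in the abstract, the sketch below models an event with the four stated fields and counts, for each event of one type, how many events of a second type follow it nearby in space and time. The names `Event`, `count_followers`, `radius`, and `window`, and the brute-force neighborhood test itself, are assumptions made for illustration; this is not the sequence index, STS-Miner, or Slicing-STS-Miner from the paper.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Event:
    """One spatio-temporal event: event-ID, time, location (x, y), event-type."""
    event_id: int
    time: float
    x: float
    y: float
    event_type: str


def count_followers(events, first_type, next_type, radius, window):
    """For each event of first_type, count the events of next_type that
    occur strictly later, within `window` time units and `radius` distance.

    Hypothetical helper: a brute-force O(n^2) scan, used only to show what
    "event type A is followed by event type B" could mean on such data.
    """
    counts = {}
    for a in events:
        if a.event_type != first_type:
            continue
        counts[a.event_id] = sum(
            1
            for b in events
            if b.event_type == next_type
            and 0 < b.time - a.time <= window
            and hypot(b.x - a.x, b.y - a.y) <= radius
        )
    return counts


# Made-up toy data, not taken from the paper's datasets.
events = [
    Event(1, 0.0, 0.0, 0.0, "A"),
    Event(2, 1.0, 0.5, 0.0, "B"),
    Event(3, 2.0, 5.0, 5.0, "B"),
    Event(4, 3.0, 5.0, 5.5, "A"),
    Event(5, 3.5, 5.0, 6.0, "B"),
]
result = count_followers(events, "A", "B", radius=1.0, window=2.0)
print(result)
```

Raw counts like these say nothing about significance on their own, since a very common event type will follow almost anything; that gap is what the abstract's first challenge, a significance measure that avoids spurious patterns, refers to.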
INDEX TERMS
Spatial databases, Spatial databases and GIS, Data mining
CITATION
Yan Huang, Liqin Zhang, Pusheng Zhang, "A Framework for Mining Sequential Patterns from Spatio-Temporal Event Data Sets", IEEE Transactions on Knowledge & Data Engineering, vol.20, no. 4, pp. 433-448, April 2008, doi:10.1109/TKDE.2007.190712