Issue No.11 - Nov. (2012 vol.24)

pp: 1977-1992

Pradeep Mohan, University of Minnesota, Twin-Cities, Minneapolis

Shashi Shekhar, University of Minnesota, Twin-Cities, Minneapolis

James A. Shine, US Army Corps of Engineers, Alexandria

James P. Rogers, US Army Corps of Engineers, Alexandria

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.146

ABSTRACT

Given a collection of Boolean spatiotemporal (ST) event-types, the cascading spatiotemporal pattern (CSTP) discovery process finds partially ordered subsets of these event-types whose instances are located together and occur serially. For example, analysis of crime data sets may reveal frequent occurrence of misdemeanors and drunk driving after and near bar closings on weekends, as well as after and near large gatherings such as football games. Discovering CSTPs from ST data sets is important for application domains such as public safety (e.g., identifying crime attractors and generators) and natural disaster planning (e.g., preparing for hurricanes). However, CSTP discovery presents multiple challenges; three important ones are 1) the exponential cardinality of candidate patterns with respect to the number of event types, 2) the computationally complex ST neighborhood enumeration required to evaluate the interest measure, and 3) the difficulty of balancing computational complexity and statistical interpretation. Current approaches to ST data mining focus on mining totally ordered sequences or unordered subsets. In contrast, our recent work explores partially ordered patterns. In that work, we represented CSTPs as directed acyclic graphs (DAGs); proposed a new interest measure, the cascade participation index (CPI); outlined the general structure of a cascading spatiotemporal pattern miner (CSTPM); evaluated filtering strategies to enhance computational savings using a real-world crime data set; and proposed a nested loop-based CSTPM to address the challenge posed by the exponential cardinality of candidate patterns. This paper adds to our recent work by offering a new computational insight, namely, that the computational bottleneck for CSTP discovery lies in the interest measure evaluation. With this insight, we propose a new CSTPM based on spatiotemporal partitioning that significantly lowers the cost of interest measure evaluation.
Analytical evaluation shows that our new CSTPM is correct and complete. Results from extensive new experimental evaluation with both synthetic and real data show that our new ST partitioning-based CSTPM outperforms the CSTPM from our previous work. We also present a case study that verifies the applicability of the CSTP discovery process.
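
The contrast between a nested loop and ST partitioning can be illustrated with a generic grid-based neighbor join. The sketch below is not the CSTPM described in the paper and does not compute the CPI; it only enumerates "after and near" event pairs. The function names, the `(x, y, t)` event tuples, and the parameters `r` (spatial radius) and `t_win` (time window) are hypothetical choices made for this illustration.

```python
from collections import defaultdict
from itertools import product


def is_after_and_near(a, b, r, t_win):
    """True if event b occurs strictly after a, within t_win, and within distance r."""
    (xa, ya, ta), (xb, yb, tb) = a, b
    return 0 < tb - ta <= t_win and (xa - xb) ** 2 + (ya - yb) ** 2 <= r * r


def nested_loop_pairs(events, r, t_win):
    """Baseline: test every ordered pair of events (quadratic in the number of events)."""
    return {
        (i, j)
        for i, a in enumerate(events)
        for j, b in enumerate(events)
        if is_after_and_near(a, b, r, t_win)
    }


def partitioned_pairs(events, r, t_win):
    """Bucket events into r x r x t_win cells and compare only adjacent cells."""
    cells = defaultdict(list)
    for idx, (x, y, t) in enumerate(events):
        cells[(int(x // r), int(y // r), int(t // t_win))].append(idx)

    pairs = set()
    for (cx, cy, ct), members in cells.items():
        # A qualifying successor lies at most one cell away in x and y,
        # and in the same or the next time slab.
        for dx, dy, dt in product((-1, 0, 1), (-1, 0, 1), (0, 1)):
            for j in cells.get((cx + dx, cy + dy, ct + dt), ()):
                for i in members:
                    if is_after_and_near(events[i], events[j], r, t_win):
                        pairs.add((i, j))
    return pairs
```

Because the cell size equals the neighborhood size in each dimension, every qualifying pair falls in adjacent cells, so the partitioned version returns the same pairs as the baseline while skipping comparisons between distant events.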

INDEX TERMS

Correlation, Data mining, Time measurement, Hurricanes, Indexes, Data models, Meteorology, spatiotemporal partial order, Cascading spatiotemporal patterns, space-time K-function, cascade participation index, spatiotemporal join, spatio-temporal continuity, positive ST autocorrelation

CITATION

Pradeep Mohan, Shashi Shekhar, James A. Shine, James P. Rogers, "Cascading Spatio-Temporal Pattern Discovery",

*IEEE Transactions on Knowledge & Data Engineering*, vol.24, no. 11, pp. 1977-1992, Nov. 2012, doi:10.1109/TKDE.2011.146