Issue No.05 - May (2012 vol.24)
pp: 799-812
Taowei David Wang, Partners HealthCare, Charlestown
Amol Deshpande, University of Maryland, College Park
Ben Shneiderman , University of Maryland, College Park
ABSTRACT
We present Temporal Pattern Search (TPS), a novel algorithm for searching for temporal patterns of events in personal histories. The traditional method of searching for such patterns uses an automaton-based approach over a single array of events, sorted by time stamp. Instead, TPS operates on a set of arrays, where each array contains all events of the same type, sorted by time stamp. TPS searches for a particular item in the pattern using a binary search over the appropriate arrays. Although binary search is considerably more expensive per item, it allows TPS to skip many unnecessary events in personal histories. We show that TPS's running time is bounded by O(m^2 n lg(n)), where m is the length of a search pattern (its number of events), and n is the number of events in a record (history). Although the asymptotic running time of TPS is inferior to that of a nondeterministic finite automaton (NFA) approach (O(mn)), TPS performs better than NFA under our experimental conditions. We also show that TPS is very competitive with Shift-And, a bit-parallel approach, on real data. Since the experimental conditions we describe here subsume the conditions under which analysts would typically use TPS (i.e., within an interactive visualization program), we argue that TPS is an appropriate design choice for us.
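The core idea in the abstract, one time-sorted array per event type with a binary search for each pattern item, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names index_by_type and find_sequence, the (timestamp, event_type) tuple layout, and the restriction to a plain ordered sequence of event types in which each item occurs strictly after the previous one are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def index_by_type(events):
    """Group (timestamp, event_type) pairs into one sorted timestamp
    array per event type (hypothetical helper, not from the paper)."""
    arrays = defaultdict(list)
    for timestamp, event_type in sorted(events):
        arrays[event_type].append(timestamp)
    return arrays


def find_sequence(arrays, pattern):
    """Return the timestamps of the earliest match of `pattern`, or None.

    `pattern` is a list of event types; each item must occur strictly
    after the previously matched one. Every item costs one binary
    search in the array for its type, so events of other types are
    never visited.
    """
    matched = []
    current = float("-inf")
    for event_type in pattern:
        times = arrays.get(event_type, [])
        # Index of the first timestamp strictly greater than `current`.
        i = bisect_right(times, current)
        if i == len(times):
            return None  # no later event of this type exists
        current = times[i]
        matched.append(current)
    return matched


history = [(1, "A"), (2, "B"), (3, "A"), (4, "C"), (5, "B")]
arrays = index_by_type(history)
print(find_sequence(arrays, ["A", "B", "C"]))
print(find_sequence(arrays, ["C", "A"]))
```

In this simplified setting each pattern item triggers a single O(lg n) lookup, which shows how events that cannot contribute to a match are skipped rather than scanned one by one. The bound stated in the abstract covers the full algorithm, which this sketch does not attempt to reproduce.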
INDEX TERMS
Pattern matching, temporal event data, information visualization, graphical user interfaces.
CITATION
Taowei David Wang, Amol Deshpande, Ben Shneiderman, "A Temporal Pattern Search Algorithm for Personal History Event Visualization", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 5, pp. 799-812, May 2012, doi:10.1109/TKDE.2010.257