Issue No.02 - Feb. (2013 vol.35)
pp: 272-285
Fei Wang, IBM T.J. Watson Res. Center, Hawthorne, NY, USA
Noah Lee, Dept. of Biomed. Eng., Columbia Univ., New York, NY, USA
Jianying Hu, IBM T.J. Watson Res. Center, Hawthorne, NY, USA
Jimeng Sun, IBM T.J. Watson Res. Center, Hawthorne, NY, USA
S. Ebadollahi, IBM T.J. Watson Res. Center, Hawthorne, NY, USA
A. F. Laine, Dept. of Biomed. Eng., Columbia Univ., New York, NY, USA
ABSTRACT
This paper proposes a novel temporal knowledge representation and learning framework to perform large-scale temporal signature mining of longitudinal heterogeneous event data. The framework enables the representation, extraction, and mining of high-order latent event structure and relationships within single and multiple event sequences. The proposed knowledge representation maps the heterogeneous event sequences to a geometric image by encoding events as a structured spatial-temporal shape process. We present a doubly constrained convolutional sparse coding framework that learns interpretable and shift-invariant latent temporal event signatures. We show how to cope with the sparsity in the data as well as in the latent factor model by inducing a double sparsity constraint on the β-divergence to learn an overcomplete sparse latent factor model. A novel stochastic optimization scheme performs large-scale incremental learning of group-specific temporal event signatures. We validate the framework on synthetic data and on an electronic health record dataset.
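As a rough illustration of two ideas named in the abstract, the sketch below encodes an event sequence as a small binary "image" (rows are event types, columns are time steps) and scores a candidate reconstruction with the standard β-divergence. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the binary encoding, and the toy data are all hypothetical, and the convolutional dictionary and sparsity penalties are not modeled.

```python
import math


def beta_divergence(x, y, beta):
    """Scalar beta-divergence d_beta(x | y), assuming x >= 0 and y > 0.

    beta = 2 gives half the squared error, beta = 1 the generalized
    Kullback-Leibler divergence, beta = 0 the Itakura-Saito divergence.
    """
    if beta == 1:
        # Generalized KL; the x * log(x / y) term is taken as 0 when x == 0.
        log_term = x * math.log(x / y) if x > 0 else 0.0
        return log_term - x + y
    if beta == 0:
        # Itakura-Saito; only defined for x > 0.
        return x / y - math.log(x / y) - 1.0
    numerator = x ** beta + (beta - 1) * y ** beta - beta * x * y ** (beta - 1)
    return numerator / (beta * (beta - 1))


def events_to_image(events, n_types, n_steps):
    """Hypothetical encoding: image[k][t] = 1.0 if event type k occurs at step t."""
    image = [[0.0] * n_steps for _ in range(n_types)]
    for event_type, t in events:
        image[event_type][t] = 1.0
    return image


def total_divergence(X, Y, beta):
    """Sum of element-wise beta-divergences between two equally sized images."""
    return sum(
        beta_divergence(x, y, beta)
        for row_x, row_y in zip(X, Y)
        for x, y in zip(row_x, row_y)
    )


# Toy data (made up for illustration): two event types over four time steps.
X = events_to_image([(0, 1), (1, 3)], n_types=2, n_steps=4)
Y = [[0.5] * 4 for _ in range(2)]  # a flat, uninformative reconstruction
loss = total_divergence(X, Y, beta=2)
```

Most entries of such an image are zero, which is the data sparsity the abstract refers to. With β = 2 the loss reduces to half the squared reconstruction error, so different values of β trade off how heavily large and small entries are weighted.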
INDEX TERMS
Convolution, Sparse matrices, Knowledge representation, Data mining, Complexity theory, Approximation methods, Convergence, Beta-divergence, Temporal signature mining, Sparse coding, Dictionary learning, Nonnegative matrix factorization, Stochastic gradient descent
CITATION
Fei Wang, Noah Lee, Jianying Hu, Jimeng Sun, S. Ebadollahi, A. F. Laine, "A Framework for Mining Signatures from Event Sequences and Its Applications in Healthcare Data", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 2, pp. 272-285, Feb. 2013, doi:10.1109/TPAMI.2012.111