Issue No.10 - Oct. (2013 vol.35)
pp: 2340-2356
Tingting Mu, Dept. of Electr. Eng. & Electron., Univ. of Liverpool, Liverpool, UK
J. Y. Goulermas, Dept. of Electr. Eng. & Electron., Univ. of Liverpool, Liverpool, UK
ABSTRACT
In this paper, we study the co-embedding problem of how to map different types of patterns into one common low-dimensional space, given only the associations (relation values) between samples. We conduct a generic analysis to discover the commonalities between existing co-embedding algorithms and indirectly related approaches, and we investigate possible factors controlling the shapes and distributions of the co-embeddings. The primary contribution of this work is a novel method for computing co-embeddings, termed the automatic co-embedding with adaptive shaping (ACAS) algorithm, based on an efficient transformation of the co-embedding problem. Its advantages include flexible model adaptation to the given data, an economical set of model variables leading to a parametric co-embedding formulation, and a robust model fitting criterion for model optimization based on a quantization procedure. The secondary contribution of this work is the introduction of a set of generic schemes for the qualitative analysis and quantitative assessment of the output of co-embedding algorithms, using existing labeled benchmark datasets. Experiments with synthetic and real-world datasets show that the proposed algorithm is very competitive compared to existing ones.
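To make the problem setting concrete, the sketch below embeds the rows and columns of a nonnegative relation matrix into one shared space. It is not the ACAS algorithm, whose details are not given in this abstract. It is a standard bipartite spectral baseline, assumed here purely for illustration, that normalizes the relation matrix by row and column sums and keeps the nontrivial singular vectors. The function name `spectral_co_embedding`, the parameter `k`, and the toy matrix `R_demo` are hypothetical.

```python
import numpy as np


def spectral_co_embedding(R, k=2):
    """Illustrative baseline (not ACAS): co-embed row and column objects
    of a nonnegative relation matrix R (m x n) into a shared k-dim space."""
    R = np.asarray(R, dtype=float)
    row_deg = R.sum(axis=1)  # total association of each row object
    col_deg = R.sum(axis=0)  # total association of each column object
    if np.any(row_deg <= 0) or np.any(col_deg <= 0):
        raise ValueError("every row and column needs a positive total association")

    # Degree-normalized relation matrix: D1^{-1/2} R D2^{-1/2}
    d1 = 1.0 / np.sqrt(row_deg)
    d2 = 1.0 / np.sqrt(col_deg)
    Rn = d1[:, None] * R * d2[None, :]

    # Singular vectors come back sorted by decreasing singular value.
    U, s, Vt = np.linalg.svd(Rn, full_matrices=False)

    # The leading pair is trivial (it only encodes the degrees), so skip it.
    X = d1[:, None] * U[:, 1:k + 1]     # row-object coordinates
    Y = d2[:, None] * Vt[1:k + 1, :].T  # column-object coordinates
    return X, Y


# Toy relation matrix (assumed data): rows 0-1 associate mostly with
# columns 0-1, and rows 2-3 mostly with columns 2-3.
R_demo = np.array([
    [5.0, 4.0, 0.1, 0.1],
    [4.0, 5.0, 0.1, 0.1],
    [0.1, 0.1, 5.0, 4.0],
    [0.1, 0.1, 4.0, 5.0],
])
X, Y = spectral_co_embedding(R_demo, k=2)
```

In this toy case the first coordinate places strongly associated rows and columns on the same side of the origin, which is the qualitative behavior a co-embedding is expected to show. The overall sign of each coordinate is arbitrary, as with any singular vector.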
INDEX TERMS
Vectors, Computational modeling, Algorithm design and analysis, Eigenvalues and eigenfunctions, Large scale integration, Adaptation models, Data models, Structural matching, Relational data, Data co-embedding, Heterogeneous embedding, Data visualization
CITATION
Tingting Mu, J. Y. Goulermas, "Automatic Generation of Co-Embeddings from Relational Data with Adaptive Shaping", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 10, pp. 2340-2356, Oct. 2013, doi:10.1109/TPAMI.2013.66