Issue No. 12 - December (2003 vol. 25)
Joachim M. Buhmann, IEEE
<p><b>Abstract</b>: In several major applications of data analysis, objects are not represented as feature vectors in a vector space, but rather by a matrix of pairwise proximities. Such pairwise data often violate metricity and, therefore, cannot be naturally embedded in a vector space. Addressing the problem of unsupervised structure detection, or <it>clustering</it>, this paper introduces a new method for embedding pairwise data into Euclidean vector spaces. We show that all clustering methods that are invariant under additive shifts of the pairwise proximities can be reformulated as grouping problems in Euclidean spaces. The most prominent property of this <it>constant shift embedding</it> framework is the complete <it>preservation of the cluster structure</it> in the embedding space. Restating pairwise clustering problems in vector spaces has several important consequences, such as the statistical description of the clusters by way of <it>cluster prototypes</it>, the generic extension of the grouping procedure to a discriminative <it>prediction rule</it>, and the applicability of standard <it>preprocessing methods</it> like denoising or dimensionality reduction.</p>
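The abstract does not spell out the construction, so the following is only a minimal sketch of how a constant additive shift can turn nonmetric dissimilarities into squared Euclidean distances. It assumes a symmetric dissimilarity matrix with zero diagonal, and it assumes the shift is taken from the smallest eigenvalue of the centered similarity matrix. The function name `constant_shift_embedding` and its return values are hypothetical, chosen for illustration rather than taken from the paper.

```python
import numpy as np

def constant_shift_embedding(D, dim=None):
    """Sketch: embed a symmetric, zero-diagonal dissimilarity matrix D.

    Returns (X, D_shifted, shift), where the rows of X are vectors whose
    squared Euclidean distances reproduce D_shifted when dim is None.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    Q = np.eye(n) - np.ones((n, n)) / n       # centering projection
    S_c = -0.5 * Q @ D @ Q                    # centered similarity matrix

    # Assumed shift: magnitude of the smallest eigenvalue if it is negative.
    smallest = np.linalg.eigvalsh(S_c)[0]     # eigvalsh sorts ascending
    shift = max(0.0, -smallest)

    # Add the same constant to every off-diagonal dissimilarity.
    D_shifted = D + 2.0 * shift * (1.0 - np.eye(n))

    # The shifted, centered similarity matrix is positive semidefinite,
    # so it factorizes as X X^T.
    S_shifted = -0.5 * Q @ D_shifted @ Q
    w, V = np.linalg.eigh(S_shifted)
    order = np.argsort(w)[::-1]               # largest eigenvalues first
    w = np.clip(w[order], 0.0, None)          # remove round-off negatives
    V = V[:, order]
    if dim is not None:                       # optional truncation
        w, V = w[:dim], V[:, :dim]
    X = V * np.sqrt(w)                        # scale each eigenvector column
    return X, D_shifted, shift
```

Because every off-diagonal entry receives the same constant, a clustering cost function that is invariant under additive shifts would rank partitions of `D` and `D_shifted` identically. The vectors in `X` could then be passed to a vector-space method such as k-means, with the optional `dim` truncation standing in for the dimensionality reduction mentioned in the abstract.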
Clustering, pairwise proximity data, cost function, embedding, MDS.
M. Kawanabe, J. M. Buhmann, J. Laub and V. Roth, "Optimal Cluster Preserving Embedding of Nonmetric Proximity Data," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 25, no. 12, pp. 1540-1551, 2003.