loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
2009 Ninth IEEE International Conference on Data Mining
Convex Non-negative Matrix Factorization in the Wild
Miami, Florida
December 06-December 09
ISBN: 978-0-7695-3895-2
Non-negative matrix factorization (NMF) has recently received a lot of attention in data mining, information retrieval, and computer vision. It factorizes a non-negative input matrix V into two non-negative matrix factors V = WH such that W describes "clusters" of the datasets. Analyzing genotypes, social networks, or images, it can be beneficial to ensure V to contain meaningful ``cluster centroids'', i.e., to restrict W to be convex combinations of data points. But how can we run this convex NMF in the wild, i.e., given millions of data points? Triggered by the simple observation that each data point is a convex combination of vertices of the data convex hull, we propose to restrict W further to be vertices of the convex hull. The benefits of this convex-hull NMF approach are twofold. First, the expected size of the convex hull of the candidate set typically grows much slower than the data set. Second, distance preserving low-dimensional embeddings allow one to compute candidate vertices efficiently. Our extensive experimental evaluation shows that convex-hull NMF compares favorably to convex NMF for large data sets both in terms of speed and reconstruction quality. Moreover, we show that our method can easily be applied to large-scale, real-world data sets, in our case consisting of 1.6 million images respectively 160 million votes on World of Warcraft guilds.
Index Terms:
data mining, matrix decomposition, data handling, non negative matrix factorization, archetypal analysis, social network analysis
Citation:
Christian Thurau, Kristian Kersting, Christian Bauckhage, "Convex Non-negative Matrix Factorization in the Wild," icdm, pp.523-532, 2009 Ninth IEEE International Conference on Data Mining, 2009
Usage of this product signifies your acceptance of the Terms of Use.