15th International Conference on Pattern Recognition (ICPR'00) - Volume 2
A Probabilistic Hierarchical Clustering Method for Organizing Collections of Text Documents
Barcelona, Spain
September 3-8, 2000
ISBN: 0-7695-0750-6
This paper proposes a generic probabilistic framework for the unsupervised hierarchical clustering of large-scale, sparse, high-dimensional data collections. The framework is based on a hierarchical probabilistic mixture methodology. Two classes of models emerge from the analysis, termed symmetric and asymmetric models. For text data specifically, symmetric and asymmetric models based on the multinomial and binomial distributions are the most appropriate. An EM method of parameter estimation is provided for all of these models, and the models are compared experimentally on two extensive online document collections.
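To give a feel for the kind of EM parameter estimation the abstract refers to, here is a minimal sketch for a flat (non-hierarchical) mixture of multinomials over a document-term count matrix. It is not the paper's hierarchical symmetric or asymmetric model, only the basic building block such models extend. The function name `multinomial_mixture_em`, its parameters, the additive `smoothing` constant, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def multinomial_mixture_em(counts, n_clusters, n_iter=50, seed=0, smoothing=1e-2):
    """EM for a flat mixture of multinomials (illustrative, hypothetical helper).

    counts: (n_docs, n_terms) array of term counts per document.
    Returns (responsibilities, mixing_weights, term_distributions).
    """
    counts = np.asarray(counts, dtype=float)
    n_docs, n_terms = counts.shape
    rng = np.random.default_rng(seed)

    # Uniform mixing weights and random term distributions as a starting point.
    pi = np.full(n_clusters, 1.0 / n_clusters)
    theta = rng.dirichlet(np.ones(n_terms), size=n_clusters)
    resp = np.full((n_docs, n_clusters), 1.0 / n_clusters)

    for _ in range(n_iter):
        # E-step: posterior cluster probabilities per document, computed in
        # log space. The multinomial coefficient is the same for every
        # cluster, so it cancels in the normalisation and is omitted.
        log_joint = counts @ np.log(theta).T + np.log(pi + 1e-12)
        log_joint -= log_joint.max(axis=1, keepdims=True)
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate mixing weights and term distributions.
        # The additive smoothing (an assumption) keeps every probability
        # strictly positive, which matters for sparse text counts.
        pi = resp.mean(axis=0)
        term_mass = resp.T @ counts + smoothing
        theta = term_mass / term_mass.sum(axis=1, keepdims=True)

    return resp, pi, theta


# Toy corpus (made up): documents 0-1 use terms 0-1, documents 2-3 use terms 2-3.
toy_counts = np.array([
    [5, 4, 0, 0],
    [6, 3, 0, 0],
    [0, 0, 4, 5],
    [0, 0, 3, 6],
])
resp, pi, theta = multinomial_mixture_em(toy_counts, n_clusters=2)
hard_labels = resp.argmax(axis=1)  # most probable cluster per document
```

A hierarchical variant would replace the single set of mixing weights with a tree of mixtures; how that tree is parameterised is what separates the symmetric and asymmetric models in the paper, and it is not reproduced here.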
Citation:
Alexei Vinokourov, Mark Girolami, "A Probabilistic Hierarchical Clustering Method for Organizing Collections of Text Documents," icpr, vol. 2, pp.2182, 15th International Conference on Pattern Recognition (ICPR'00) - Volume 2, 2000