Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval
Dec. 2012 (vol. 34 no. 12)
pp. 2315-2326
C. Carpineto, Fondazione Ugo Bordoni, Rome, Italy
G. Romano, Fondazione Ugo Bordoni, Rome, Italy
We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called the Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as the problem of optimizing the PRI value between a target partition and a set of given partitions, and experiment with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over the input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
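The abstract does not give the PRI weighting or the details of the optimization algorithm, so the following is only a minimal sketch of the general idea. It assumes the plain (unweighted) Rand Index in place of the PRI and a simple hill-climbing search in place of the authors' stochastic optimizer. The function names `rand_index` and `consensus`, the parameters `k`, `iters`, and `seed`, and the convention that cluster labels are integers in `range(k)` are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Plain Rand Index between two partitions given as equal-length label lists.

    Assumption: this is the classical, unweighted RI, not the paper's PRI.
    """
    if len(a) != len(b):
        raise ValueError("partitions must cover the same objects")
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        return 1.0
    # A pair agrees when both partitions put it together or both keep it apart.
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def consensus(partitions, k, iters=2000, seed=0):
    """Hill-climbing search for a target partition with high mean RI.

    Assumption: labels in every input partition are integers in range(k).
    This is a generic stochastic local search, not the authors' algorithm.
    """
    rng = random.Random(seed)
    n = len(partitions[0])

    def score(target):
        return sum(rand_index(target, p) for p in partitions) / len(partitions)

    # Start from the input partition that agrees best with the others.
    target = list(max(partitions, key=score))
    best = score(target)
    for _ in range(iters):
        i = rng.randrange(n)
        old = target[i]
        target[i] = rng.randrange(k)  # propose moving one object
        new = score(target)
        if new >= best:
            best = new  # keep moves that do not lower the score
        else:
            target[i] = old  # undo a worsening move
    return target, best
```

Because the search starts from the best input partition and never accepts a worsening move, the returned score is at least as high as that of any single input partition under this stand-in objective. Replacing `rand_index` with a chance-weighted pair measure would bring the sketch closer to the approach the abstract describes.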
Index Terms:
stochastic processes, information retrieval, optimisation, pattern clustering, probability, performance gain, consensus clustering, probabilistic Rand index, subtopic retrieval, similarity measurement, object-pair level agreement, object-pair level disagreement, occurrence probability, optimization problem, PRI value, stochastic optimization algorithm, Indexes, Clustering algorithms, Probabilistic logic, Partitioning algorithms, Search problems, Optimized production technology, Information retrieval, subtopic retrieval, Consensus clustering, Rand index, probabilistic Rand index, search results clustering
Citation:
C. Carpineto, G. Romano, "Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 12, pp. 2315-2326, Dec. 2012, doi:10.1109/TPAMI.2012.80