Issue No. 12 - Dec. (2012 vol. 34)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2012.80
C. Carpineto, Fondazione Ugo Bordoni, Rome, Italy
G. Romano, Fondazione Ugo Bordoni, Rome, Italy
We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as an optimization problem of the PRI value between a target partition and a set of given partitions, experimenting with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
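As a rough illustration of the two ideas in the abstract, the following Python sketch computes the classical (unweighted) Rand Index between two partitions and then searches for a consensus partition by simple stochastic hill climbing. It rests on several assumptions: the abstract does not give the chance-based pair weights that define the PRI, so the plain RI stands in for it here, and the search loop is a generic stand-in rather than the authors' algorithm. The names `rand_index`, `consensus`, `k`, `steps`, and `seed` are hypothetical.

```python
from itertools import combinations
import random


def rand_index(a, b):
    """Classical Rand Index: fraction of object pairs on which two partitions agree.

    A pair agrees when both partitions put the two objects together,
    or both put them apart. Partitions are given as lists of cluster labels.
    """
    if len(a) != len(b):
        raise ValueError("partitions must cover the same objects")
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        return 1.0
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def consensus(partitions, k, steps=2000, seed=0):
    """Hill-climbing sketch: maximize the mean RI to the input partitions.

    Assumption: this is NOT the paper's optimizer and uses RI, not PRI.
    Each step moves one random object to a random cluster in range(k)
    and keeps the move only if the mean RI does not decrease.
    """
    rng = random.Random(seed)
    n = len(partitions[0])

    def score(t):
        return sum(rand_index(t, p) for p in partitions) / len(partitions)

    target = list(partitions[0])  # start from one of the input partitions
    best = score(target)
    for _ in range(steps):
        i = rng.randrange(n)
        old = target[i]
        target[i] = rng.randrange(k)
        new = score(target)
        if new >= best:
            best = new
        else:
            target[i] = old  # revert a move that lowers the score
    return target, best
```

For example, `consensus([[0, 0, 1, 1, 2, 2], [0, 0, 1, 1, 1, 2], [0, 0, 0, 1, 2, 2]], k=3)` returns a label list of length 6 together with its mean RI against the three inputs. Swapping `rand_index` for a pair-weighted variant would be the natural place to plug in the PRI once its weights are taken from the paper.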
stochastic processes, information retrieval, optimisation, pattern clustering, probability, performance gain, consensus clustering, probabilistic Rand index, subtopic retrieval, similarity measurement, object-pair level agreement, object-pair level disagreement, occurrence probability, optimization problem, PRI value, stochastic optimization algorithm, Indexes, Clustering algorithms, Probabilistic logic, Partitioning algorithms, Search problems, Optimized production technology, Rand index, search results clustering
C. Carpineto and G. Romano, "Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 12, pp. 2315-2326, 2012.