DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.28
Chanchala D. Kaddi, Georgia Institute of Technology, Atlanta
R. Mitchell Parry, Appalachian State University, Boone
May D. Wang, Georgia Institute of Technology, Atlanta
We propose a similarity measure based on the multivariate hypergeometric distribution for the pairwise comparison of images and data vectors. The formulation and performance of the proposed measure are compared with other similarity measures using synthetic data. A method of piecewise approximation is also implemented to facilitate application of the proposed measure to large samples. Example applications of the proposed similarity measure are presented using mass spectrometry imaging (MSI) data and gene expression microarray data. Results from synthetic and biological data indicate that the proposed measure is capable of providing meaningful discrimination between samples, and that it can be a useful tool for identifying potentially related samples in large-scale biological datasets.
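For readers unfamiliar with the distribution named in the abstract, the following is a minimal sketch of the standard multivariate hypergeometric probability mass function. It is not the proposed similarity measure, whose formulation is not given in this abstract, and it does not implement the piecewise approximation. The function name `multivariate_hypergeom_pmf` and its parameters are hypothetical, chosen for illustration only.

```python
from fractions import Fraction
from math import comb


def multivariate_hypergeom_pmf(population, draws):
    """Standard multivariate hypergeometric PMF (illustrative helper only).

    population[i] is the number of items of category i in the urn;
    draws[i] is the number of items of category i in a sample taken
    without replacement. Returns the exact probability as a Fraction.
    """
    if len(population) != len(draws):
        raise ValueError("population and draws must have the same length")
    # A draw is impossible if it is negative or exceeds what is available.
    if any(k < 0 or k > K for K, k in zip(population, draws)):
        return Fraction(0)
    # Number of ways to pick the requested count from each category.
    ways = 1
    for K, k in zip(population, draws):
        ways *= comb(K, k)
    # Divide by the number of ways to pick the whole sample from the urn.
    return Fraction(ways, comb(sum(population), sum(draws)))


# Urn with 5, 3, and 2 items in three categories; sample of 2, 1, and 1.
p = multivariate_hypergeom_pmf([5, 3, 2], [2, 1, 1])
print(p)  # 2/7
```

Exact rational arithmetic keeps the sketch easy to check by hand, but it is only practical for small counts.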
Multivariate statistics, Mathematics of Computing, Probability and Statistics, Contingency table analysis, Computer Applications, Life and Medical Sciences, Biology and genetics, Physical Sciences and Engineering, Engineering, Chemistry
C. D. Kaddi, R. M. Parry and M. D. Wang, "Multivariate Hypergeometric Similarity Measure," in IEEE/ACM Transactions on Computational Biology and Bioinformatics.