IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012 vol.9 Issue No.03 - May-June

pp: 934-939

Italo Zoppis, Università degli Studi di Milano-Bicocca, Milan

Erica Gianazza, Università degli Studi di Milano-Bicocca, Monza

Massimiliano Borsani, Università degli Studi di Milano-Bicocca, Monza

Clizia Chinello, Università degli Studi di Milano-Bicocca, Monza

Veronica Mainini, Università degli Studi di Milano-Bicocca, Monza

Carmen Galbusera, Università degli Studi di Milano-Bicocca, Monza

Carlo Ferrarese, Ospedale San Gerardo, Monza

Gloria Galimberti, Ospedale San Gerardo, Monza

Sandro Sorbi, Università degli Studi di Firenze, Florence

Barbara Borroni, Università degli Studi di Brescia, Brescia

Fulvio Magni, Università degli Studi di Milano-Bicocca, Monza

Marco Antoniotti, Università degli Studi di Milano-Bicocca, Milan

Giancarlo Mauri, Università degli Studi di Milano-Bicocca, Milan

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.80

ABSTRACT

"Signal" alignments play critical roles in many clinical settings. This is the case for mass spectrometry (MS) data, an important component of many types of proteomic analysis. A central problem occurs when one needs to integrate MS data produced by different sources, e.g., different equipment and/or laboratories. In these cases, some form of "data integration" or "data fusion" may be necessary in order to discard some source-specific aspects and improve the ability to perform a classification task such as inferring the "disease classes" of patients. The need for new high-performance data alignment methods is therefore particularly important in these contexts. In this paper, we propose an approach based both on an information theory perspective, generally used in feature construction problems, and on the application of a mathematical programming task (i.e., the weighted bipartite matching problem). We present the results of a competitive analysis of our method against other approaches. The analysis was conducted on data from plasma/ethylenediaminetetraacetic acid of "control" and Alzheimer patients collected from three different hospitals. The results point to a significant performance advantage of our method with respect to the competing ones tested.
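
The weighted bipartite matching problem named in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the function name `align_peaks`, the use of the absolute m/z difference as edge weight, and the peak values are all assumptions made for illustration, and the exhaustive search is only practical for a handful of peaks.

```python
from itertools import permutations


def align_peaks(reference, sample):
    """Pair each reference peak with a distinct sample peak so that the
    total absolute m/z difference is minimal (min-cost bipartite matching).

    Illustrative assumptions: edge weight = |m/z difference|,
    len(reference) <= len(sample), exhaustive search over assignments.
    """
    best_cost = float("inf")
    best_pairs = []
    # Try every way of assigning distinct sample peaks to the reference peaks.
    for perm in permutations(range(len(sample)), len(reference)):
        cost = sum(abs(reference[i] - sample[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost = cost
            best_pairs = list(enumerate(perm))
    return best_pairs, best_cost


# Hypothetical peak lists (m/z values) from two different sources.
reference = [1000.2, 1500.7, 2100.1]
sample = [1501.0, 999.8, 2099.5, 3000.0]

pairs, cost = align_peaks(reference, sample)
for ref_idx, smp_idx in pairs:
    print(reference[ref_idx], "<->", sample[smp_idx])
```

For realistic spectra, a polynomial-time assignment solver (for example, the Hungarian algorithm) would replace the exhaustive search; the matching objective stays the same.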

INDEX TERMS

Optimization, information theory, medicine, medical informatics, proteomics, data integration, graph algorithms.
