Biological networks are fundamental to understanding the dynamics of human health and disease. They are built from identified protein-protein interactions. Traditionally, information about protein interactions was collected from small-scale screens, and the accuracy of each interaction was often validated with multiple experiments. With the development of high-throughput methods such as the two-hybrid assay and protein chip technology, the amount of information in interaction databases has increased tremendously. However, large-scale protein interaction assays are notoriously noisy. Therefore, it is essential to develop strategies to validate high-throughput data sets.
Different types of evidence have been used to validate high-throughput protein-protein interaction data. Among them, the functional association of protein pairs is used to verify the biological relevance of putative interacting proteins. It is a reliable measure of the validity of protein interactions [1]. Traditionally, functional association has been assessed by the shared annotation of proteins in a controlled vocabulary system [2]. However, those methods are restricted to protein pairs having the same annotation. Human proteins have a lower level of accurate annotation than proteins in other organisms such as yeast. The percentage of human proteins sharing the same annotation is low, and the shared annotation may be too general to verify the functional association of two proteins. Therefore, the current methods may not be applicable to human interactome analysis.
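To make the shared-annotation criterion and its limitations concrete, the following is a minimal sketch rather than the implementation used in [2]. It assumes each protein is annotated with a set of controlled-vocabulary terms; the protein names, term labels, and function names (`shared_terms`, `is_functionally_associated`) are hypothetical and chosen only for illustration.

```python
# Toy annotation table: protein -> set of controlled-vocabulary terms.
# All protein names and term labels are hypothetical placeholders.
ANNOTATIONS = {
    "protA": {"term_general", "term_specific_1"},
    "protB": {"term_general", "term_specific_1"},
    "protC": {"term_general"},
    "protD": set(),  # unannotated protein
}


def shared_terms(p, q, annotations=ANNOTATIONS):
    """Return the set of terms annotated to both proteins."""
    return annotations.get(p, set()) & annotations.get(q, set())


def is_functionally_associated(p, q, annotations=ANNOTATIONS):
    """Shared-annotation test: True if the pair has at least one common term."""
    return bool(shared_terms(p, q, annotations))


if __name__ == "__main__":
    for pair in [("protA", "protB"), ("protA", "protC"), ("protA", "protD")]:
        print(pair, sorted(shared_terms(*pair)), is_functionally_associated(*pair))
```

In this toy table, the pair protA/protC passes the test only through a single general term, which says little about a specific functional link, and the unannotated protD cannot be assessed at all. These are the two limitations noted above for sparsely annotated proteomes.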
Gene Ontology (GO) is a controlled vocabulary of over 17,000 terms used to describe the biological process, molecular function and cellular component of genes and gene products in a generic cell. GO terms and their relationships are represented in the form of directed acyclic graphs (DAGs). Given a pair of terms, the traditional method for measuring similarity is to calculate the path distance between the two nodes associated with these terms. Edges are weighted according to their depth in the GO graph. This approach assumes that nodes and links in ontology