Human Disease-Gene Classification with Integrative Sequence-Based and Topological Features of Protein-Protein Interaction Networks
2007 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2007) (2007)
Nov. 2, 2007 to Nov. 4, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/BIBM.2007.47
The discovery of human genes that contribute to the appearance and growth of hereditary diseases is an important problem in bioinformatics research. Many techniques have been devised for classifying genes based on information from a variety of sources, such as sequence and functional annotation. Recently, the use of topological information in protein-protein interaction networks has shown promise in identifying disease-genes. In this paper, we develop a disease-gene classification system that integrates topological features of protein interaction networks with sequence-derived and other features, utilizing support vector machines for disease-gene classification. We identified several novel topological, sequence, and function-based features that can help to characterize hereditary disease-genes. We also found that using a more complex classifier can contribute to disease-gene classification. We validated our methods by selecting previously unclassified genes that were predicted with high probabilities as disease-genes, and searching recent literature for evidence of their involvement in disease.
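As a rough illustration of what integrating topological and sequence-derived features might look like, the sketch below builds one feature vector per protein from a toy interaction network. It is a minimal sketch under stated assumptions, not the system described above: the choice of degree and clustering coefficient as topological features, sequence length as the sequence-derived feature, and the names `topological_features`, `feature_vector`, `ppi`, and `sequences` are all hypothetical, and the data are made up. Vectors of this kind could then be passed to a support vector machine.

```python
from itertools import combinations


def topological_features(graph, node):
    """Return (degree, clustering coefficient) for a node of an undirected graph.

    `graph` maps each node to the set of its neighbours.
    """
    neighbors = graph.get(node, set())
    degree = len(neighbors)
    if degree < 2:
        # Clustering coefficient is taken as 0 when fewer than two neighbours exist.
        return degree, 0.0
    # Count the neighbour pairs that are themselves connected.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph.get(u, set()))
    possible = degree * (degree - 1) / 2
    return degree, links / possible


def feature_vector(graph, sequences, node):
    """Concatenate topological and sequence-derived features (illustrative choice)."""
    degree, clustering = topological_features(graph, node)
    return [float(degree), clustering, float(len(sequences[node]))]


# Toy, made-up interaction network and sequences (assumptions for illustration only).
ppi = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
sequences = {"A": "MKTAYIAK", "B": "MKV", "C": "MLLSA", "D": "MT"}

features = {protein: feature_vector(ppi, sequences, protein) for protein in ppi}
```

Each entry of `features` is a fixed-length numeric vector, so features from different sources can be combined simply by concatenation before training a classifier.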
X. Chen, A. Smalter and S. F. Lei, "Human Disease-Gene Classification with Integrative Sequence-Based and Topological Features of Protein-Protein Interaction Networks," 2007 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2007), Fremont, California, 2007, pp. 209-216.