2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2013)
Dec. 18, 2013 to Dec. 21, 2013
En-Shiun Annie Lee, Systems Design Engineering, University of Waterloo, Waterloo, Canada
Sanderz Fung, Systems Design Engineering, University of Waterloo, Waterloo, Canada
Ho-Yin Sze-To, Computer Science and Engineering, Chinese University of Hong Kong, Shatin, Hong Kong
Andrew K. C. Wong, Systems Design Engineering, University of Waterloo, Waterloo, Canada
Advances in bioinformatics have provided researchers with a large influx of novel sequences, making the analysis of these sequences for inherent biological knowledge crucial. By applying pattern discovery and pattern synthesis to protein family sequences, conserved protein segments can be represented by Aligned Pattern Clusters (APCs), which are richer in statistical association knowledge than probabilistic models. This representation enables us to exploit the co-occurrence of APCs on the same protein sequence to identify functional regions. In this paper, we develop an efficient algorithm that identifies frequently co-occurring patterns using only homologous protein sequences as input. We applied our algorithm to triosephosphate isomerase and ubiquitin for a detailed study. By comparing against the corresponding 3D structures, we found that the discovered co-occurring patterns are close in spatial distance in most cases. We also found that the co-occurrence of patterns is biologically significant. Residues that play important and co-operative roles in the glycolytic pathway of triosephosphate isomerase, and residues that are responsible for ubiquitination and ubiquitin-binding of ubiquitin, are all covered by our co-occurring APCs. These results demonstrate the power of our algorithm to reveal concurrent, distant functional and structural relations in protein sequences based on co-occurrence clusters of APCs.
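The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of counting how often two APCs occur on the same sequence. It assumes each APC is given as a plain list of pattern strings and that occurrence is tested by exact substring match. The names `cooccurrence_counts`, `frequent_pairs`, and `min_support` are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(apc_patterns, sequences):
    """Count, for each pair of APCs, the sequences on which both occur.

    apc_patterns: dict mapping an APC name to its list of pattern strings
                  (an assumed, simplified stand-in for a real APC).
    sequences:    list of protein sequences as strings.
    """
    counts = Counter()
    for seq in sequences:
        # An APC "occurs" on a sequence if any of its patterns is a substring.
        present = sorted(
            name
            for name, patterns in apc_patterns.items()
            if any(p in seq for p in patterns)
        )
        # Every unordered pair of APCs present on this sequence co-occurs once.
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def frequent_pairs(counts, n_sequences, min_support=0.5):
    """Keep pairs whose co-occurrence fraction is at least min_support."""
    return {
        pair: c / n_sequences
        for pair, c in counts.items()
        if c / n_sequences >= min_support
    }
```

A real implementation would likely need approximate matching and a statistical significance test rather than a fixed support threshold; both are left out here for brevity.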
Clustering algorithms, Amino acids, Indexes, Educational institutions, Protein sequence
E. A. Lee, S. Fung, H. Sze-To and A. K. C. Wong, "Confirming biological significance of co-occurrence clusters of aligned pattern clusters," 2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shanghai, China, 2013, pp. 422-427.