2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2014)
Belfast, United Kingdom
Nov. 2, 2014 to Nov. 5, 2014
ISBN: 978-1-4799-5669-2
pp: 125-130
En-Shiun Annie Lee, Systems Design Engineering, University of Waterloo, Waterloo, Canada
Kwong-Sak Leung, Computer Science and Engineering, Chinese University of Hong Kong, Shatin, Hong Kong
Ho-Yin Sze-To, Computer Science and Engineering, Chinese University of Hong Kong, Shatin, Hong Kong
Terrence Chi-Kong Lau, Biomedical Sciences, City University of Hong Kong, Kowloon, Hong Kong
Man-Hon Wong, Computer Science and Engineering, Chinese University of Hong Kong, Shatin, Hong Kong
Andrew K. C. Wong, Systems Design Engineering, University of Waterloo, Waterloo, Canada
ABSTRACT
Understanding binding cores is of fundamental importance in deciphering protein-DNA (TF-TFBS) binding and gene regulation. Variations (or mutations) in binding cores are ubiquitous and affect binding specificity to different degrees. To reduce the need for expensive experiments, we have developed a new method that discovers binding cores directly from sequence data and supports the study of the effects of their variations. Although existing computational methods have produced satisfactory TF-TFBS binding cores, these cores are only one-to-one mappings with no site-specific information on residue/nucleotide variations, and they overlap to a large extent. In this study, we propose a new representation for modeling TF-TFBS binding with variants, known as TF-TFBS Co-Supportive Aligned Pattern Clusters (APCs), which is more compact, carries more detail on site-specific variants, and is biologically more intuitive for analysis. We have also developed an algorithm to discover TF-TFBS Co-Supportive APCs that captures binding cores at higher precision and with a much faster runtime (≥1600X) compared to other methods. The variants in TF-TFBS Co-Supportive APCs are also statistically analyzed and shown to assist homology modeling in synthesizing new biological knowledge.
INDEX TERMS
Proteins, DNA, Three-dimensional displays, Educational institutions, Amino acids, Pattern matching
CITATION

E. A. Lee, K. Leung, H. Sze-To, T. C. Lau, M. Wong and A. K. Wong, "Discovering protein-DNA binding cores by aligned pattern clustering," 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Belfast, United Kingdom, 2014, pp. 125-130.
doi:10.1109/BIBM.2014.6999140