Finding Protein Domain Boundaries: An Automated, Non-Homology-Based Method
November/December 2005 (vol. 20 no. 6)
pp. 26-33
Brian M. Gurbaxani, Centers for Disease Control and Prevention
Parag Mallick, University of California, Los Angeles
A Bayesian algorithm identifies structural domains in proteins using amino acid sequence information only. This approach differs from other sequence-only approaches, which are typically sequence-homology-based, not fully automated, or dependent on the structure being known. It catalogs "pattern" frequencies (occurrences of groups of amino acids) in a nonredundant database of known protein domains to identify the patterns that appear to signal the beginnings and ends of domains. It then uses those patterns to score new sequences and find their domain boundaries. Inspecting the patterns that appear significant in marking the fronts or backs of domains reveals subtle differences in amino acid use along each domain's length. These patterns might elucidate differences in function between chemically similar amino acids.
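
The catalog-then-score idea in the abstract can be illustrated with a minimal sketch. This is not the authors' Bayesian algorithm; it is a simplified log-odds stand-in, assuming that a "pattern" is a contiguous group of k residues and that only domain starts are modeled. The function names (`kmers`, `train_log_odds`, `score_positions`) and the parameters `k`, `edge`, and `pseudocount` are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return every overlapping k-residue pattern in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train_log_odds(domains, k=2, edge=5, pseudocount=1.0):
    """Catalog pattern frequencies in known domains (assumed setup).

    For each pattern, return the log-odds of seeing it within the first
    `edge` residues of a domain versus anywhere in a domain.
    """
    start_counts = Counter()
    background_counts = Counter()
    for dom in domains:
        start_counts.update(kmers(dom[:edge], k))
        background_counts.update(kmers(dom, k))

    vocab = set(background_counts)
    # Pseudocounts keep patterns never seen at a domain start finite.
    n_start = sum(start_counts.values()) + pseudocount * len(vocab)
    n_bg = sum(background_counts.values()) + pseudocount * len(vocab)
    return {
        p: math.log((start_counts[p] + pseudocount) / n_start)
        - math.log((background_counts[p] + pseudocount) / n_bg)
        for p in vocab
    }


def score_positions(seq, log_odds, k=2, edge=5):
    """Score each position of a new sequence as a candidate domain start.

    The score is the sum of log-odds over the patterns in the window of
    `edge` residues beginning at that position; unseen patterns score 0.
    """
    scores = []
    for i in range(len(seq) - edge + 1):
        window = seq[i:i + edge]
        scores.append(sum(log_odds.get(p, 0.0) for p in kmers(window, k)))
    return scores


# Toy data, invented for illustration; not real protein domains.
toy_domains = ["MKVLAAGG", "MKTLAAGG", "MKVIAAGG"]
toy_log_odds = train_log_odds(toy_domains)
toy_scores = score_positions("GGAAMKVLAAGG", toy_log_odds)
best_start = max(range(len(toy_scores)), key=toy_scores.__getitem__)
```

On this toy input, the highest-scoring position is the one where the start-like residues `MKVLA` begin. A model of domain ends could be built the same way from the last `edge` residues of each domain.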

This article is part of a special issue on data mining in bioinformatics.

Index Terms:
Bayesian algorithm, protein domains, amino acid patterns
Citation:
Brian M. Gurbaxani, Parag Mallick, "Finding Protein Domain Boundaries: An Automated, Non-Homology-Based Method," IEEE Intelligent Systems, vol. 20, no. 6, pp. 26-33, Nov.-Dec. 2005, doi:10.1109/MIS.2005.106