Issue No. 02 - April-June (2008 vol. 5)
Many statistical measures and algorithmic techniques have been proposed for studying residue coupling in protein families. Generally speaking, two residue positions are considered coupled if, in the sequence record, some of their amino acid type combinations are significantly more common than others. While the proposed approaches have proven useful in finding and describing coupling, a significant missing component is a formal probabilistic model that explicates and compactly represents the coupling, integrates information about sequence, structure, and function, and supports inferential procedures for analysis, diagnosis, and prediction.

We present an approach to learning and using probabilistic graphical models of residue coupling. These models capture significant conservation and coupling constraints observable in a multiply-aligned set of sequences. Our approach can place a structural prior on considered couplings, so that all identified relationships have direct mechanistic explanations. It can also incorporate information about functional classes, and thereby learn a differential graphical model that distinguishes constraints common to all classes from those unique to individual classes. Such differential models separately account for class-specific conservation and family-wide coupling, two different sources of sequence covariation. They are then able to perform interpretable functional classification of new sequences, explaining classification decisions in terms of the underlying conservation and coupling constraints. We apply our approach in studies of both G protein-coupled receptors and PDZ domains, identifying and analyzing family-wide and class-specific constraints, and performing functional classification. The results demonstrate that graphical models of residue coupling provide a powerful tool for uncovering, representing, and utilizing significant sequence-structure-function relationships in protein families.
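The notion of coupling described above (some amino acid combinations at two positions being much more common than others) can be illustrated with a small sketch. The code below computes the mutual information between two columns of a toy alignment. Mutual information is one commonly used covariation measure and is used here only as an assumption for illustration; it is not presented as the statistic or the graphical-model learning procedure of the paper. The function names and the four-sequence alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column(alignment, k):
    """Return the residues found at position k of each aligned sequence."""
    return [seq[k] for seq in alignment]


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    High values mean the residue at one position is informative about
    the residue at the other, i.e. the positions covary.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    pair_counts = Counter(zip(col_i, col_j))  # joint counts of (a, b)
    counts_i = Counter(col_i)                 # marginal counts at position i
    counts_j = Counter(col_j)                 # marginal counts at position j
    mi = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions
        ratio = c * n / (counts_i[a] * counts_j[b])
        mi += p_ab * log2(ratio)
    return mi


# Hypothetical toy alignment: positions 0 and 1 vary together,
# position 2 is fully conserved.
toy_alignment = ["AKLE", "AKLE", "GRLD", "GRLE"]

mi_coupled = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
mi_conserved = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
```

In this toy case the perfectly covarying pair scores higher than the pair involving the conserved column, which carries no covariation signal. A raw score like this does not by itself establish significance, and it does not separate class-specific conservation from family-wide coupling, which is the distinction the differential models in the abstract are meant to make.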
Correlated mutations, graphical models, evolutionary covariation, sequence-structure-function relationships, functional classification
John Thomas, Naren Ramakrishnan, Chris Bailey-Kellogg, "Graphical Models of Residue Coupling in Protein Families", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 5, no. 2, pp. 183-197, April-June 2008, doi:10.1109/TCBB.2007.70225