Combinatorial Analysis for Sequence and Spatial Motif Discovery in Short Sequence Fragments
July-September 2010 (vol. 7 no. 3)
pp. 524-536
Ronald Jackups Jr., University of Illinois at Chicago, Chicago
Jie Liang, University of Illinois at Chicago, Chicago
Motifs are overrepresented sequence or spatial patterns appearing in proteins. They often play important roles in maintaining protein stability and in facilitating protein function. When motifs are located in short sequence fragments, such as transmembrane domains that are only 6-20 residues long, and when only very limited data are available, they are difficult to identify. In this study, we introduce permutation-based combinatorial models for assessing the statistical significance of sequence and spatial patterns in short sequences. We show that our method can uncover previously unknown sequence and spatial motifs in $\beta$-barrel membrane proteins and that it outperforms existing methods in detecting statistically significant motifs in this data set. Last, we discuss implications of motif analysis for problems involving short sequences in other families of proteins.

Index Terms:
Motifs, combinatorial models, short sequence, sequence analysis.
Ronald Jackups Jr., Jie Liang, "Combinatorial Analysis for Sequence and Spatial Motif Discovery in Short Sequence Fragments," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 3, pp. 524-536, July-Sept. 2010, doi:10.1109/TCBB.2008.101