Global Visualization and Alignments of Whole Bacterial Genomes
July-September 2003 (vol. 9 no. 3)
pp. 361-377

Abstract—We present a novel visualization technique to align whole bacterial genomes with millions of nucleotides. Our basic design combines the descriptive power of pixel-based visualizations with the interpretative strength of digital image-processing filters. The innovative use of pixel enhancement techniques on pixel-based visualizations brings out the best of the recursive data patterns and further enhances the effectiveness of the visualization techniques. The result is a fast, versatile, and cost-effective analysis tool to reveal hidden structures that might lead to the discovery of functional identifications as well as phenotypic changes of whole bacterial genomes. Nine different whole bacterial genomes obtained from public genome banks are used to demonstrate our designs and prove their viability. Although the design of the new visualization technique is targeted at analyzing genomic sequences, we show with examples that it can be used to study other types of sequential data sets with a priori orders.

Index Terms:
Visualization, bioinformatics, whole genome alignment, fractal curve, gestalt psychology, digital image processing.
Pak Chung Wong, Kwong Kwok Wong, Harlan Foote, Jim Thomas, "Global Visualization and Alignments of Whole Bacterial Genomes," IEEE Transactions on Visualization and Computer Graphics, vol. 9, no. 3, pp. 361-377, July-Sept. 2003, doi:10.1109/TVCG.2003.1207444