Issue No.06 - November/December (2009 vol.15)
pp: 881-888
Cydney B. Nielsen, BC Cancer Agency, Genome Sciences Centre
Shaun D. Jackman, BC Cancer Agency, Genome Sciences Centre
Inanç Birol, BC Cancer Agency, Genome Sciences Centre
Steven J.M. Jones, BC Cancer Agency, Genome Sciences Centre
One bottleneck in large-scale genome sequencing projects is reconstructing the full genome sequence from the short subsequences produced by current technologies. The final stages of the genome assembly process inevitably require manual inspection of data inconsistencies and could be greatly aided by visualization. This paper presents our design decisions in translating key data features, identified through discussions with analysts, into a concise visual encoding. Current visualization tools in this domain focus on local sequence errors, making high-level inspection of the assembly difficult if not impossible. We present a novel interactive graph display, ABySS-Explorer, that emphasizes the global assembly structure while also integrating salient data features such as sequence length. Our tool replaces manual and, in some cases, pen-and-paper-based analysis tasks, and we discuss how user feedback was incorporated into iterative design refinements. Finally, we touch on applications of this representation that were not considered in our initial design phase, which suggests that the encoding generalizes to other DNA sequence data.
Bioinformatics visualization, design study, DNA sequence, genome assembly
Cydney B. Nielsen, Shaun D. Jackman, Inanç Birol, Steven J.M. Jones, "ABySS-Explorer: Visualizing Genome Sequence Assemblies", IEEE Transactions on Visualization & Computer Graphics, vol.15, no. 6, pp. 881-888, November/December 2009, doi:10.1109/TVCG.2009.116