Issue No.05 - Sept.-Oct. (2013 vol.10)
pp: 1234-1240
Liliana D. Florea, Johns Hopkins University School of Medicine, Baltimore
Steven L. Salzberg, Johns Hopkins School of Medicine and Bloomberg School of Public Health, Baltimore
ABSTRACT
Next-generation sequencing technologies provide unprecedented power to explore the repertoire of genes and their alternative splice variants, which collectively define the transcriptome of a species, in great detail. However, assembling the short reads into full-length gene and transcript models presents significant computational challenges. We review current algorithms for assembling transcripts and genes from next-generation sequencing reads aligned to a reference genome, and lay out areas for future improvements.
INDEX TERMS
Decision support systems, Biological system modeling, Genetics, Algorithm design and analysis, medicine and science, Algorithms, biology and genetics, computer applications
CITATION
Liliana D. Florea, Steven L. Salzberg, "Genome-Guided Transcriptome Assembly in the Age of Next-Generation Sequencing", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.10, no. 5, pp. 1234-1240, Sept.-Oct. 2013, doi:10.1109/TCBB.2013.140