A Cluster-on-a-Chip Architecture for High-Throughput Phylogeny Search
April 2012 (vol. 23 no. 4)
pp. 579-588
Tiffany M. Mintz, University of South Carolina, Columbia
Jason D. Bakos, University of South Carolina, Columbia
In this paper, we describe an FPGA-based coprocessor architecture that performs a high-throughput branch-and-bound search of the space of phylogenetic trees for a given number of input taxa. The coprocessor is designed to accelerate maximum-parsimony phylogeny reconstruction for gene-order and sequence data and is amenable to both exhaustive and heuristic tree searches. The architecture exposes coarse-grain parallelism by dividing the search space among parallel processing elements (PEs), and each PE exposes fine-grain memory parallelism for its lower-bound computation, the kernel computation performed by each PE. Inter-PE communication is performed entirely on-chip. For maximum-parsimony reconstruction on gene-order data, the coprocessor achieves a 40X improvement over software in search throughput, which corresponds to a 14X end-to-end application improvement once all communication and system overheads are included.
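To make the idea of dividing a branch-and-bound tree search among PEs concrete, the following is a minimal software sketch, not the architecture described in the paper. It assumes sequence data scored with Fitch parsimony, enumerates unrooted binary trees by stepwise taxon addition, and uses the parsimony length of a partial tree as the lower bound (adding a taxon can never shorten a tree). The PEs are simulated sequentially and share a single best-score bound. All names (`fitch_score`, `insert_taxon`, `branch_and_bound`, `partitioned_search`) are hypothetical and chosen for illustration only.

```python
def fitch_score(edges, seqs):
    """Fitch parsimony length of an unrooted binary tree given as an edge list.

    Leaves are numbered 0..k-1 and index into seqs; internal nodes use ids >= len(seqs).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    def subtree(node, parent, site):
        # Returns (candidate state set, number of changes) for the subtree at node.
        children = [c for c in adj[node] if c != parent]
        if not children:  # leaf: its observed character state
            return {seqs[node][site]}, 0
        (a, ca), (b, cb) = (subtree(c, node, site) for c in children)
        if a & b:
            return a & b, ca + cb
        return a | b, ca + cb + 1

    start = adj[0][0]  # root the traversal at the neighbor of leaf 0
    total = 0
    for site in range(len(seqs[0])):
        states, changes = subtree(start, 0, site)
        total += changes + (seqs[0][site] not in states)
    return total


def insert_taxon(edges, edge_index, taxon, n_taxa):
    """Attach leaf `taxon` to the middle of edges[edge_index] via a new internal node."""
    u, v = edges[edge_index]
    w = n_taxa + taxon - 2  # unique id for the new internal node
    return edges[:edge_index] + edges[edge_index + 1:] + [(u, w), (w, v), (w, taxon)]


def branch_and_bound(edges, next_taxon, seqs, best):
    """Depth-first search; best = [best score so far, best tree so far]."""
    bound = fitch_score(edges, seqs)  # lower bound on every completion of this partial tree
    if bound >= best[0]:
        return  # prune: no completion can beat the best tree found so far
    if next_taxon == len(seqs):
        best[0], best[1] = bound, edges
        return
    for i in range(len(edges)):
        branch_and_bound(insert_taxon(edges, i, next_taxon, len(seqs)),
                         next_taxon + 1, seqs, best)


def partitioned_search(seqs, n_pes):
    """Split the top-level subproblems round-robin among n_pes simulated PEs.

    Requires at least 4 taxa. The shared `best` list is an assumption of this sketch.
    """
    n = len(seqs)
    root = [(0, n), (1, n), (2, n)]  # the single unrooted tree on taxa 0, 1, 2
    subproblems = [insert_taxon(root, i, 3, n) for i in range(len(root))]
    best = [float("inf"), None]
    for pe in range(n_pes):
        for sub in subproblems[pe::n_pes]:  # this PE's share of the search space
            branch_and_bound(sub, 4, seqs, best)
    return best
```

Calling `partitioned_search(seqs, 3)` on a list of equal-length strings returns a score and one optimal tree as an edge list. The split here is only one level deep (three subproblems); a finer split, for example over the placements of the first two added taxa, would give more units of work to balance across PEs.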

Index Terms:
Biology and genetics, distributed systems, parallelism and concurrency, reconfigurable hardware.
Citation:
Tiffany M. Mintz, Jason D. Bakos, "A Cluster-on-a-Chip Architecture for High-Throughput Phylogeny Search," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 4, pp. 579-588, April 2012, doi:10.1109/TPDS.2010.191