Issue No.04 - April (2012 vol.23)

pp: 579-588

Tiffany M. Mintz, University of South Carolina, Columbia

Jason D. Bakos, University of South Carolina, Columbia

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2010.191

ABSTRACT

In this paper, we describe an FPGA-based coprocessor architecture that performs a high-throughput branch-and-bound search of the space of phylogenetic trees corresponding to the number of input taxa. The coprocessor is designed to accelerate maximum-parsimony phylogeny reconstruction for gene-order and sequence data and is amenable to both exhaustive and heuristic tree searches. The architecture exposes coarse-grain parallelism by dividing the search space among parallel processing elements (PEs), and each PE exposes fine-grain memory parallelism for its lower-bound computation, the kernel computation performed by each PE. Inter-PE communication is performed entirely on-chip. When used for maximum-parsimony reconstruction from gene-order data, the coprocessor achieves a 40X improvement over software in search throughput, corresponding to a 14X end-to-end application improvement when all communication and system overheads are included.
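To give a sense of why the tree space calls for both pruning and parallel search, the sketch below counts the unrooted binary trees on n labeled taxa, using the standard result (2n-5)!!, and shows one way a set of top-level subproblems could be spread over several PEs. It is only an illustration: the function names and the even split are assumptions made here, not the partitioning scheme used by the coprocessor.

```python
from math import prod
from typing import List


def num_unrooted_trees(n_taxa: int) -> int:
    """Count distinct unrooted binary trees on n_taxa labeled leaves: (2n-5)!!."""
    if n_taxa < 3:
        return 1
    # Taxon k (k = 4..n) can attach to any of the 2k-5 edges
    # of a tree that already holds k-1 taxa.
    return prod(2 * k - 5 for k in range(4, n_taxa + 1))


def split_among_pes(n_items: int, n_pes: int) -> List[int]:
    """Hypothetical even split of n_items subproblems; returns the count per PE."""
    base, extra = divmod(n_items, n_pes)
    return [base + (1 if i < extra else 0) for i in range(n_pes)]


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, num_unrooted_trees(n))
    # Spread the 15 partial trees on five taxa over four hypothetical PEs.
    print(split_among_pes(num_unrooted_trees(5), 4))
```

The count grows super-exponentially with the number of taxa, which is why a lower bound that discards whole subtrees of the search matters as much as raw throughput.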

INDEX TERMS

Biology and genetics, distributed systems, parallelism and concurrency, reconfigurable hardware.

CITATION

Tiffany M. Mintz, Jason D. Bakos, "A Cluster-on-a-Chip Architecture for High-Throughput Phylogeny Search," *IEEE Transactions on Parallel & Distributed Systems*, vol. 23, no. 4, pp. 579-588, April 2012, doi:10.1109/TPDS.2010.191