A Cluster-on-a-Chip Architecture for High-Throughput Phylogeny Search
Found in: IEEE Transactions on Parallel and Distributed Systems
By Tiffany M. Mintz, Jason D. Bakos
Issue Date:April 2012
pp. 579-588
In this paper, we describe an FPGA-based coprocessor architecture that performs a high-throughput branch-and-bound search of the space of phylogenetic trees corresponding to the number of input taxa. Our coprocessor architecture is designed to accelerate m...
FPGA Acceleration of Gene Rearrangement Analysis
Found in: Field-Programmable Custom Computing Machines, Annual IEEE Symposium on
By Jason D. Bakos
Issue Date:April 2007
pp. 85-94
In this paper we present our work toward FPGA acceleration of phylogenetic reconstruction, a type of analysis that is commonly performed in the fields of systematic biology and comparative genomics. In our initial study, we have targeted a specific applica...
A Sparse Matrix Personality for the Convey HC-1
Found in: Field-Programmable Custom Computing Machines, Annual IEEE Symposium on
By Krishna K. Nagar, Jason D. Bakos
Issue Date:May 2011
pp. 1-8
In this paper we describe a double precision floating point sparse matrix-vector multiplier (SpMV) and its performance as implemented on a Convey HC-1 reconfigurable computer. The primary contributions of this work are a novel streaming reduction architect...
Sparse matrix-vector multiply on the Texas Instruments C6678 Digital Signal Processor
Found in: 2013 IEEE 24th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
By Yang Gao, Jason D. Bakos
Issue Date:June 2013
pp. 168-174
The Texas Instruments (TI) C6678 “Shannon” is TI's most recently released Digital Signal Processor (DSP). Although its original purpose was voice and video encoding and decoding, it may have the potential to become a practical coprocessor for scientific co...
Memory Access Scheduling on the Convey HC-1
Found in: 2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
By Zheming Jin, Jason D. Bakos
Issue Date:April 2013
pp. 237
In this paper we describe a technique for scheduling memory accesses to improve effective memory bandwidth on the Convey HC-1 platform.
GPU Acceleration of Pyrosequencing Noise Removal
Found in: 2012 Symposium on Application Accelerators in High Performance Computing (SAAHPC)
By Yang Gao, Jason D. Bakos
Issue Date:July 2012
pp. 94-101
Amplicon Noise [1], an updated version of Pyronoise [2], is a tool for removing noise from metagenomic data recorded by a 454 pyrosequencer. Amplicon Noise has been shown to be effective in reducing overestimation of operational taxonomic units (OTUs) and chim...
High-Performance Heterogeneous Computing with the Convey HC-1
Found in: Computing in Science and Engineering
By Jason D. Bakos
Issue Date:November 2010
pp. 80-87
Unlike other socket-based reconfigurable coprocessors, the Convey HC-1 contains nearly 40 field-programmable gate arrays, scatter-gather memory modules, a high-capacity crossbar switch, and a fully coherent memory system.
Exploiting Matrix Symmetry to Improve FPGA-Accelerated Conjugate Gradient
Found in: Field-Programmable Custom Computing Machines, Annual IEEE Symposium on
By Jason D. Bakos, Krishna K. Nagar
Issue Date:April 2009
pp. 223-226
In this paper we describe a new approach for accelerating the Conjugate Gradient (CG) method using an FPGA co-processor. As in previous approaches, our co-processor performs a double-precision sparse matrix-vector multiplication. However, our implementatio...
Lightweight Error Correction Coding for System-Level Interconnects
Found in: IEEE Transactions on Computers
By Jason D. Bakos, Donald M. Chiarulli, Steven P. Levitan
Issue Date:March 2007
pp. 289-304
No summary available.
A Reconfigurable Distributed Computing Fabric Exploiting Multilevel Parallelism
Found in: Field-Programmable Custom Computing Machines, Annual IEEE Symposium on
By Charles L. Cathey, Jason D. Bakos, Duncan A. Buell
Issue Date:April 2006
pp. 121-130
This paper presents a novel reconfigurable data flow processing architecture that promises high performance by explicitly targeting both fine- and coarse-grained parallelism. This architecture is based on multiple FPGAs organized in a scalable direct netwo...
An integrated reduction technique for a double precision accumulator
Found in: Proceedings of the Third International Workshop on High-Performance Reconfigurable Computing Technology and Applications (HPRCTA '09)
By Jason D. Bakos, Krishna K. Nagar, Yan Zhang
Issue Date:November 2009
pp. 11-18
The accumulation operation, $A_{n+1} = A_n + X$, is perhaps one of the most fundamental and widely-used operations in numerical mathematics and digital signal processing. However, designing double-precision floating-point accumulators presents a unique set of ch...