12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'04)
Napa, California
April 20-April 23
ISBN: 0-7695-2230-0
Emre Özer, Trinity College, Dublin, Ireland
Andy P. Nisbet, Trinity College, Dublin, Ireland
David Gregg, Trinity College, Dublin, Ireland
This paper discusses the balance between loop-level parallelism and clock rate in enhancing the performance of DSP applications fully implemented on FPGAs. Loop-level parallelism reduces the total cycle count of an application at the cost of increased routing complexity, which often results in lower clock rates. We analyze loops that can be fully parallelized and show that controlling the number of parallel iterations can achieve better performance than running the loops fully in parallel. We have implemented loop parallelism in our compilation framework and fine-tuned it to enhance the performance of DSP applications that target the Xilinx Virtex-II FPGA. Our experimental results show that it is possible to reach a performance equilibrium point where the total number of cycles and the overall clock frequency can be adjusted to maximize the overall performance of an application.
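The trade-off described in the abstract can be illustrated with a small sketch: execution time is the cycle count divided by the clock frequency, so adding parallel iterations helps only while the cycle savings outweigh the clock-rate loss. The clock model below (`hypothetical_clock_mhz`, with its `base_mhz` and `penalty` parameters) is purely an assumption made for illustration. It is not taken from the paper and does not describe measured Virtex-II behavior.

```python
import math


def hypothetical_clock_mhz(parallel, base_mhz=150.0, penalty=0.01):
    """Toy clock model (ASSUMPTION, not from the paper).

    The achievable clock rate falls as more iterations run in parallel,
    standing in for the routing complexity the abstract mentions.
    """
    return base_mhz / (1.0 + penalty * (parallel - 1) ** 2)


def execution_time_us(iterations, parallel, cycles_per_iter):
    """Loop execution time in microseconds for a given parallel factor."""
    # Each batch runs `parallel` iterations at once.
    total_cycles = math.ceil(iterations / parallel) * cycles_per_iter
    # cycles / MHz gives microseconds.
    return total_cycles / hypothetical_clock_mhz(parallel)


def best_parallel_factor(iterations, cycles_per_iter, candidates):
    """Return the candidate parallel factor with the lowest execution time."""
    return min(
        candidates,
        key=lambda p: execution_time_us(iterations, p, cycles_per_iter),
    )


if __name__ == "__main__":
    factors = [1, 2, 4, 8, 16, 32, 64]
    for p in factors:
        print(p, execution_time_us(64, p, cycles_per_iter=4))
    print("best factor:", best_parallel_factor(64, 4, factors))
```

Under this toy model, a 64-iteration loop runs fastest with 8 parallel iterations rather than all 64, which mirrors the kind of equilibrium point the abstract describes. With a different clock model the optimum would move, so the numbers carry no weight beyond the illustration.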
Citation:
Emre Özer, Andy P. Nisbet, David Gregg, "Fine-Tuning Loop-Level Parallelism for Increasing Performance of DSP Applications on FPGAs," fccm, pp.273-274, 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'04), 2004