2006 International Conference on Parallel Processing (ICPP'06)
Vector Lane Threading
Columbus, Ohio
August 14-August 18
ISBN: 0-7695-2636-5
Suzanne Rivoire, Stanford University, USA
Rebecca Schultz, Stanford University, USA
Tomofumi Okuda, Sony Corporation, Japan
Christos Kozyrakis, Stanford University, USA
Multi-lane vector processors achieve excellent computational throughput for programs with high data-level parallelism (DLP). However, application phases without significant DLP are unable to fully utilize the datapaths in the vector lanes. In this paper, we propose vector lane threading (VLT), an architectural enhancement that allows idle vector lanes to run short-vector or scalar threads. VLT-enhanced vector hardware can exploit both data-level and thread-level parallelism to achieve higher performance. We investigate implementation alternatives for VLT, focusing mostly on the instruction issue bandwidth requirements. We demonstrate that VLT's area overhead is small. For applications with short vectors, VLT leads to additional speedup of 1.4 to 2.3 over the base vector design. For scalar threads, VLT outperforms a 2-way CMP design by a factor of two. Overall, VLT allows vector processors to reach high computational throughput for a wider range of parallel programs and become a competitive alternative to CMP systems.
Citation:
Suzanne Rivoire, Rebecca Schultz, Tomofumi Okuda, Christos Kozyrakis, "Vector Lane Threading," 2006 International Conference on Parallel Processing (ICPP'06), pp. 55-64, 2006