loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06)
An Improved Systolic Architecture for LU Decomposition
Steamboat Springs, Colorado, USA
September 11-September 13
ISBN: 0-7695-2682-9
DaeGon Kim, Colorado State University
Sanjay V. Rajopadhye, Colorado State University
LU-Decomposition is a classic problem for which many systolic array implementations have been proposed, the best of which takes 3n-3 time on n2/2 PEs, for a dense, n?n matrix. In this paper, we first give a proof that if only nearest neighbor communication is allowed, this time is a lower bound. We then generalize it to 2n + n/k - 3 time if one allows k-bounded broadcasts (i.e., if it takes m/k time steps to broadcast a value to m destination nodes). We also present a new architecture with this improved execution time, which uses n2/2 PEs, each one consisting of two multiplier-subtractor units, but active only on alternate cycles. This leads to a speedup and efficiency of kn2/(6k+3) and 2k/(6k + 3) respectively. For k = 1, our proposed architecture achieves the performance of the best previously known systolic array implementation. Special cases of our results include similar improvements to algorithms for solving (upper and lower) triangular linear systems by (forward and backward) substitution.
Citation:
DaeGon Kim, Sanjay V. Rajopadhye, "An Improved Systolic Architecture for LU Decomposition," asap, pp.231-238, IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06), 2006
Usage of this product signifies your acceptance of the Terms of Use.