Sixth IEEE International Conference on Computer and Information Technology (CIT'06)
Matrix Transpose on 2D Torus Array Processor
Seoul, Korea
September 20-September 22
ISBN: 0-7695-2687-X
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CIT.2006.117
Previously, we represented the index space of the (n×n)-matrix multiply-add problem C = C + A×B as a 3D torus, where A, B, and C are rolled along the corresponding axes of the index space. All optimal 2D data allocations (resulting from projection) that solve the problem on the n×n torus array processor in n multiply-add-roll steps were obtained. In this paper, we formulate the operations needed to align both the data before computing and the results after computing as matrix multiply-add problems. These alignment operations are combined with the optimal data allocations that solve the matrix multiply-add problem to propose new algorithms that transpose an n×n matrix on the n×n torus array processor in O(n) multiply-add-roll steps. Using the proposed algorithms, we show different approaches to solving the transposed matrix multiply-add problem, C = C + A^T×B^T, on the 2D torus array processor.
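As a rough illustration of what "n multiply-add-roll steps" on an n×n torus can look like, the sketch below simulates the well-known Cannon-style scheme in plain Python. It is not the data allocation or the transpose algorithm proposed in the paper; the function names (`torus_multiply_add`, `transpose`) and the choice to roll A along rows and B along columns are assumptions made only for this example. Each list cell stands in for one processing element, and cyclic list slicing stands in for the torus wraparound links.

```python
def torus_multiply_add(A, B, C):
    """Return C + A*B using n multiply-add-roll steps (Cannon-style sketch)."""
    n = len(A)
    # Initial alignment (skew): row i of A is rolled left by i,
    # column j of B is rolled up by j.
    a = [[A[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [row[:] for row in C]
    for _ in range(n):
        # Multiply-add: every cell (i, j) updates its local C element.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Roll: A moves one cell left, B moves one cell up, with wraparound.
        a = [row[1:] + row[:1] for row in a]
        b = b[1:] + b[:1]
    return c


def transpose(M):
    """Plain sequential transpose, used here only to check results."""
    return [list(col) for col in zip(*M)]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    Z = [[0, 0], [0, 0]]
    print(torus_multiply_add(A, B, Z))
    # The transposed problem can be checked against the identity
    # A^T * B^T = (B * A)^T.
    lhs = torus_multiply_add(transpose(A), transpose(B), Z)
    rhs = transpose(torus_multiply_add(B, A, Z))
    print(lhs == rhs)
```

The final check relies only on the standard identity A^T×B^T = (B×A)^T; how the paper itself performs the transpose on the torus is not reproduced here.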
Citation:
Ahmed S. Zekri, Stanislav G. Sedukhin, "Matrix Transpose on 2D Torus Array Processor," cit, pp.45, Sixth IEEE International Conference on Computer and Information Technology (CIT'06), 2006