3rd Euromicro Workshop on Parallel and Distributed Processing
A hierarchical locality algorithm for NUMA compilation
San Remo, Italy
January 25-27, 1995
ISBN: 0-8186-7031-2
M. O'Boyle, Dept. of Comput. Sci., Manchester Univ., UK
A compiler algorithm that exploits program locality and reduces latency overhead in parallel hierarchical memory machines is described. By applying the appropriate transformation at each level of the hierarchy, the number of nonlocal accesses between processors is minimised. Similarly, the memory structure within a processor is exploited, reducing the amount of communication between local main memory and private cache. The algorithm is based on a compound sequence of transformations that goes beyond the unimodular transformations described in previous work. It can exploit locality in complex array accesses and general iteration spaces. Furthermore, by use of strip mining and a novel use of data alignment, excessive storage for temporaries can be prevented.
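Strip mining, one of the transformations named above, can be illustrated with a minimal sketch. The abstract does not give the paper's algorithm, so this shows only the generic loop transformation: a single loop is split into an outer loop over fixed-size strips and an inner loop within each strip, leaving the result unchanged. The function names and the strip size of 4 are assumptions chosen for illustration.

```python
def scale_plain(a, factor):
    """Reference loop: one flat pass over the whole array."""
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] * factor
    return out


def scale_strip_mined(a, factor, strip=4):
    """Same computation, with the loop strip-mined into blocks of `strip` iterations.

    `strip` is a hypothetical tuning parameter; a compiler would pick it to
    match a level of the memory hierarchy (for example, a cache size).
    """
    n = len(a)
    out = [0] * n
    # Outer loop steps from one strip to the next.
    for start in range(0, n, strip):
        # Inner loop covers one strip; min() handles a final partial strip.
        for i in range(start, min(start + strip, n)):
            out[i] = a[i] * factor
    return out
```

Both functions visit the same iterations in the same order; the strip-mined form merely exposes a block structure that later transformations can reorder or map onto a memory level.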
Index Terms:
parallel machines; distributed memory systems; program compilers; storage management; hierarchical locality algorithm; NUMA compilation; compiler algorithm; program locality; latency overhead; parallel hierarchical memory machines; nonlocal accesses; memory structure; compound sequence; unimodular transformations; complex array accesses; general iteration spaces; strip mining; data alignment
Citation:
M. O'Boyle, "A hierarchical locality algorithm for NUMA compilation," 3rd Euromicro Workshop on Parallel and Distributed Processing (PDP), p. 106, 1995