18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Workshop 14
Memory Performance Model for Loops and Kernels on Power3 Processors
Santa Fe, New Mexico
April 26-30, 2004
ISBN: 0-7695-2132-0
Wayne Pfeiffer, San Diego Supercomputer Center
A performance model is developed for loops and kernels on Power3 processors whose performance is limited by memory access. The output of the model is the time delay arising from cache and TLB misses. The input variables are the miss rates of each cache and the TLB, and the model parameters are the corresponding miss penalties. Load misses are treated separately from store misses and typically have smaller penalties because of prefetching. The parameters were obtained by fitting data from simple test loops measured with a hardware performance monitor. Results are presented for two types of Power3 processor running in serial and for one of those processor types running in parallel. For codes limited by store misses, the model fits the data very well. For codes limited by load misses, the model shows greater variability relative to the data, presumably because of its limited treatment of prefetching.
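The structure described in the abstract can be illustrated with a minimal Python sketch. It assumes the delay is a plain sum of miss rate times miss penalty over each cache level and the TLB, with loads and stores kept separate; the abstract does not state the exact functional form, so this linear form is an assumption. The level names (`L1`, `L2`, `TLB`), the function `memory_delay`, the units (misses per loop iteration, cycles per miss), and all numeric values are hypothetical placeholders, not parameters from the paper.

```python
# Hypothetical miss penalties in cycles per miss (placeholder values, not fitted
# Power3 parameters). Load penalties are set below store penalties only to mirror
# the abstract's remark that prefetching typically reduces load-miss penalties.
LOAD_PENALTY = {"L1": 8.0, "L2": 32.0, "TLB": 24.0}
STORE_PENALTY = {"L1": 16.0, "L2": 64.0, "TLB": 24.0}


def memory_delay(load_miss_rate, store_miss_rate,
                 load_penalty=LOAD_PENALTY, store_penalty=STORE_PENALTY):
    """Return the modeled delay in cycles per loop iteration.

    Assumed form: sum over levels of (miss rate * miss penalty),
    evaluated separately for loads and stores and then added.
    Miss rates are given as misses per loop iteration (an assumed unit).
    """
    load_delay = sum(rate * load_penalty[level]
                     for level, rate in load_miss_rate.items())
    store_delay = sum(rate * store_penalty[level]
                      for level, rate in store_miss_rate.items())
    return load_delay + store_delay


# Made-up miss rates for a hypothetical test loop.
load_miss_rate = {"L1": 0.25, "L2": 0.125, "TLB": 0.0}
store_miss_rate = {"L1": 0.25, "L2": 0.125, "TLB": 0.0}

delay = memory_delay(load_miss_rate, store_miss_rate)
print(f"modeled delay: {delay} cycles per iteration")
```

In a sketch of this kind, the penalties play the role of the fitted parameters: measured miss rates from a hardware performance monitor would be the inputs, and the penalty values would be chosen to match measured delays on the test loops.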
Citation:
Wayne Pfeiffer, "Memory Performance Model for Loops and Kernels on Power3 Processors," ipdps, vol. 15, pp.252a, 18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Workshop 14, 2004