18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Workshop 14
Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe
Santa Fe, New Mexico
April 26-April 30, 2004
ISBN: 0-7695-2132-0
Gorden Griem, Lawrence Berkeley National Laboratory
Leonid Oliker, Lawrence Berkeley National Laboratory
John Shalf, Lawrence Berkeley National Laboratory
Katherine Yelick, Lawrence Berkeley National Laboratory and University of California at Berkeley
The gap between peak and delivered performance for scientific applications running on microprocessor-based systems has grown considerably in recent years. The inability to achieve the desired performance even on a single processor is often attributed to an inadequate memory system, but without identification or quantification of a specific bottleneck. In this work, we use an adaptable synthetic benchmark to isolate application characteristics that cause a significant drop in performance, giving application programmers and architects information about possible optimizations. Our adaptable probe, called sqmat, uses only four parameters to capture key characteristics of scientific workloads: working-set size, computational intensity, indirection, and irregularity. This paper describes the implementation of sqmat and uses its tunable parameters to evaluate four leading 64-bit microprocessors that are popular building blocks for current high performance systems: Intel Itanium2, AMD Opteron, IBM Power3, and IBM Power4.
Citation:
Gorden Griem, Leonid Oliker, John Shalf, Katherine Yelick, "Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe," ipdps, vol. 15, p. 255a, 18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Workshop 14, 2004