System-level monitoring of floating-point performance to improve effective system utilization
State of the Practice Reports (SC '11)
By David L. Hart, Davide Del Vento, Richard Valent, Rory Kelly, Si Liu, Siddhartha S. Ghosh, Thomas Engel
Issue Date: November 2011
NCAR's Bluefire supercomputer is instrumented with a set of low-overhead processes that continually monitor the floating-point counters of its 3,840 batch-compute cores. We extract performance numbers for each batch job by correlating the data from corresp...