International Symposium on Code Generation and Optimization (CGO'07)
Structure Layout Optimization for Multithreaded Programs
San Jose, California
March 11-March 14, 2007
ISBN: 0-7695-2764-7
Structure layout optimizations seek to improve runtime performance by improving data locality and reuse. The structure layout heuristics for single-threaded benchmarks differ from those for multithreaded applications running on multiprocessor machines, where the effects of false sharing need to be taken into account. In this paper we propose a technique for structure layout transformations for multithreaded applications that simultaneously optimizes for improved spatial locality and reduced false sharing. We develop a semi-automatic tool that produces actual structure layouts for multithreaded programs and outputs the key factors contributing to the layout decisions. We apply this tool to the HP-UX kernel and demonstrate the effects of these transformations for a variety of already highly hand-tuned key structures with different sets of properties. We show that naive heuristics can result in massive performance degradations on such a highly tuned application, while our technique generally avoids those pitfalls. The improved structures produced by our tool improve performance by up to 3.2% over a highly tuned baseline.
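To make the notion of false sharing concrete, the following is a minimal illustrative sketch and not the tool or heuristic described in the paper. It lays out the fields of a structure in declaration order and reports pairs of fields that occupy the same cache line while being touched by different threads, at least one of which writes. The 64-byte line size, the `Field` record, the alignment rule, the thread ids, and the example field names are all assumptions made for illustration.

```python
from dataclasses import dataclass

CACHE_LINE = 64  # bytes; an assumed line size, not a value from the paper


@dataclass(frozen=True)
class Field:
    """Hypothetical description of one structure field and who accesses it."""
    name: str
    size: int                          # size in bytes
    writers: frozenset = frozenset()   # ids of threads that write the field
    readers: frozenset = frozenset()   # ids of threads that only read it


def offsets(fields):
    """Assign byte offsets in declaration order (assumed alignment: min(size, 8))."""
    result, pos = {}, 0
    for f in fields:
        align = min(f.size, 8)
        pos = (pos + align - 1) // align * align  # round up to the alignment
        result[f.name] = pos
        pos += f.size
    return result


def lines_touched(offset, size):
    """Return the set of cache-line indices that a field overlaps."""
    return set(range(offset // CACHE_LINE, (offset + size - 1) // CACHE_LINE + 1))


def conflicts(a, b):
    """True if some thread writes one field while a different thread accesses the other."""
    acc_a, acc_b = a.writers | a.readers, b.writers | b.readers
    return (any(t != u for t in a.writers for u in acc_b)
            or any(t != u for t in b.writers for u in acc_a))


def false_sharing_pairs(fields):
    """List field pairs that share a cache line and are accessed in a conflicting way."""
    off = offsets(fields)
    pairs = []
    for i, a in enumerate(fields):
        for b in fields[i + 1:]:
            shared = lines_touched(off[a.name], a.size) & lines_touched(off[b.name], b.size)
            if shared and conflicts(a, b):
                pairs.append((a.name, b.name))
    return pairs


both = frozenset({0, 1})
# Two per-thread counters packed next to a table that both threads read.
naive = [
    Field("counter_a", 8, writers=frozenset({0})),
    Field("counter_b", 8, writers=frozenset({1})),
    Field("config", 64, readers=both),
]
# The same fields with explicit padding so each counter gets its own line.
padded = [
    Field("counter_a", 8, writers=frozenset({0})),
    Field("pad0", 56),
    Field("counter_b", 8, writers=frozenset({1})),
    Field("pad1", 56),
    Field("config", 64, readers=both),
]

print("naive layout: ", false_sharing_pairs(naive))
print("padded layout:", false_sharing_pairs(padded))
```

In this sketch the padded layout removes every conflicting pair, but it also grows the structure from 80 to 192 bytes, which is the kind of tension between false sharing and spatial locality that the abstract refers to.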
Citation:
Easwaran Raman, Robert Hundt, Sandya Mannarswamy, "Structure Layout Optimization for Multithreaded Programs," International Symposium on Code Generation and Optimization (CGO'07), pp. 271-282, 2007