2009 IEEE International Symposium on Performance Analysis of Systems and Software (2009)
Boston, MA USA
Apr. 26, 2009 to Apr. 28, 2009
ISBN: 978-1-4244-4184-6
TABLE OF CONTENTS

Differentiating the roles of IR measurement and simulation for power and temperature-aware design (PDF)

Wei Huang , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Kevin Skadron , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Sudhanva Gurumurthi , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Robert J. Ribando , Mechanical and Aerospace Engineering, University of Virginia, Charlottesville, 22904, USA
Mircea R. Stan , Electrical and Computer Engineering, University of Virginia, Charlottesville, 22904, USA
pp. 1-10

User- and process-driven dynamic voltage and frequency scaling (PDF)

Bin Lin , Department of EECS, Northwestern University, USA
Arindam Mallik , Department of EECS, Northwestern University, USA
Peter Dinda , Department of EECS, Northwestern University, USA
Gokhan Memik , Department of EECS, Northwestern University, USA
Robert Dick , Department of EECS, Northwestern University, USA
pp. 11-22

Accuracy of performance counter measurements (PDF)

Dmitrijs Zaparanuks , Faculty of Informatics, University of Lugano, Switzerland
Milan Jovic , Faculty of Informatics, University of Lugano, Switzerland
Matthias Hauswirth , Faculty of Informatics, University of Lugano, Switzerland
pp. 23-32

GARNET: A detailed on-chip network model inside a full-system simulator (PDF)

Niket Agarwal , Department of Electrical Engineering, Princeton University, NJ, 08544, USA
Tushar Krishna , Department of Electrical Engineering, Princeton University, NJ, 08544, USA
Li-Shiuan Peh , Department of Electrical Engineering, Princeton University, NJ, 08544, USA
Niraj K. Jha , Department of Electrical Engineering, Princeton University, NJ, 08544, USA
pp. 33-42

Cetra: A trace and analysis framework for the evaluation of Cell BE systems (PDF)

Julio Merino , Departament d'Arquitectura de Computadors. Universitat Politècnica de Catalunya, C/Jordi Girona 1-3, Campus Nord. 08038 Barcelona, Spain
Lluc Alvarez , Departament d'Arquitectura de Computadors. Universitat Politècnica de Catalunya, C/Jordi Girona 1-3, Campus Nord. 08038 Barcelona, Spain
Marisa Gil , Departament d'Arquitectura de Computadors. Universitat Politècnica de Catalunya, C/Jordi Girona 1-3, Campus Nord. 08038 Barcelona, Spain
Nacho Navarro , Departament d'Arquitectura de Computadors. Universitat Politècnica de Catalunya, C/Jordi Girona 1-3, Campus Nord. 08038 Barcelona, Spain
pp. 43-52

Zesto: A cycle-level simulator for highly detailed microarchitecture exploration (PDF)

Gabriel H. Loh , Georgia Institute of Technology, College of Computing, USA
Samantika Subramaniam , Georgia Institute of Technology, College of Computing, USA
Yuejian Xie , Georgia Institute of Technology, College of Computing, USA
pp. 53-64

Lonestar: A suite of parallel irregular programs (PDF)

Milind Kulkarni , The University of Texas at Austin, USA
Martin Burtscher , The University of Texas at Austin, USA
Calin Cascaval , IBM T.J. Watson Research Center, USA
Keshav Pingali , The University of Texas at Austin, USA
pp. 65-76

Exploring speculative parallelism in SPEC2006 (PDF)

Venkatesan Packirisamy , University of Minnesota, Minneapolis, USA
Antonia Zhai , University of Minnesota, Minneapolis, USA
Wei-Chung Hsu , University of Minnesota, Minneapolis, USA
Pen-Chung Yew , University of Minnesota, Minneapolis, USA
Tin-Fook Ngai , Intel Corporation, USA
pp. 77-88

Machine learning based online performance prediction for runtime parallelization and task scheduling (PDF)

Jiangtian Li , Dept. of Computer Science, North Carolina State University, Raleigh, 27606, USA
Xiaosong Ma , Dept. of Computer Science, North Carolina State University, Raleigh, 27606, USA
Karan Singh , Computer Systems Laboratory, Cornell University, Ithaca, NY 14850, USA
Martin Schulz , Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, CA 94550, USA
Bronis R. de Supinski , Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, CA 94550, USA
Sally A. McKee , Department of Computer Science and Engineering, Chalmers University of Technology, Göteborg, Sweden
pp. 89-100

WARP: Enabling fast CPU scheduler development and evaluation (PDF)

Haoqiang Zheng , Columbia University, USA
Jason Nieh , Columbia University, USA
pp. 101-112

CMPSched$im: Evaluating OS/CMP interaction on shared cache management (PDF)

Jaideep Moses , Hardware Architecture Laboratory, Intel Corporation, USA
Konstantinos Aisopos , Princeton University, USA
Aamer Jaleel , Hardware Architecture Laboratory, Intel Corporation, USA
Ravi Iyer , Hardware Architecture Laboratory, Intel Corporation, USA
Ramesh Illikkal , Hardware Architecture Laboratory, Intel Corporation, USA
Don Newell , Hardware Architecture Laboratory, Intel Corporation, USA
Srihari Makineni , Hardware Architecture Laboratory, Intel Corporation, USA
pp. 113-122

Understanding the cost of thread migration for multi-threaded Java applications running on a multicore platform (PDF)

Qiming Teng , IBM China Research, China
Peter F. Sweeney , IBM TJ Watson Research Center, USA
Evelyn Duesterwald , IBM TJ Watson Research Center, USA
pp. 123-132

The data-centricity of Web 2.0 workloads and its impact on server performance (PDF)

Moriyoshi Ohara , IBM Research, Japan
Priya Nagpurkar , IBM Research, Japan
Yohei Ueda , IBM Research, Japan
Kazuaki Ishizaki , IBM Research, Japan
pp. 133-142

Characterizing and optimizing the memory footprint of de novo short read DNA sequence assembly (PDF)

Jeffrey J. Cook , Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
Craig Zilles , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
pp. 143-152

An analytic model of optimistic Software Transactional Memory (PDF)

Armin Heindl , Department of Computer Science, University of Erlangen-Nuremberg, Germany
Gilles Pokam , Microprocessor Technology Lab, Intel Corporation, Santa Clara, CA, USA
Ali-Reza Adl-Tabatabai , Microprocessor Technology Lab, Intel Corporation, Santa Clara, CA, USA
pp. 153-162

Analyzing CUDA workloads using a detailed GPU simulator (PDF)

Ali Bakhoda , University of British Columbia, Vancouver, Canada
George L. Yuan , University of British Columbia, Vancouver, Canada
Wilson W. L. Fung , University of British Columbia, Vancouver, Canada
Henry Wong , University of British Columbia, Vancouver, Canada
Tor M. Aamodt , University of British Columbia, Vancouver, Canada
pp. 163-174

Evaluating GPUs for network packet signature matching (PDF)

Randy Smith , University of Wisconsin-Madison, USA
Neelam Goyal , University of Wisconsin-Madison, USA
Justin Ormont , University of Wisconsin-Madison, USA
Karthikeyan Sankaralingam , University of Wisconsin-Madison, USA
Cristian Estan , University of Wisconsin-Madison, USA
pp. 175-184

Online compression of cache-filtered address traces (PDF)

Pierre Michaud , INRIA Rennes - Bretagne Atlantique, Campus universitaire de Beaulieu, 35042 Cedex, France
pp. 185-194

Analysis of the TRIPS prototype block predictor (PDF)

Nitya Ranganathan , Department of Computer Sciences, The University of Texas at Austin, USA
Doug Burger , Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
Stephen W. Keckler , Department of Computer Sciences, The University of Texas at Austin, USA
pp. 195-206

Experiment flows and microbenchmarks for reverse engineering of branch predictor structures (PDF)

Vladimir Uzelac , Electrical and Computer Engineering Department, The University of Alabama in Huntsville, USA
Aleksandar Milenkovic , Electrical and Computer Engineering Department, The University of Alabama in Huntsville, USA
pp. 207-217

Analyzing the impact of on-chip network traffic on program phases for CMPs (PDF)

Yu Zhang , EECS Department, Northwestern University, Evanston, IL, USA
Berkin Ozisikyilmaz , EECS Department, Northwestern University, Evanston, IL, USA
Gokhan Memik , EECS Department, Northwestern University, Evanston, IL, USA
John Kim , EECS Department, Northwestern University, Evanston, IL, USA
Alok Choudhary , EECS Department, Northwestern University, Evanston, IL, USA
pp. 218-226

SuiteSpecks and SuiteSpots: A methodology for the automatic conversion of benchmarking programs into intrinsically checkpointed assembly code (PDF)

Jeff Ringenberg , The University of Michigan, Electrical Engineering and Computer Science, USA
Trevor Mudge , The University of Michigan, Electrical Engineering and Computer Science, USA
pp. 227-237

Accurately approximating superscalar processor performance from traces (PDF)

Kiyeon Lee , Dept. of Computer Science, University of Pittsburgh, USA
Shayne Evans , Dept. of Computer Science, University of Pittsburgh, USA
Sangyeun Cho , Dept. of Computer Science, University of Pittsburgh, USA
pp. 238-248

QUICK: A flexible full-system functional model (PDF)

Dam Sunwoo , The University of Texas at Austin, USA
Joonsoo Kim , The University of Texas at Austin, USA
Derek Chiou , The University of Texas at Austin, USA
pp. 249-258