2012 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (2012)
New Brunswick, NJ, USA
Apr. 1, 2012 to Apr. 3, 2012
ISBN: 978-1-4673-1143-4
TABLE OF CONTENTS
Papers

Sponsors (PDF)

pp. iii

Message from the general chair (PDF)

Rajeev Balasubramonian , University of Utah, USA
pp. vii

Message from the program chair (PDF)

Vijayalakshmi Srinivasan , IBM T.J. Watson Research Center, USA
pp. viii

List of reviewers (PDF)

pp. x-xi

Stargazer: Automated regression-based GPU design space exploration (Abstract)

Wenhao Jia , Princeton University, USA
Kelly A. Shaw , University of Richmond, USA
Margaret Martonosi , Princeton University, USA
pp. 2-13

A mechanistic performance model for superscalar in-order processors (Abstract)

Lieven Eeckhout , ELIS Department, Ghent University, Belgium
Stijn Eyerman , ELIS Department, Ghent University, Belgium
Maximilien Breughe , ELIS Department, Ghent University, Belgium
pp. 14-24

An LTE Uplink Receiver PHY benchmark and subframe-based power management (Abstract)

Peter Brauer , Ericsson AB, Gothenburg, Sweden
Sally A. McKee , Chalmers University of Technology, Gothenburg, Sweden
Andras Vajda , Ericsson AB, Gothenburg, Sweden
Magnus Sjalander , Chalmers University of Technology, Gothenburg, Sweden
David Engdal , Ericsson AB, Gothenburg, Sweden
pp. 25-34

BigHouse: A simulation infrastructure for data center systems (Abstract)

Thomas F. Wenisch , Advanced Computer Architecture Lab, The University of Michigan, USA
David Meisner , Advanced Computer Architecture Lab, The University of Michigan, USA
Junjie Wu , Advanced Computer Architecture Lab, The University of Michigan, USA
pp. 35-45

A lightweight hybrid hardware/software approach for object-relative memory profiling (Abstract)

Yongbing Huang , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China
Mingyu Chen , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China
Yungang Bao , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China
Licheng Chen , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China
Zehan Cui , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China
Guangming Tan , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China
pp. 46-57

Lynx: A dynamic instrumentation system for data-parallel applications on GPGPU architectures (Abstract)

Sudhakar Yalamanchili , School of Electrical and Computer Engineering, Georgia Institute of Technology, USA
Naila Farooqui , College of Computing, Georgia Institute of Technology, USA
Greg Eisenhauer , College of Computing, Georgia Institute of Technology, USA
Andrew Kerr , School of Electrical and Computer Engineering, Georgia Institute of Technology, USA
Karsten Schwan , College of Computing, Georgia Institute of Technology, USA
pp. 58-67

An FPGA-based multi-core platform for testing and analysis of architectural techniques (Abstract)

Resit Sendag , Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, 02881, USA
Will Simoneau , Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, 02881, USA
pp. 68-77

Comparing the power and performance of Intel's SCC to state-of-the-art CPUs and GPUs (Abstract)

Babak Behzad , Department of Computer Science, University of Illinois at Urbana-Champaign, 61801, USA
Swapnil Ghike , Department of Computer Science, University of Illinois at Urbana-Champaign, 61801, USA
Ehsan Totoni , Department of Computer Science, University of Illinois at Urbana-Champaign, 61801, USA
Josep Torrellas , Department of Computer Science, University of Illinois at Urbana-Champaign, 61801, USA
pp. 78-87

Characterizing and evaluating a key-value store application on heterogeneous CPU-GPU systems (Abstract)

Tor M. Aamodt , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada
Tayler H. Hetherington , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada
Timothy G. Rogers , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada
Lisa Hsu , Advanced Micro Devices, Inc. (AMD), USA
Mike O'Connor , Advanced Micro Devices, Inc. (AMD), USA
pp. 88-98

Selective commitment and selective margin: Techniques to minimize cost in an IaaS cloud (Abstract)

Jiachen Xue , School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
Yu-Ju Hong , School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
Mithuna Thottethodi , School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
pp. 99-109

Exploiting temporal locality in network traffic using commodity multi-cores (Abstract)

Jordi Tubella , Department of Computer Architecture, Universitat Politècnica de Catalunya, Barcelona, Spain
Antonio Gonzalez , Department of Computer Architecture, Universitat Politècnica de Catalunya, Barcelona, Spain
Govind Sreekar Shenoy , Department of Computer Architecture, Universitat Politècnica de Catalunya, Barcelona, Spain
pp. 110-111

Power and performance analysis of network traffic prediction techniques (Abstract)

Lizy K. John , University of Texas at Austin, USA
Muhammad Faisal Iqbal , University of Texas at Austin, USA
pp. 112-113

A cycle-level SIMT-GPU simulation framework (Abstract)

Chien-Wei Lo , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Yu-Jung Cheng , Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan
Po-Han Wang , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Chia-Lin Yang , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
pp. 114-115

Bandwidth bandit: Understanding memory contention (Abstract)

David Black-Schaffer , Uppsala University, Department of Information Technology, Sweden
Erik Hagersten , Uppsala University, Department of Information Technology, Sweden
Nikos Nikoleris , Uppsala University, Department of Information Technology, Sweden
David Eklov , Uppsala University, Department of Information Technology, Sweden
pp. 116-117

Performance modeling and characterization of large last level caches (Abstract)

Alan Bivens , IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598, USA
Li Zhang , IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598, USA
Parijat Dube , IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598, USA
Michael Tsao , IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598, USA
pp. 118-119

SLA-guided energy savings for enterprise servers (Abstract)

Martin Dimitrov , Intel Corporation, USA
Kshitij A. Doshi , Intel Corporation, USA
Vlasia Anagnostopoulou , Department of Computer Science, University of California, Santa Barbara, USA
pp. 120-121

Understanding the communication characteristics in HBase: What are the fundamental bottlenecks? (Abstract)

Nusrat S. Islam , Department of Computer Science and Engineering, The Ohio State University, USA
Hao Wang , Department of Computer Science and Engineering, The Ohio State University, USA
Hari Subramoni , Department of Computer Science and Engineering, The Ohio State University, USA
Md. Wasi-ur-Rahman , Department of Computer Science and Engineering, The Ohio State University, USA
Chet Murthy , IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Xiangyong Ouyang , Department of Computer Science and Engineering, The Ohio State University, USA
Dhabaleswar K. Panda , Department of Computer Science and Engineering, The Ohio State University, USA
Jithin Jose , Department of Computer Science and Engineering, The Ohio State University, USA
Jian Huang , Department of Computer Science and Engineering, The Ohio State University, USA
pp. 122-123

Data sharing in multi-threaded applications and its impact on chip design (Abstract)

Ahmad Samih , Dept. of Electrical and Computer Engineering, North Carolina State University, USA
Anil Krishna , Systems and Technology Group, International Business Machines, Inc., USA
Yan Solihin , Dept. of Electrical and Computer Engineering, North Carolina State University, USA
pp. 125-134

Using utility prediction models to dynamically choose program thread counts (Abstract)

Bruce R. Childers , Computer Science Department, University of Pittsburgh, USA
Ryan W. Moore , Computer Science Department, University of Pittsburgh, USA
pp. 135-144

Speedup stacks: Identifying scaling bottlenecks in multi-threaded applications (Abstract)

Kristof Du Bois , ELIS Department, Ghent University, Belgium
Lieven Eeckhout , ELIS Department, Ghent University, Belgium
Stijn Eyerman , ELIS Department, Ghent University, Belgium
pp. 145-155

Performance analysis of thread mappings with a holistic view of the hardware resources (Abstract)

Wei Wang , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Lingjia Tang , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Mary Lou Soffa , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Tanima Dey , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Jack W. Davidson , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
Jason Mars , Department of Computer Science, University of Virginia, Charlottesville, 22904, USA
pp. 156-167

A single-pass cache simulation methodology for two-level unified caches (Abstract)

Wei Zang , Department of Electrical and Computer Engineering, University of Florida, Gainesville, 32611, USA
Ann Gordon-Ross , Department of Electrical and Computer Engineering, University of Florida, Gainesville, 32611, USA
pp. 168-177

Fast and cycle-accurate modeling of a multicore processor (Abstract)

Muralidaran Vijayaraghavan , Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
Silas Boyd-Wickizer , Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
Arvind , Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
Asif Khan , Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
pp. 178-187

FPGA modeling of diverse superscalar processors (Abstract)

Eric Rotenberg , Department of Electrical and Computer Engineering, North Carolina State University, USA
Brandon H. Dwiel , Department of Electrical and Computer Engineering, North Carolina State University, USA
Niket K. Choudhary , Department of Electrical and Computer Engineering, North Carolina State University, USA
pp. 188-199

Evaluating FPGA-acceleration for real-time unstructured search (Abstract)

Martin Margala , University of Massachusetts Lowell, Lowell, USA
Wim Vanderbauwhede , Hewlett Packard, Houston, TX USA
SaiRahul Chalamalasetti , University of Massachusetts Lowell, Lowell, USA
Mitch Wright , Hewlett Packard, Houston, TX USA
Parthasarathy Ranganathan , Hewlett Packard Labs, Palo Alto, CA, USA
pp. 200-209

Combined profiling: A methodology to capture varied program behavior across multiple inputs (Abstract)

Jose Nelson Amaral , Dept. of Computing Science, University of Alberta, Edmonton, T6G 2E8, Canada
Paul Berube , Dept. of Computing Science, University of Alberta, Edmonton, T6G 2E8, Canada
pp. 210-220

Architectural characterization and similarity analysis of SunSpider and Google's V8 JavaScript benchmarks (Abstract)

Devesh Tiwari , Department of Electrical and Computer Engineering, North Carolina State University, USA
Yan Solihin , Department of Electrical and Computer Engineering, North Carolina State University, USA
pp. 221-232

Author index (PDF)

pp. 1-2