2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (2016)
Uppsala, Sweden
April 17, 2016 to April 19, 2016
ISBN: 978-1-5090-1952-6
TABLE OF CONTENTS

Table of contents (PDF)

pp. iii-v

Message from the general chair (Abstract)

Erik Hagersten , Uppsala University, Sweden
pp. vi

Message from the program chair (Abstract)

Andreas Moshovos , University of Toronto, Canada
pp. vii

Committees (PDF)

pp. viii-ix

Keynote abstracts (PDF)

pp. x-xii

Performance analysis of accelerated biophysically-meaningful neuron simulations (Abstract)

Georgios Smaragdos , Neuroscience dept., Erasmus Medical Center, The Netherlands
Georgios Chatzikonstantis , MicroLab, National Technical University of Athens (NTUA), Greece
Sofia Nomikou , MicroLab, National Technical University of Athens (NTUA), Greece
Dimitrios Rodopoulos , MicroLab, National Technical University of Athens (NTUA), Greece
Ioannis Sourdis , Computer Science and Engineering dept., Chalmers University of Technology, Sweden
Dimitrios Soudris , MicroLab, National Technical University of Athens (NTUA), Greece
Chris I. De Zeeuw , Neuroscience dept., Erasmus Medical Center, The Netherlands
Christos Strydis , Neuroscience dept., Erasmus Medical Center, The Netherlands
pp. 1-11

DVFS performance prediction for managed multithreaded applications (Abstract)

Shoaib Akram , Ghent University, Belgium
Jennifer B. Sartor , Ghent University, Belgium
Lieven Eeckhout , Ghent University, Belgium
pp. 12-23

Addressing service interruptions in memory with thread-to-rank assignment (Abstract)

Manjunath Shevgoor , University of Utah, United States
Rajeev Balasubramonian , University of Utah, United States
Niladrish Chatterjee , NVIDIA, United States
Jung-Sik Kim , Samsung Electronics, South Korea
pp. 24-35

Characterization and bottleneck analysis of a 64-bit ARMv8 platform (Abstract)

Michael A. Laurenzano , University of Michigan, United States
Ananta Tiwari , EP Analytics, United States
Allyson Cauble-Chantrenne , EP Analytics, United States
Adam Jundt , EP Analytics, United States
William A. Ward , HPC Modernization Program, Department of Defense, United States
Roy Campbell , HPC Modernization Program, Department of Defense, United States
Laura Carrington , EP Analytics, United States
pp. 36-45

Analyzing the energy-efficiency of sparse matrix multiplication on heterogeneous systems: A comparative study of GPU, Xeon Phi and FPGA (Abstract)

Heiner Giefers , IBM Research - Zurich, Saumerstrasse 4, CH-8803 Ruschlikon, Switzerland
Peter Staar , IBM Research - Zurich, Saumerstrasse 4, CH-8803 Ruschlikon, Switzerland
Costas Bekas , IBM Research - Zurich, Saumerstrasse 4, CH-8803 Ruschlikon, Switzerland
Christoph Hagleitner , IBM Research - Zurich, Saumerstrasse 4, CH-8803 Ruschlikon, Switzerland
pp. 46-56

FastCap: An efficient and fair algorithm for power capping in many-core systems (Abstract)

Yanpei Liu , Facebook Inc., United States
Guilherme Cox , Rutgers University, United States
Qingyuan Deng , Facebook Inc., United States
Stark C. Draper , University of Toronto, Canada
Ricardo Bianchini , Microsoft Research, United States
pp. 57-68

Anatomy of microarchitecture-level reliability assessment: Throughput and accuracy (Abstract)

Athanasios Chatzidimitriou , Department of Informatics & Telecommunications, University of Athens, Greece
Dimitris Gizopoulos , Department of Informatics & Telecommunications, University of Athens, Greece
pp. 69-78

EmerGPU: Understanding and mitigating resonance-induced voltage noise in GPU architectures (Abstract)

Renji Thomas , Computer Science and Engineering, The Ohio State University, United States
Naser Sedaghati , Computer Science and Engineering, The Ohio State University, United States
Radu Teodorescu , Computer Science and Engineering, The Ohio State University, United States
pp. 79-89

GUFI: A framework for GPUs reliability assessment (Abstract)

Sotiris Tselonis , University of Athens, Department of Informatics & Telecommunications, Greece
Dimitris Gizopoulos , University of Athens, Department of Informatics & Telecommunications, Greece
pp. 90-100

Splash-3: A properly synchronized benchmark suite for contemporary research (Abstract)

Carl Leonardsson , Uppsala University, Sweden
Stefanos Kaxiras , Uppsala University, Sweden
Alberto Ros , Universidad de Murcia, Spain
pp. 101-111

Workload characterization and optimization of TPC-H queries on Apache Spark (Abstract)

Tatsuhiro Chiba , IBM Research - Tokyo, 19-21, Nihonbashi Hakozaki-cho, Chuo-ku, 103-8510, Japan
Tamiya Onodera , IBM Research - Tokyo, 19-21, Nihonbashi Hakozaki-cho, Chuo-ku, 103-8510, Japan
pp. 112-121

Demystifying cloud benchmarking (Abstract)

Tapti Palit , Department of Computer Science, Stony Brook University, United States
Yongming Shen , Department of Computer Science, Stony Brook University, United States
Michael Ferdman , Department of Computer Science, Stony Brook University, United States
pp. 122-132

Analysis of PARSEC workload scalability (Abstract)

Gabriel Southern , Dept. of Computer Engineering, University of California, Santa Cruz, United States
Jose Renau , Dept. of Computer Engineering, University of California, Santa Cruz, United States
pp. 133-142

MLC PCM main memory with accelerated read (Abstract)

Mohammad Arjomand , The Pennsylvania State University, 16802, USA
Amin Jadidi , The Pennsylvania State University, 16802, USA
Mahmut T. Kandemir , The Pennsylvania State University, 16802, USA
Anand Sivasubramaniam , The Pennsylvania State University, 16802, USA
Chita Das , The Pennsylvania State University, 16802, USA
pp. 143-144

Characterization and architectural implications of big data workloads (Abstract)

Lei Wang , Institute of Computing Technology, Chinese Academy of Sciences, China
Rui Ren , Institute of Computing Technology, Chinese Academy of Sciences, China
Jianfeng Zhan , Institute of Computing Technology, Chinese Academy of Sciences, China
Zhen Jia , Institute of Computing Technology, Chinese Academy of Sciences, China
pp. 145-146

Elastic traces for fast and accurate system performance exploration (Abstract)

Radhika Jagtap , ARM Research, Cambridge, U.K.
Stephan Diestelhorst , ARM Research, Cambridge, U.K.
Andreas Hansson , ARM Research, Cambridge, U.K.
pp. 147-148

CoolSim: Eliminating traditional cache warming with fast, virtualized profiling (Abstract)

Nikos Nikoleris , Department of Information Technology, Uppsala University, Sweden
Andreas Sandberg , Department of Information Technology, Uppsala University, Sweden
Erik Hagersten , Department of Information Technology, Uppsala University, Sweden
Trevor E. Carlson , Department of Information Technology, Uppsala University, Sweden
pp. 149-150

Compositional model of coherence and NUMA effects for optimizing thread and data placement (Abstract)

Hao Luo , Department of Computer Science, University of Rochester, United States
Jacob Brock , Department of Computer Science, University of Rochester, United States
Pengcheng Li , Department of Computer Science, University of Rochester, United States
Chen Ding , Department of Computer Science, University of Rochester, United States
Chencheng Ye , College of Computer Science and Technology, Huazhong University of Science and Technology, China
pp. 151-152

Characterizing Hadoop applications on microservers for performance and energy efficiency optimizations (Abstract)

Maria Malik , Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA
Avesta Sasan , Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA
Rajiv Joshi , IBM Research, Yorktown Heights, NY, United States
Setareh Rafatirah , Department of Information Sciences and Technology, George Mason University, Fairfax, VA, USA
Houman Homayoun , Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA
pp. 153-154

RTHpower: Accurate fine-grained power models for predicting race-to-halt effect on ultra-low power embedded systems (Abstract)

Vi Ngoc-Nha Tran , Department of Computer Science, UiT The Arctic University of Norway, Tromso, Norway
Brendan Barry , Movidius Ltd., Dublin, Ireland
Phuong Hoai Ha , Department of Computer Science, UiT The Arctic University of Norway, Tromso, Norway
pp. 155-156

Agave: A benchmark suite for exploring the complexities of the Android software stack (Abstract)

Martin K. Brown , Florida State University, United States
Zachary Yannes , Florida State University, United States
Michael Lustig , Florida State University, United States
Mazdak Sanati , Chalmers University of Technology, Sweden
Sally A. McKee , Chalmers University of Technology, Sweden
Gary S. Tyson , Florida State University, United States
Steven K. Reinhardt , AMD Research, United States
pp. 157-158

Storage consolidation: Not always a panacea, but can we ease the pain? (Abstract)

Narges Shahidi , The Pennsylvania State University, 16802, USA
Mohammad Arjomand , The Pennsylvania State University, 16802, USA
Anand Sivasubramaniam , The Pennsylvania State University, 16802, USA
Mahmut T. Kandemir , The Pennsylvania State University, 16802, USA
Chita Das , The Pennsylvania State University, 16802, USA
pp. 159-160

Observations and opportunities in architecting shared virtual memory for heterogeneous systems (Abstract)

Jan Vesely , AMD Research, Advanced Micro Devices, Inc., United States
Arkaprava Basu , AMD Research, Advanced Micro Devices, Inc., United States
Mark Oskin , AMD Research, Advanced Micro Devices, Inc., United States
Gabriel H. Loh , AMD Research, Advanced Micro Devices, Inc., United States
Abhishek Bhattacharjee , Department of Computer Science, Rutgers University, United States
pp. 161-171

GSI: A GPU Stall Inspector to characterize the sources of memory stalls for tightly coupled GPUs (Abstract)

Johnathan Alsop , University of Illinois at Urbana-Champaign, United States
Matthew D. Sinclair , University of Illinois at Urbana-Champaign, United States
Rakesh Komuravelli , Qualcomm Technologies, Inc., United States
Sarita V. Adve , University of Illinois at Urbana-Champaign, United States
pp. 172-182

A comprehensive performance analysis of HSA and OpenCL 2.0 (Abstract)

Saoni Mukherjee , Northeastern University, Boston, MA, United States
Yifan Sun , Northeastern University, Boston, MA, United States
Paul Blinzer , Advanced Micro Devices, Sunnyvale, CA, United States
Amir Kavyan Ziabari , Northeastern University, Boston, MA, United States
David Kaeli , Northeastern University, Boston, MA, United States
pp. 183-193

OpenSoC Fabric: On-chip network generator (Abstract)

Farzad Fatollahi-Fard , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, CA 94720, United States
David Donofrio , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, CA 94720, United States
George Michelogiannakis , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, CA 94720, United States
John Shalf , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, CA 94720, United States
pp. 194-203

AnyCore: A synthesizable RTL model for exploring and fabricating adaptive superscalar cores (Abstract)

Rangeen Basu Roy Chowdhury , Department of Electrical and Computer Engineering, North Carolina State University, United States
Anil K. Kannepalli , Department of Electrical and Computer Engineering, North Carolina State University, United States
Sungkwan Ku , Department of Electrical and Computer Engineering, North Carolina State University, United States
Eric Rotenberg , Department of Electrical and Computer Engineering, North Carolina State University, United States
pp. 214-224

Performance analysis of a hardware accelerator of dependence management for task-based dataflow programming models (Abstract)

Xubin Tan , Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Spain
Jaume Bosch , Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Spain
Daniel Jimenez-Gonzalez , Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Spain
Carlos Alvarez-Martinez , Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Spain
Eduard Ayguade , Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Spain
Mateo Valero , Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Spain
pp. 225-234

Evaluating asymmetric multiprocessing for mobile applications (Abstract)

Songchun Fan , Duke University, United States
Benjamin C. Lee , Duke University, United States
pp. 235-244

MofySim: A mobile full-system simulation framework for energy consumption and performance analysis (Abstract)

Minho Ju , Samsung Electronics Co., Ltd., Suwon, Korea
Hyeonggyu Kim , School of Computing, KAIST, Daejeon, Korea
Soontae Kim , School of Computing, KAIST, Daejeon, Korea
pp. 245-254

NoMali: Simulating a realistic graphics driver stack using a stub GPU (Abstract)

Rene de Jong , ARM Research, Cambridge, United Kingdom
Andreas Sandberg , ARM Research, Cambridge, United Kingdom
pp. 255-262

X-Mem: A cross-platform and extensible memory characterization tool for the cloud (Abstract)

Mark Gottscho , Electrical Engineering Department, University of California, Los Angeles, USA
Sriram Govindan , Microsoft, Redmond, WA, USA
Bikash Sharma , Microsoft, Redmond, WA, USA
Mohammed Shoaib , Microsoft Research, Redmond, WA, USA
Puneet Gupta , Electrical Engineering Department, University of California, Los Angeles, USA
pp. 263-273

Interactive visualization of cross-layer performance anomalies in dynamic task-parallel applications and systems (Abstract)

Andi Drebes , The University of Manchester, School of Computer Science, United Kingdom
Antoniu Pop , The University of Manchester, School of Computer Science, United Kingdom
Karine Heydemann , Sorbonne Universités, UPMC Paris 06, CNRS, UMR 7606, LIP6, France
Albert Cohen , INRIA and DI, École Normale Supérieure, Paris, France
pp. 274-283

JIT-assisted fast-forward embedding and instrumentation to enable fast, accurate, and agile simulation (Abstract)

Berkin Ilbeyi , School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, United States
Christopher Batten , School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, United States
pp. 284-295

TaskPoint: Sampled simulation of task-based programs (Abstract)

Thomas Grass , Universitat Politècnica de Catalunya, Spain
Alejandro Rico , ARM Inc., United Kingdom
Marc Casas , Barcelona Supercomputing Center, Spain
Miquel Moreto , Universitat Politècnica de Catalunya, Spain
Eduard Ayguade , Universitat Politècnica de Catalunya, Spain
pp. 296-306

An automated framework for characterizing and subsetting GPGPU workloads (Abstract)

Vignesh Adhinarayanan , Department of Computer Science, Virginia Tech, Blacksburg, 24061, U.S.A.
Wu-chun Feng , Department of Computer Science, Virginia Tech, Blacksburg, 24061, U.S.A.
pp. 307-317