2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (2015)
San Francisco, CA, USA
Feb. 7, 2015 to Feb. 11, 2015
ISBN: 978-1-4799-8161-8
TABLE OF CONTENTS

[Front cover] (PDF)

pp. c1

[Title page] (PDF)

pp. 1

Improving GPGPU energy-efficiency through concurrent kernel execution and DVFS (Abstract)

Qing Jiao , National University of Singapore
Mian Lu , Institute of High Performance Computing, A∗STAR
Huynh Phung Huynh , Institute of High Performance Computing, A∗STAR
Tulika Mitra , National University of Singapore
pp. 1-11

Characterizing and enhancing global memory data coalescing on GPUs (Abstract)

Naznin Fauzia , The Ohio State University
Louis-Noel Pouchet , The Ohio State University
P. Sadayappan , The Ohio State University
pp. 12-22

Automatic data placement into GPU on-chip memory resources (Abstract)

Chao Li , Department of Electrical and Computer Engineering, North Carolina State University
Yi Yang , Department of Computer Systems Architecture, NEC Labs
Zhen Lin , Department of Electrical and Computer Engineering, North Carolina State University
Huiyang Zhou , Department of Electrical and Computer Engineering, North Carolina State University
pp. 23-33

A parallel abstract interpreter for JavaScript (Abstract)

Kyle Dewey , University of California, Santa Barbara
Vineeth Kashyap , University of California, Santa Barbara
Ben Hardekopf , University of California, Santa Barbara
pp. 34-45

On performance debugging of unnecessary lock contentions on multicore processors: A replay-based approach (Abstract)

Long Zheng , Services Computing Technology and System Lab, Cluster and Grid Computing Lab, Huazhong University of Science and Technology, China
Xiaofei Liao , Services Computing Technology and System Lab, Cluster and Grid Computing Lab, Huazhong University of Science and Technology, China
Bingsheng He , School of Computer Engineering, Nanyang Technological University, Singapore
Song Wu , Services Computing Technology and System Lab, Cluster and Grid Computing Lab, Huazhong University of Science and Technology, China
Hai Jin , Services Computing Technology and System Lab, Cluster and Grid Computing Lab, Huazhong University of Science and Technology, China
pp. 56-67

Optimizing binary translation of dynamically generated code (Abstract)

Byron Hawkins , University of California, Irvine
Brian Demsky , University of California, Irvine
Derek Bruening , Google, Inc
Qin Zhao , Google, Inc
pp. 68-78

Getting in control of your control flow with control-data isolation (Abstract)

William Arthur , University of Michigan
Ben Mehne , University of California - Berkeley
Reetuparna Das , University of Michigan
Todd Austin , University of Michigan
pp. 79-90

Reactive tiling (Abstract)

Jithendra Srinivas , Intel Corporation
Wei Ding , The Pennsylvania State University-University Park, USA
Mahmut Kandemir , The Pennsylvania State University-University Park, USA
pp. 91-102

Optimizing the flash-RAM energy trade-off in deeply embedded systems (Abstract)

James Pallister , University of Bristol
Kerstin Eder , University of Bristol
Simon J. Hollis , University of Bristol
pp. 115-124

Optimizing and auto-tuning scale-free sparse matrix-vector multiplication on Intel Xeon Phi (Abstract)

Wai Teng Tang , Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore
Ruizhe Zhao , Center for Energy-Efficient Computing and Applications, School of EECS, Peking University, China
Mian Lu , Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore
Yun Liang , Center for Energy-Efficient Computing and Applications, School of EECS, Peking University, China
Huynh Phung Huynh , Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore
Xibai Li , Center for Energy-Efficient Computing and Applications, School of EECS, Peking University, China
Rick Siow Mong Goh , Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore
pp. 136-145

Data provenance tracking for concurrent programs (Abstract)

Brandon Lucia , Carnegie Mellon University, Department of Electrical and Computer Engineering
Luis Ceze , University of Washington, Department of Computer Science and Engineering
pp. 146-156

Locality aware concurrent start for stencil applications (Abstract)

Sunil Shrestha , University of Delaware
Guang R. Gao , University of Delaware
Joseph Manzano , Pacific Northwest National Laboratory
Andres Marquez , Pacific Northwest National Laboratory
John Feo , Pacific Northwest National Laboratory
pp. 157-166

Checking correctness of code generator architecture specifications (Abstract)

Niranjan Hasabnis , Stony Brook University, NY
Rui Qiao , Stony Brook University, NY
R. Sekar , Stony Brook University, NY
pp. 167-178

Snapshot-based loading-time acceleration for web applications (Abstract)

JinSeok Oh , School of Electrical Engineering and Computer Science, Seoul National University, Seoul 151-744, Korea
Soo-Mook Moon , School of Electrical Engineering and Computer Science, Seoul National University, Seoul 151-744, Korea
pp. 179-189

PSLP: Padded SLP automatic vectorization (Abstract)

Vasileios Porpodas , Computer Laboratory, University of Cambridge
Alberto Magni , School of Informatics, University of Edinburgh
Timothy M. Jones , Computer Laboratory, University of Cambridge
pp. 190-201

A graph-based higher-order intermediate representation (Abstract)

Roland Leißa , Department of Computer Science, Saarland University
Marcel Köster , Department of Computer Science, Saarland University
Sebastian Hack , Department of Computer Science, Saarland University
pp. 202-212

Scalable conditional induction variables (CIV) analysis (Abstract)

Cosmin E. Oancea , Department of Computer Science, University of Copenhagen
Lawrence Rauchwerger , Department of Computer Science and Engineering, Texas A&M University
pp. 213-224

Approximating flow-sensitive pointer analysis using frequent itemset mining (Abstract)

Vaivaswatha Nagaraj , Indian Institute of Science
R. Govindarajan , Indian Institute of Science
pp. 225-234

HELIX-UP: Relaxing program semantics to unleash parallelization (Abstract)

Simone Campanoni , Harvard University
Glenn Holloway , Harvard University
Gu-Yeon Wei , Harvard University
David Brooks , Harvard University
pp. 235-245

Hermes: A fast cross-ISA binary translator with post-optimization (Abstract)

Xiaochun Zhang , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Qi Guo , Carnegie Mellon University, Pittsburgh, PA, USA
Yunji Chen , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Tianshi Chen , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Weiwu Hu , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
pp. 246-256

Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures (Abstract)

Hee-Seok Kim , University of Illinois at Urbana-Champaign
Izzat El Hajj , University of Illinois at Urbana-Champaign
John Stratton , Colgate University
Steven Lumetta , University of Illinois at Urbana-Champaign
Wen-Mei Hwu , University of Illinois at Urbana-Champaign
pp. 257-268