2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (2013)
Davis, CA, USA
Dec. 7, 2013 to Dec. 11, 2013
ISBN: 978-1-5090-6603-2
TABLE OF CONTENTS

Front matter (Abstract)

pp. i-xv

Quality programmable vector processors for approximate computing (Abstract)

Swagath Venkataramani , School of Electrical and Computer Engineering, Purdue University
Vinay K. Chippa , School of Electrical and Computer Engineering, Purdue University
Srimat T. Chakradhar , Systems Architecture Department, NEC Laboratories, America
Kaushik Roy , School of Electrical and Computer Engineering, Purdue University
Anand Raghunathan , School of Electrical and Computer Engineering, Purdue University
pp. 1-12

SAGE: Self-tuning approximation for graphics engines (Abstract)

Mehrzad Samadi , Advanced Computer Architecture Laboratory, University of Michigan - Ann Arbor, MI
Janghaeng Lee , Advanced Computer Architecture Laboratory, University of Michigan - Ann Arbor, MI
D. Anoushe Jamshidi , Advanced Computer Architecture Laboratory, University of Michigan - Ann Arbor, MI
Amir Hormati , Google Inc, Seattle, WA
Scott Mahlke , Advanced Computer Architecture Laboratory, University of Michigan - Ann Arbor, MI
pp. 13-24

Approximate storage in solid-state memories (Abstract)

Adrian Sampson , University of Washington
Jacob Nelson , University of Washington
Karin Strauss , Microsoft Research
Luis Ceze , University of Washington
pp. 25-36

MLP-aware dynamic instruction window resizing for adaptively exploiting both ILP and MLP (Abstract)

Yuya Kora , Department of Computational Science, Nagoya University, Nagoya, Aichi, Japan
Kyohei Yamaguchi , Department of Electrical Engineering and Computer Science, Nagoya University, Nagoya, Aichi, Japan
Hideki Ando , Department of Electrical Engineering and Computer Science, Nagoya University, Nagoya, Aichi, Japan
pp. 37-48

TLC: A tag-less cache for reducing dynamic first level cache energy (Abstract)

Andreas Sembrant , Uppsala University, Department of Information Technology, P.O. Box 337, SE-751 05, Uppsala, Sweden
Erik Hagersten , Uppsala University, Department of Information Technology, P.O. Box 337, SE-751 05, Uppsala, Sweden
David Black-Schaffer , Uppsala University, Department of Information Technology, P.O. Box 337, SE-751 05, Uppsala, Sweden
pp. 49-61

Decoupled compressed cache: Exploiting spatial locality for energy-optimized compressed caching (Abstract)

Somayeh Sardashti , Computer Sciences Department, University of Wisconsin-Madison
David A. Wood , Computer Sciences Department, University of Wisconsin-Madison
pp. 62-73

Exploiting GPU peak-power and performance tradeoffs through reduced effective pipeline latency (Abstract)

Syed Zohaib Gilani , Department of Electrical and Computer Engineering, the University of Wisconsin-Madison
Nam Sung Kim , Department of Electrical and Computer Engineering, the University of Wisconsin-Madison
Michael J. Schulte , AMD Research, Advanced Micro Devices, Inc.
pp. 74-85

A locality-aware memory hierarchy for energy-efficient GPU architectures (Abstract)

Minsoo Rhu , Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, Texas 78712-1684
Michael Sullivan , Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, Texas 78712-1684
Jingwen Leng , Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, Texas 78712-1684
Mattan Erez , Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, Texas 78712-1684
pp. 86-98

Divergence-Aware Warp Scheduling (Abstract)

Timothy G. Rogers , Department of Computer and Electrical Engineering, University of British Columbia
Mike O'Connor , NVIDIA Research
Tor M. Aamodt , Department of Computer and Electrical Engineering, University of British Columbia
pp. 99-110

Warped gates: Gating aware scheduling and power gating for GPGPUs (Abstract)

Mohammad Abdel-Majeed , Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089
Daniel Wong , Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089
Murali Annavaram , Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089
pp. 111-122

Use it or lose it: Wear-out and lifetime in future chip multiprocessors (Abstract)

Hyungjun Kim , Department of Electrical and Computer Engineering, Texas A&M University
Arseniy Vitkovskiy , Department of Electrical Engineering, Computer Engineering and Informatics, Cyprus University of Technology
Paul V. Gratz , Department of Electrical and Computer Engineering, Texas A&M University
Vassos Soteriou , Department of Electrical Engineering, Computer Engineering and Informatics, Cyprus University of Technology
pp. 136-147

uDIREC: Unified diagnosis and reconfiguration for frugal bypass of NoC faults (Abstract)

Ritesh Parikh , University of Michigan, Ann Arbor, MI - 48109
Valeria Bertacco , University of Michigan, Ann Arbor, MI - 48109
pp. 148-159

Implicit-storing and redundant-encoding-of-attribute information in error-correction-codes (Abstract)

Yiannakis Sazeides , University of Cyprus
Emre Ozer , ARM
Panagiota Nikolaou , University of Cyprus
Marios Kleanthous , University of Cyprus
Jaume Abella , Barcelona Supercomputing Center
pp. 160-171

Linearly compressed pages: A low-complexity, low-latency main memory compression framework (Abstract)

Gennady Pekhimenko , Carnegie Mellon University
Vivek Seshadri , Carnegie Mellon University
Yoongu Kim , Carnegie Mellon University
Hongyi Xin , Carnegie Mellon University
Onur Mutlu , Carnegie Mellon University
Phillip B. Gibbons , Intel Labs Pittsburgh
Michael A. Kozuch , Intel Labs Pittsburgh
Todd C. Mowry , Carnegie Mellon University
pp. 172-184

RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization (Abstract)

Vivek Seshadri , Carnegie Mellon University
Yoongu Kim , Carnegie Mellon University
Chris Fallin , Intel Corporation
Donghyuk Lee , Carnegie Mellon University
Rachata Ausavarungnirun , Carnegie Mellon University
Gennady Pekhimenko , Carnegie Mellon University
Yixin Luo , Carnegie Mellon University
Onur Mutlu , Carnegie Mellon University
Phillip B. Gibbons , Intel Pittsburgh
Michael A. Kozuch , Intel Pittsburgh
Todd C. Mowry , Carnegie Mellon University
pp. 185-197

Quantifying the relationship between the power delivery network and architectural policies in a 3D-stacked memory device (Abstract)

Manjunath Shevgoor , University of Utah
Jung-Sik Kim , Memory Division, Samsung Electronics
Niladrish Chatterjee , University of Utah
Rajeev Balasubramonian , University of Utah
Al Davis , University of Utah
pp. 198-209

Crank it up or dial it down: Coordinated multiprocessor frequency and folding control (Abstract)

Augusto Vega , IBM Corporation, Research Division
Alper Buyuktosunoglu , IBM Corporation, Research Division
Heather Hanson , IBM Corporation, Systems & Technology Group
Pradip Bose , IBM Corporation, Research Division
Srinivasan Ramani , IBM Corporation, Systems & Technology Group
pp. 210-221

Wavelength stealing: An opportunistic approach to channel sharing in multi-chip photonic interconnects (Abstract)

Arslan Zulfiqar , University of Wisconsin-Madison
Pranay Koka , Oracle Labs
Herb Schwetman , Oracle Labs
Mikko Lipasti , University of Wisconsin-Madison
Xuezhe Zheng , Oracle Labs
Ashok Krishnamoorthy , Oracle Labs
pp. 222-233

DESC: Energy-efficient data exchange using synchronized counters (Abstract)

Mahdi Nazm Bojnordi , University of Rochester, Rochester, NY 14627 USA
Engin Ipek , University of Rochester, Rochester, NY 14627 USA
pp. 234-246

Linearizing irregular memory accesses for improved correlated prefetching (Abstract)

Akanksha Jain , Department of Computer Science, The University of Texas at Austin, Austin, Texas 78712, USA
Calvin Lin , Department of Computer Science, The University of Texas at Austin, Austin, Texas 78712, USA
pp. 247-259

RDIP: Return-address-stack Directed Instruction Prefetching (Abstract)

Aasheesh Kolli , University of Michigan
Ali Saidi , ARM
Thomas F. Wenisch , University of Michigan
pp. 260-271

SHIFT: Shared history instruction fetch for lean-core server processors (Abstract)

Cansu Kaynak , EcoCloud, EPFL
Boris Grot , University of Edinburgh
Babak Falsafi , EcoCloud, EPFL
pp. 272-283

Insertion and promotion for tree-based PseudoLRU last-level caches (Abstract)

Daniel A. Jimenez , Department of Computer Science and Engineering, Texas A&M University
pp. 284-296

Imbalanced cache partitioning for balanced data-parallel programs (Abstract)

Abhisek Pan , School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
Vijay S. Pai , School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
pp. 297-309

The reuse cache: Downsizing the shared last-level cache (Abstract)

Jorge Albericio , University of Toronto
Pablo Ibanez , University of Zaragoza
Victor Vinals , University of Zaragoza
Jose M. Llaberia , UPC Barcelona Tech
pp. 310-321

Enabling datacenter servers to scale out economically and sustainably (Abstract)

Chao Li , Intelligent Design of Efficient Architectures Laboratory (IDEAL), Department of Electrical and Computer Engineering, University of Florida, USA
Yang Hu , Intelligent Design of Efficient Architectures Laboratory (IDEAL), Department of Electrical and Computer Engineering, University of Florida, USA
Ruijin Zhou , Intelligent Design of Efficient Architectures Laboratory (IDEAL), Department of Electrical and Computer Engineering, University of Florida, USA
Ming Liu , Intelligent Design of Efficient Architectures Laboratory (IDEAL), Department of Electrical and Computer Engineering, University of Florida, USA
Longjun Liu , Xi'an Jiaotong University, China
Jingling Yuan , Wuhan University of Technology, China
Tao Li , Intelligent Design of Efficient Architectures Laboratory (IDEAL), Department of Electrical and Computer Engineering, University of Florida, USA
pp. 322-333

Efficient multiprogramming for multicores with SCAF (Abstract)

Timothy Creech , Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD 20742
Aparna Kotha , Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD 20742
Rajeev Barua , Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD 20742
pp. 334-345

Allocating rotating registers by scheduling (Abstract)

Hongbo Rong , Programming Systems Lab, Intel Labs
Hyunchul Park , Programming Systems Lab, Intel Labs
Cheng Wang , Programming Systems Lab, Intel Labs
Youfeng Wu , Programming Systems Lab, Intel Labs
pp. 346-358

Multi-grain coherence directories (Abstract)

Jason Zebchuk , Department of Electrical and Computer Engineering, University of Toronto
Babak Falsafi , EcoCloud, EPFL
Andreas Moshovos , Department of Electrical and Computer Engineering, University of Toronto
pp. 359-370

BulkCommit: Scalable and fast commit of atomic blocks in a lazy multiprocessor environment (Abstract)

Xuehai Qian , University of Illinois, USA
Josep Torrellas , University of Illinois, USA
Benjamin Sahelices , Universidad de Valladolid, Spain
Depei Qian , Beihang University, China
pp. 371-382

Efficient management of last-level caches in graphics processors for 3D scene rendering workloads (Abstract)

Jayesh Gaur , Intel Architecture Group, Bangalore 560103, India
Raghuram Srinivasan , The Ohio State University, Columbus, OH 43210, USA
Sreenivas Subramoney , Intel Architecture Group, Bangalore 560103, India
Mainak Chaudhuri , Indian Institute of Technology, Kanpur 208016, India
pp. 395-407

Energy efficient GPU transactional memory via space-time optimizations (Abstract)

Wilson W. L. Fung , Department of Computer and Electrical Engineering, University of British Columbia
Tor M. Aamodt , Department of Computer and Electrical Engineering, University of British Columbia
pp. 408-420

Kiln: Closing the performance gap between systems with and without persistence support (Abstract)

Jishen Zhao , Pennsylvania State University
Sheng Li , Hewlett-Packard Labs
Doe Hyun Yoon , IBM Research
Yuan Xie , Pennsylvania State University
pp. 421-432

Aegis: Partitioning data block for efficient recovery of stuck-at-faults in phase change memory (Abstract)

Jie Fan , Department of Computer Science and Technology, Tsinghua University Beijing, China
Song Jiang , Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI, USA
Jiwu Shu , Department of Computer Science and Technology, Tsinghua University Beijing, China
Youhui Zhang , Department of Computer Science and Technology, Tsinghua University Beijing, China
Weimin Zheng , Department of Computer Science and Technology, Tsinghua University Beijing, China
pp. 433-444

Trace based phase prediction for tightly-coupled heterogeneous cores (Abstract)

Shruti Padmanabha , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, MI
Andrew Lukefahr , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, MI
Reetuparna Das , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, MI
Scott Mahlke , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, MI
pp. 445-456

Heterogeneous system coherence for integrated CPU-GPU systems (Abstract)

Jason Power , Department of Computer Sciences, University of Wisconsin - Madison
Arkaprava Basu , Department of Computer Sciences, University of Wisconsin - Madison
Junli Gu , Advanced Micro Devices, Inc.
Sooraj Puthoor , Advanced Micro Devices, Inc.
Bradford M. Beckmann , Advanced Micro Devices, Inc.
Mark D. Hill , Department of Computer Sciences, University of Wisconsin - Madison
Steven K. Reinhardt , Advanced Micro Devices, Inc.
David A. Wood , Department of Computer Sciences, University of Wisconsin - Madison
pp. 457-467

Meet the walkers: Accelerating index traversals for in-memory databases (Abstract)

Onur Kocberber , EcoCloud, EPFL
Boris Grot , University of Edinburgh
Javier Picorel , EcoCloud, EPFL
Babak Falsafi , EcoCloud, EPFL
Kevin Lim , HP Labs
pp. 468-479

Author index (Abstract)

pp. 480-483