2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA) (2015)
Portland, OR, USA
June 13, 2015 to June 17, 2015
ISBN: 978-1-5090-0255-9
TABLE OF CONTENTS

Front matter (PDF)

pp. 1

BlueDBM: An appliance for Big Data analytics (Abstract)

Sang-Woo Jun , Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA
Ming Liu , Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA
Sungjin Lee , Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA
Jamey Hicks , Quanta Research Cambridge, USA
John Ankcorn , Quanta Research Cambridge, USA
Myron King , Quanta Research Cambridge, USA
Shuotao Xu , Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA
Arvind , Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA
pp. 1-13

Towards sustainable in-situ server systems in the big data era (Abstract)

Chao Li , Shanghai Jiao Tong University, China
Yang Hu , University of Florida, USA
Longjun Liu , Xi'an Jiaotong University, China
Juncheng Gu , University of Florida, USA
Mingcong Song , University of Florida, USA
Xiaoyao Liang , Shanghai Jiao Tong University, China
Jingling Yuan , Wuhan University of Technology, China
Tao Li , University of Florida, USA
pp. 14-26

DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers (Abstract)

Johann Hauswald , Clarity Lab, University of Michigan - Ann Arbor, USA
Yiping Kang , Clarity Lab, University of Michigan - Ann Arbor, USA
Michael A. Laurenzano , Clarity Lab, University of Michigan - Ann Arbor, USA
Quan Chen , Clarity Lab, University of Michigan - Ann Arbor, USA
Cheng Li , Clarity Lab, University of Michigan - Ann Arbor, USA
Trevor Mudge , Clarity Lab, University of Michigan - Ann Arbor, USA
Ronald G. Dreslinski , Clarity Lab, University of Michigan - Ann Arbor, USA
Jason Mars , Clarity Lab, University of Michigan - Ann Arbor, USA
Lingjia Tang , Clarity Lab, University of Michigan - Ann Arbor, USA
pp. 27-40

A case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling flexible data compression with assist warps (Abstract)

Nandita Vijaykumar , Carnegie Mellon University, USA
Gennady Pekhimenko , Carnegie Mellon University, USA
Adwait Jog , Pennsylvania State University, USA
Abhishek Bhowmick , Carnegie Mellon University, USA
Rachata Ausavarungnirun , Carnegie Mellon University, USA
Chita Das , Pennsylvania State University, USA
Mahmut Kandemir , Pennsylvania State University, USA
Todd C. Mowry , Carnegie Mellon University, USA
Onur Mutlu , Carnegie Mellon University, USA
pp. 41-53

Harmonia: Balancing compute and memory power in high-performance GPUs (Abstract)

Indrani Paul , AMD Research, USA
Wei Huang , AMD Research, USA
Manish Arora , AMD Research, USA
Sudhakar Yalamanchili , Georgia Institute of Technology, USA
pp. 54-65

Redundant Memory Mappings for fast access to large memories (Abstract)

Vasileios Karakostas , Barcelona Supercomputing Center, Spain
Jayneel Gandhi , University of Wisconsin - Madison, USA
Furkan Ayar , Dumlupinar University, Turkey
Adrian Cristal , Barcelona Supercomputing Center, Spain
Mark D. Hill , University of Wisconsin - Madison, USA
Kathryn S. McKinley , Microsoft Research, USA
Mario Nemirovsky , ICREA Senior Research Professor at Barcelona Supercomputing Center, Spain
Michael M. Swift , University of Wisconsin - Madison, USA
Osman Unsal , Barcelona Supercomputing Center, Spain
pp. 66-78

Page overlays: An enhanced virtual memory framework to enable fine-grained memory management (Abstract)

Vivek Seshadri , Carnegie Mellon University, USA
Gennady Pekhimenko , Carnegie Mellon University, USA
Olatunji Ruwase , Microsoft Research, USA
Onur Mutlu , Carnegie Mellon University, USA
Phillip B. Gibbons , Intel Labs Pittsburgh, USA
Michael A. Kozuch , Intel Labs Pittsburgh, USA
Todd C. Mowry , Carnegie Mellon University, USA
Trishul Chilimbi , Microsoft Research, USA
pp. 79-91

ShiDianNao: Shifting vision processing closer to the sensor (Abstract)

Zidong Du , State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), CAS, China
Robert Fasthuber , EPFL, Switzerland
Tianshi Chen , State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), CAS, China
Paolo Ienne , EPFL, Switzerland
Ling Li , State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), CAS, China
Tao Luo , State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), CAS, China
Xiaobing Feng , State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), CAS, China
Yunji Chen , State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), CAS, China
Olivier Temam , Inria, France
pp. 92-104

A scalable processing-in-memory accelerator for parallel graph processing (Abstract)

Junwhan Ahn , Seoul National University, Korea
Sungpack Hong , Oracle Labs, USA
Sungjoo Yoo , Seoul National University, Korea
Onur Mutlu , Carnegie Mellon University, USA
Kiyoung Choi , Seoul National University, Korea
pp. 105-117

Efficient execution of memory access phases using dataflow specialization (Abstract)

Chen-Han Ho , University of Wisconsin-Madison, USA
Sung Jin Kim , University of Wisconsin-Madison, USA
Karthikeyan Sankaralingam , University of Wisconsin-Madison, USA
pp. 118-130

Data reorganization in memory using 3D-stacked DRAM (Abstract)

Berkin Akin , Carnegie Mellon University, USA
Franz Franchetti , Carnegie Mellon University, USA
James C. Hoe , Carnegie Mellon University, USA
pp. 131-143

Quantitative comparison of Hardware Transactional Memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8 (Abstract)

Takuya Nakaike , IBM Research - Tokyo, Japan
Rei Odaira , IBM Research - Austin, USA
Matthew Gaudet , IBM Canada
Maged M. Michael , IBM Watson Research Center, USA
Hisanobu Tomari , University of Tokyo, Japan
pp. 144-157

Profiling a warehouse-scale computer (Abstract)

Svilen Kanev , Harvard University, USA
Juan Pablo Darago , Universidad de Buenos Aires, Argentina
Kim Hazelwood , Yahoo Labs, USA
Tipp Moseley , Google, USA
Gu-Yeon Wei , Harvard University, USA
David Brooks , Harvard University, USA
pp. 158-169

Computer performance microscopy with Shim (Abstract)

Xi Yang , Australian National University, Australia
Stephen M. Blackburn , Australian National University, Australia
Kathryn S. McKinley , Microsoft Research, USA
pp. 170-184

Flexible software profiling of GPU architectures (Abstract)

Mark Stephenson , NVIDIA, USA
Yunsup Lee , University of California, Berkeley, USA
Eiman Ebrahimi , NVIDIA, USA
Daniel R. Johnson , NVIDIA, USA
David Nellans , NVIDIA, USA
Mike O'Connor , NVIDIA, USA
Stephen W. Keckler , NVIDIA, USA
pp. 185-197

BEAR: Techniques for mitigating bandwidth bloat in gigascale DRAM caches (Abstract)

Chiachen Chou , School of Electrical and Computer Engineering, Georgia Institute of Technology, USA
Aamer Jaleel , NVIDIA, USA
Moinuddin K. Qureshi , School of Electrical and Computer Engineering, Georgia Institute of Technology, USA
pp. 198-210

A fully associative, tagless DRAM cache (Abstract)

Yongjun Lee , Sungkyunkwan University, Suwon, Korea
Jongwon Kim , Sungkyunkwan University, Suwon, Korea
Hakbeom Jang , Sungkyunkwan University, Suwon, Korea
Hyunggyun Yang , POSTECH, Pohang, Korea
Jangwoo Kim , POSTECH, Pohang, Korea
Jinkyu Jeong , Sungkyunkwan University, Suwon, Korea
Jae W. Lee , Sungkyunkwan University, Suwon, Korea
pp. 211-222

Multiple Clone Row DRAM: A low latency and area optimized DRAM (Abstract)

Jungwhan Choi , Korea Advanced Institute of Science and Technology, Korea
Wongyu Shin , Korea Advanced Institute of Science and Technology, Korea
Jaemin Jang , Korea Advanced Institute of Science and Technology, Korea
Jinwoong Suh , Korea Advanced Institute of Science and Technology, Korea
Yongkee Kwon , SK hynix, Korea
Youngsuk Moon , SK hynix, Korea
Lee-Sup Kim , Korea Advanced Institute of Science and Technology, Korea
pp. 223-234

Flexible auto-refresh: Enabling scalable and energy-efficient DRAM refresh reductions (Abstract)

Ishwar Bhati , Oracle Corporation, USA
Zeshan Chishti , Intel Corporation, USA
Shih-Lien Lu , Intel Corporation, USA
Bruce Jacob , University of Maryland, USA
pp. 235-246

Cost-effective speculative scheduling in high performance processors (Abstract)

Arthur Perais , IRISA/INRIA, France
Andre Seznec , IRISA/INRIA, France
Pierre Michaud , IRISA/INRIA, France
Andreas Sembrant , Uppsala University, Sweden
Erik Hagersten , Uppsala University, Sweden
pp. 247-259

LaZy Superscalar (Abstract)

Gorkem Asilioglu , Department of Computer Science, Michigan Technological University, USA
Zhaoxiang Jin , Department of Computer Science, Michigan Technological University, USA
Murat Koksal , Department of Computer Science, Michigan Technological University, USA
Omkar Javeri , Department of Computer Science, Michigan Technological University, USA
Soner Onder , Department of Computer Science, Michigan Technological University, USA
pp. 260-271

The Load Slice Core microarchitecture (Abstract)

Trevor E. Carlson , Uppsala University, Sweden
Wim Heirman , Intel, ExaScience Lab, Belgium
Osman Allam , Ghent University, Belgium
Stefanos Kaxiras , Uppsala University, Sweden
Lieven Eeckhout , Ghent University, Belgium
pp. 272-284

Semantic locality and context-based prefetching using reinforcement learning (Abstract)

Leeor Peled , Electrical Engineering, Technion-Israel Institute of Technology, Israel
Shie Mannor , Electrical Engineering, Technion-Israel Institute of Technology, Israel
Uri Weiser , Electrical Engineering, Technion-Israel Institute of Technology, Israel
Yoav Etsion , Electrical Engineering, Technion-Israel Institute of Technology, Israel
pp. 285-297

Exploring the potential of heterogeneous Von Neumann/dataflow execution models (Abstract)

Tony Nowatzki , University of Wisconsin - Madison, USA
Vinay Gangadhar , University of Wisconsin - Madison, USA
Karthikeyan Sankaralingam , University of Wisconsin - Madison, USA
pp. 298-310

SHRINK: Reducing the ISA complexity via instruction recycling (Abstract)

Bruno Cardoso Lopes , University of Campinas (UNICAMP), Brazil
Rafael Auler , University of Campinas (UNICAMP), Brazil
Luiz Ramos , University of Campinas (UNICAMP), Brazil
Edson Borin , University of Campinas (UNICAMP), Brazil
Rodolfo Azevedo , University of Campinas (UNICAMP), Brazil
pp. 311-322

Branch vanguard: Decomposing branch functionality into prediction and resolution instructions (Abstract)

Daniel S. McFarlin , Carnegie Mellon University, USA
Craig Zilles , University of Illinois at Urbana-Champaign, USA
pp. 323-335

PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture (Abstract)

Junwhan Ahn , Seoul National University, Korea
Sungjoo Yoo , Seoul National University, Korea
Onur Mutlu , Carnegie Mellon University, USA
Kiyoung Choi , Seoul National University, Korea
pp. 336-348

SLIP: Reducing wire energy in the memory hierarchy (Abstract)

Subhasis Das , Stanford University, USA
Tor M. Aamodt , University of British Columbia, Canada
William J. Dally , Stanford University, USA
pp. 349-361

Reducing world switches in virtualized environment with flexible cross-world calls (Abstract)

Wenhao Li , Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, China
Yubin Xia , Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, China
Haibo Chen , Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, China
Binyu Zang , Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, China
Haibing Guan , Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, China
pp. 375-387

ArMOR: Defending against memory consistency model mismatches in heterogeneous architectures (Abstract)

Daniel Lustig , Princeton University, USA
Caroline Trippel , Princeton University, USA
Michael Pellauer , NVIDIA Research, USA
Margaret Martonosi , Princeton University, USA
pp. 388-400

Clean: A race detector with cleaner semantics (Abstract)

Cedomir Segulja , The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Canada
Tarek S. Abdelrahman , The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Canada
pp. 401-413

MiSAR: Minimalistic synchronization accelerator with resource overflow management (Abstract)

Ching-Kai Liang , Georgia Institute of Technology, USA
Milos Prvulovic , Georgia Institute of Technology, USA
pp. 414-426

Callback: Efficient synchronization without invalidation with a directory just for spin-waiting (Abstract)

Alberto Ros , Department of Computer Engineering, Universidad de Murcia, Spain
Stefanos Kaxiras , Department of Information Technology, Uppsala University, Sweden
pp. 427-438

Thermal time shifting: Leveraging phase change materials to reduce cooling costs in warehouse-scale computers (Abstract)

Matt Skach , University of Michigan, USA
Manish Arora , Advanced Micro Devices, Inc., USA
Chang-Hong Hsu , University of Michigan, USA
Qi Li , University of California, San Diego, USA
Dean Tullsen , University of California, San Diego, USA
Lingjia Tang , University of Michigan, USA
Jason Mars , University of Michigan, USA
pp. 439-449

Heracles: Improving resource efficiency at scale (Abstract)

David Lo , Stanford University, USA
Liqun Cheng , Google, Inc., USA
Rama Govindaraju , Google, Inc., USA
Parthasarathy Ranganathan , Google, Inc., USA
Christos Kozyrakis , Stanford University, USA
pp. 450-462

HEB: Deploying and managing hybrid energy buffers for improving datacenter efficiency and economy (Abstract)

Longjun Liu , School of Electrical and Information Engineering, Xi'an Jiaotong University, China
Chao Li , Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
Hongbin Sun , School of Electrical and Information Engineering, Xi'an Jiaotong University, China
Yang Hu , Department of Electrical and Computer Engineering, University of Florida, USA
Juncheng Gu , Department of Electrical and Computer Engineering, University of Florida, USA
Tao Li , Department of Electrical and Computer Engineering, University of Florida, USA
Jingmin Xin , School of Electrical and Information Engineering, Xi'an Jiaotong University, China
Nanning Zheng , School of Electrical and Information Engineering, Xi'an Jiaotong University, China
pp. 463-475

Architecting to achieve a billion requests per second throughput on a single key-value store server platform (Abstract)

Sheng Li , Intel Labs, USA
Hyeontaek Lim , Carnegie Mellon University, USA
Victor W. Lee , Intel Labs, USA
Jung Ho Ahn , Seoul National University, Korea
Anuj Kalia , Carnegie Mellon University, USA
Michael Kaminsky , Intel Labs, USA
David G. Andersen , Carnegie Mellon University, USA
Seongil O , Seoul National University, Korea
Sukhan Lee , Seoul National University, Korea
Pradeep Dubey , Intel Labs, USA
pp. 476-488

A variable warp size architecture (Abstract)

Timothy G. Rogers , University of British Columbia, Canada
Daniel R. Johnson , NVIDIA, USA
Mike O'Connor , NVIDIA, USA
Stephen W. Keckler , NVIDIA, USA
pp. 489-501

Warped-Compression: Enabling power efficient GPUs through register compression (Abstract)

Sangpil Lee , Yonsei University, Korea
Keunsoo Kim , Yonsei University, Korea
Gunjae Koo , University of Southern California, USA
Hyeran Jeon , University of Southern California, USA
Won Woo Ro , Yonsei University, Korea
Murali Annavaram , University of Southern California, USA
pp. 502-514

CAWA: Coordinated warp scheduling and Cache Prioritization for critical warp acceleration of GPGPU workloads (Abstract)

Shin-Ying Lee , School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, USA
Akhil Arunkumar , School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, USA
Carole-Jean Wu , School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, USA
pp. 515-527

Dynamic Thread Block Launch: A lightweight execution mechanism to support irregular applications on GPUs (Abstract)

Jin Wang , Georgia Institute of Technology, USA
Norm Rubin , NVIDIA Research, USA
Albert Sidelnik , NVIDIA Research, USA
Sudhakar Yalamanchili , Georgia Institute of Technology, USA
pp. 528-540

DynaSpAM: Dynamic spatial architecture mapping using Out of Order instruction schedules (Abstract)

Feng Liu , Princeton University, USA
Heejin Ahn , Princeton University, USA
Stephen R. Beard , Princeton University, USA
Taewook Oh , Princeton University, USA
David I. August , Princeton University, USA
pp. 541-553

Rumba: An online quality management system for approximate computing (Abstract)

Daya S Khudia , University of Michigan, USA
Babak Zamirai , University of Michigan, USA
Mehrzad Samadi , University of Michigan, USA
Scott Mahlke , University of Michigan, USA
pp. 554-566

Manycore Network Interfaces for in-memory rack-scale computing (Abstract)

Alexandros Daglis , EcoCloud, EPFL, Switzerland
Stanko Novakovic , EcoCloud, EPFL, Switzerland
Edouard Bugnion , EcoCloud, EPFL, Switzerland
Babak Falsafi , EcoCloud, EPFL, Switzerland
Boris Grot , University of Edinburgh, UK
pp. 567-579

Unified address translation for memory-mapped SSDs with FlashMap (Abstract)

Jian Huang , Georgia Institute of Technology, USA
Anirudh Badam , Microsoft Research, USA
Moinuddin K. Qureshi , Georgia Institute of Technology, USA
Karsten Schwan , Georgia Institute of Technology, USA
pp. 580-591

FASE: Finding Amplitude-modulated Side-channel Emanations (Abstract)

Robert Callan , Georgia Institute of Technology, USA
Alenka Zajic , Georgia Institute of Technology, USA
Milos Prvulovic , Georgia Institute of Technology, USA
pp. 592-603

Probable cause: The deanonymizing effects of approximate DRAM (Abstract)

Amir Rahmati , University of Michigan, USA
Matthew Hicks , University of Michigan, USA
Daniel E. Holcomb , University of Michigan, USA
Kevin Fu , University of Michigan, USA
pp. 604-615

PrORAM: Dynamic prefetcher for Oblivious RAM (Abstract)

Xiangyao Yu , Massachusetts Institute of Technology, USA
Syed Kamran Haider , University of Connecticut, USA
Ling Ren , Massachusetts Institute of Technology, USA
Christopher Fletcher , Massachusetts Institute of Technology, USA
Albert Kwon , Massachusetts Institute of Technology, USA
Marten van Dijk , University of Connecticut, USA
Srinivas Devadas , Massachusetts Institute of Technology, USA
pp. 616-628

MBus: An ultra-low power interconnect bus for next generation nanopower systems (Abstract)

Pat Pannuto , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
Yoonmyung Lee , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
Ye-Sheng Kuo , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
ZhiYoong Foo , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
Benjamin Kempke , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
Gyouho Kim , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
Ronald G. Dreslinski , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
David Blaauw , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
Prabal Dutta , Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, 48109, USA
pp. 629-641

Accelerating asynchronous programs through Event Sneak Peek (Abstract)

Gaurav Chadha , University of Michigan, Ann Arbor, USA
Scott Mahlke , University of Michigan, Ann Arbor, USA
Satish Narayanasamy , University of Michigan, Ann Arbor, USA
pp. 642-654

VIP: Virtualizing IP chains on handheld platforms (Abstract)

Nachiappan Chidambaram Nachiappan , The Pennsylvania State University, USA
Haibo Zhang , The Pennsylvania State University, USA
Jihyun Ryoo , The Pennsylvania State University, USA
Niranjan Soundararajan , Intel Corporation, USA
Anand Sivasubramaniam , The Pennsylvania State University, USA
Mahmut T. Kandemir , The Pennsylvania State University, USA
Ravi Iyer , Intel Corporation, USA
Chita R. Das , The Pennsylvania State University, USA
pp. 655-667

FaultHound: Value-locality-based soft-fault tolerance (Abstract)

Nitin , School of Electrical and Computer Engineering, Purdue University, USA
Irith Pomeranz , School of Electrical and Computer Engineering, Purdue University, USA
T. N. Vijaykumar , School of Electrical and Computer Engineering, Purdue University, USA
pp. 668-681

COP: To compress and protect main memory (Abstract)

David J. Palframan , Department of Electrical and Computer Engineering, University of Wisconsin-Madison, USA
Nam Sung Kim , Department of Electrical and Computer Engineering, University of Wisconsin-Madison, USA
Mikko H. Lipasti , Department of Electrical and Computer Engineering, University of Wisconsin-Madison, USA
pp. 682-693

Hi-fi playback: Tolerating position errors in shift operations of racetrack memory (Abstract)

Chao Zhang , Center for Energy-efficient Computing and Applications, Peking University, Beijing, 100871, China
Guangyu Sun , Center for Energy-efficient Computing and Applications, Peking University, Beijing, 100871, China
Xian Zhang , Center for Energy-efficient Computing and Applications, Peking University, Beijing, 100871, China
Weiqi Zhang , Center for Energy-efficient Computing and Applications, Peking University, Beijing, 100871, China
Weisheng Zhao , Spintronics Interdisciplinary Center, Beihang University, 100191, China
Tao Wang , Center for Energy-efficient Computing and Applications, Peking University, Beijing, 100871, China
Yun Liang , Center for Energy-efficient Computing and Applications, Peking University, Beijing, 100871, China
Yongpan Liu , Department of Electrical Engineering, Tsinghua University, 100084, China
Yu Wang , Department of Electrical Engineering, Tsinghua University, 100084, China
Jiwu Shu , Department of Computer Science and Technology, Tsinghua University, 100084, China
pp. 694-706

Stash: Have your scratchpad and cache it too (Abstract)

Rakesh Komuravelli , University of Illinois at Urbana-Champaign, USA
Matthew D. Sinclair , University of Illinois at Urbana-Champaign, USA
Johnathan Alsop , University of Illinois at Urbana-Champaign, USA
Muhammad Huzaifa , University of Illinois at Urbana-Champaign, USA
Maria Kotsifakou , University of Illinois at Urbana-Champaign, USA
Prakalp Srivastava , University of Illinois at Urbana-Champaign, USA
Sarita V. Adve , University of Illinois at Urbana-Champaign, USA
Vikram S. Adve , University of Illinois at Urbana-Champaign, USA
pp. 707-719

Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures (Abstract)

Lluc Alvarez , Barcelona Supercomputing Center, Spain
Lluis Vilanova , Barcelona Supercomputing Center, Spain
Miquel Moreto , Barcelona Supercomputing Center, Spain
Marc Casas , Barcelona Supercomputing Center, Spain
Marc Gonzalez , Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Spain
Xavier Martorell , Barcelona Supercomputing Center, Spain
Nacho Navarro , Barcelona Supercomputing Center, Spain
Eduard Ayguade , Barcelona Supercomputing Center, Spain
Mateo Valero , Barcelona Supercomputing Center, Spain
pp. 720-732

Fusion: Design tradeoffs in coherent cache hierarchies for accelerators (Abstract)

Snehasish Kumar , School of Computing Sciences, Simon Fraser University, Canada
Arrvindh Shriraman , School of Computing Sciences, Simon Fraser University, Canada
Naveen Vedula , School of Computing Sciences, Simon Fraser University, Canada
pp. 733-745

Author index (PDF)

pp. 1-3