2009 IEEE 15th International Symposium on High Performance Computer Architecture (2009)
Raleigh, NC USA
Feb. 14, 2009 to Feb. 18, 2009
ISSN: 1530-0897
ISBN: 978-1-4244-2932-5
TABLE OF CONTENTS

Front matter

pp. i-xiv

Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems

Eiman Ebrahimi , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
Onur Mutlu , Computer Architecture Laboratory (CALCM), Carnegie Mellon University, USA
Yale N. Patt , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
pp. 7-17

Voltage emergency prediction: Using signatures to reduce operating margins

Vijay Janapa Reddi , Harvard University, USA
Meeta S. Gupta , Harvard University, USA
Glenn Holloway , Harvard University, USA
Gu-Yeon Wei , Harvard University, USA
Michael D. Smith , Harvard University, USA
David Brooks , Harvard University, USA
pp. 18-29

A low-radix and low-diameter 3D interconnection network design

Yi Xu , Dept. of Electrical and Computer Engineering, University of Pittsburgh, PA 15621, USA
Yu Du , Dept. of Computer Science, University of Pittsburgh, PA 15621, USA
Bo Zhao , Dept. of Electrical and Computer Engineering, University of Pittsburgh, PA 15621, USA
Xiuyi Zhou , Dept. of Electrical and Computer Engineering, University of Pittsburgh, PA 15621, USA
Youtao Zhang , Dept. of Computer Science, University of Pittsburgh, PA 15621, USA
Jun Yang , Dept. of Electrical and Computer Engineering, University of Pittsburgh, PA 15621, USA
pp. 30-42

Adaptive Spill-Receive for robust high-performance caching in CMPs

Moinuddin K. Qureshi , IBM Research, T. J. Watson Research Center, Yorktown Heights NY, USA
pp. 45-54

Design and implementation of software-managed caches for multicores with local memory

Sangmin Seo , School of Computer Science and Engineering, Seoul National University, Korea
Jaejin Lee , School of Computer Science and Engineering, Seoul National University, Korea
Zehra Sura , IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA
pp. 55-66

In-Network Snoop Ordering (INSO): Snoopy coherence on unordered interconnects

Niket Agarwal , Department of Electrical Engineering, Princeton University, NJ, 08544, USA
Li-Shiuan Peh , Department of Electrical Engineering, Princeton University, NJ, 08544, USA
Niraj K. Jha , Department of Electrical Engineering, Princeton University, NJ, 08544, USA
pp. 67-78

Practical off-chip meta-data for temporal memory streaming

Thomas F. Wenisch , University of Michigan, USA
Michael Ferdman , Ecole Polytechnique Fédérale de Lausanne and Carnegie Mellon University, USA
Anastasia Ailamaki , Ecole Polytechnique Fédérale de Lausanne and Carnegie Mellon University, USA
Babak Falsafi , Ecole Polytechnique Fédérale de Lausanne and Carnegie Mellon University, USA
Andreas Moshovos , University of Toronto, Canada
pp. 79-90

Soft error vulnerability aware process variation mitigation

Xin Fu , Department of ECE, University of Florida, USA
Tao Li , Department of ECE, University of Florida, USA
Jose A. B. Fortes , Department of ECE, University of Florida, USA
pp. 93-104

Accurate microarchitecture-level fault modeling for studying hardware faults

Man-Lap Li , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Pradeep Ramachandran , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Ulya R. Karpuzcu , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Siva Kumar Sastry Hari , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Sarita V. Adve , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
pp. 105-116

Eliminating microarchitectural dependency from Architectural Vulnerability

Vilas Sridharan , Department of Electrical and Computer Engineering, Northeastern University, USA
David R. Kaeli , Department of Electrical and Computer Engineering, Northeastern University, USA
pp. 117-128

Versatile prediction and fast estimation of Architectural Vulnerability Factor from processor performance metrics

Lide Duan , Louisiana State University, Baton Rouge, LA 70803, USA
Bin Li , Louisiana State University, Baton Rouge, LA 70803, USA
Lu Peng , Louisiana State University, Baton Rouge, LA 70803, USA
pp. 129-140

Opportunities beyond single-core microprocessors

Mark D. Hill , University of Wisconsin-Madison, USA
pp. 143-144

Multi-core demands multi-interfaces

Yale Patt , University of Texas at Austin, USA
pp. 147-148

Elastic-buffer flow control for on-chip networks

George Michelogiannakis , Computer Systems Laboratory, Stanford University, CA 94305, USA
James Balfour , Computer Systems Laboratory, Stanford University, CA 94305, USA
William J. Dally , Computer Systems Laboratory, Stanford University, CA 94305, USA
pp. 151-162

Express Cube Topologies for on-Chip Interconnects

Boris Grot , Department of Computer Sciences, The University of Texas at Austin, USA
Joel Hestness , Department of Computer Sciences, The University of Texas at Austin, USA
Stephen W. Keckler , Department of Computer Sciences, The University of Texas at Austin, USA
Onur Mutlu , Computer Architecture Laboratory (CALCM), Carnegie Mellon University, USA
pp. 163-174

Design and evaluation of a hierarchical on-chip interconnect for next-generation CMPs

Reetuparna Das , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Soumya Eachempati , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Asit K. Mishra , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Vijaykrishnan Narayanan , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Chita R. Das , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
pp. 175-186

Lightweight predication support for out of order processors

Mark Stephenson , IBM Austin Research Lab, 11501 Burnet Road, Austin TX 78759, USA
Lixin Zhang , IBM Austin Research Lab, 11501 Burnet Road, Austin TX 78759, USA
Ram Rangan , IBM Austin Research Lab, 11501 Burnet Road, Austin TX 78759, USA
pp. 201-212

Blueshift: Designing processors for timing speculation from the ground up

Brian Greskamp , Departments of Computer Science and of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
Lu Wan , Departments of Computer Science and of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
Ulya R. Karpuzcu , Departments of Computer Science and of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
Jeffrey J. Cook , Departments of Computer Science and of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
Josep Torrellas , Departments of Computer Science and of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
Deming Chen , Departments of Computer Science and of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
Craig Zilles , Departments of Computer Science and of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
pp. 213-224

PageNUCA: Selected policies for page-grain locality management in large shared chip-multiprocessor caches

Mainak Chaudhuri , Department of Computer Science and Engineering, Indian Institute of Technology, Kanpur 208016, INDIA
pp. 227-238

A novel architecture of the 3D stacked MRAM L2 cache for CMPs

Guangyu Sun , Pennsylvania State University, USA
Xiangyu Dong , Pennsylvania State University, USA
Yuan Xie , Pennsylvania State University, USA
Jian Li , IBM Austin Research Lab, USA
Yiran Chen , Seagate Technology, USA
pp. 239-249

Dynamic hardware-assisted software-controlled page placement to manage capacity allocation and sharing within large caches

Manu Awasthi , School of Computing, University of Utah, USA
Kshitij Sudan , School of Computing, University of Utah, USA
Rajeev Balasubramonian , School of Computing, University of Utah, USA
John Carter , School of Computing, University of Utah, USA
pp. 250-261

Optimizing communication and capacity in a 3D stacked reconfigurable cache hierarchy

Niti Madan , School of Computing, University of Utah, USA
Li Zhao , System Technology Lab, Intel Corporation, USA
Naveen Muralimanohar , School of Computing, University of Utah, USA
Aniruddha Udipi , School of Computing, University of Utah, USA
Rajeev Balasubramonian , School of Computing, University of Utah, USA
Ravishankar Iyer , System Technology Lab, Intel Corporation, USA
Srihari Makineni , System Technology Lab, Intel Corporation, USA
Donald Newell , System Technology Lab, Intel Corporation, USA
pp. 262-274

Reconciling specialization and flexibility through compound circuits

Sami Yehia , Thales Research & Technology, Embedded Systems Lab, France
Sylvain Girbal , Thales Research & Technology, Embedded Systems Lab, France
Hugues Berry , Alchemy Project, INRIA Saclay, France
Olivier Temam , Alchemy Project, INRIA Saclay, France
pp. 277-288

CAMP: A technique to estimate per-structure power at run-time using a few simple parameters

Michael D. Powell , Intel Massachusetts, USA
Arijit Biswas , Intel Massachusetts, USA
Joel S. Emer , Intel Massachusetts, USA
Shubhendu S. Mukherjee , Intel Massachusetts, USA
Basit R. Sheikh , Computer Systems Laboratory, Cornell University, USA
Shrirang Yardi , Department of Electrical and Computer Engineering, Virginia Tech, USA
pp. 289-300

Variation-aware dynamic voltage/frequency scaling

Sebastian Herbert , Department of Electrical and Computer Engineering, Carnegie Mellon University, USA
Diana Marculescu , Department of Electrical and Computer Engineering, Carnegie Mellon University, USA
pp. 301-312

Bridging the computation gap between programmable processors and hardwired accelerators

Kevin Fan , Parakinetics, Inc., USA
Manjunath Kudlur , NVIDIA Corporation, USA
Ganesh Dasika , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
Scott Mahlke , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
pp. 313-322

Industrial perspectives panel

Parthasarathy Ranganathan , Hewlett Packard Labs, USA
pp. 325-326

A first-order fine-grained multithreaded throughput model

Xi E. Chen , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, CANADA
Tor M. Aamodt , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, CANADA
pp. 329-340

Characterization of Direct Cache Access on multi-core systems and 10GbE

Amit Kumar , Intel Corporation, USA
Ram Huggahalli , Intel Corporation, USA
Srihari Makineni , Intel Corporation, USA
pp. 341-352

MRR: Enabling fully adaptive multicast routing for CMP interconnection networks

Pablo Abad , University of Cantabria, Spain
Valentin Puente , University of Cantabria, Spain
Jose-Angel Gregorio , University of Cantabria, Spain
pp. 355-366

Prediction router: Yet another low latency on-chip router architecture

Hiroki Matsutani , Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yokohama, JAPAN 223-8522
Michihiro Koibuchi , National Institute of Informatics, 2-1-2, Hitotsubashi, Chiyoda-ku, Tokyo, JAPAN 101-8430
Hideharu Amano , Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yokohama, JAPAN 223-8522
Tsutomu Yoshinaga , The University of Electro-Communications, 1-5-1, Chofugaoka, Chofu-shi, Tokyo, JAPAN 182-8585
pp. 367-378

Fast complete memory consistency verification

Yunji Chen , Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Yi Lv , Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Weiwu Hu , Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Tianshi Chen , University of Science and Technology of China, Hefei, Anhui 230027, China
Haihua Shen , Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Pengyu Wang , Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Hong Pan , Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
pp. 381-392

Hardware-software integrated approaches to defend against software cache-based side channel attacks

Jingfei Kong , University of Central Florida, USA
Onur Aciicmez , Samsung Electronics, Korea
Jean-Pierre Seifert , TU Berlin & Deutsche Telekom Laboratories, Germany
Huiyang Zhou , University of Central Florida, USA
pp. 393-404

Dacota: Post-silicon validation of the memory subsystem in multi-core designs

Andrew DeOrio , University of Michigan, Ann Arbor, USA
Ilya Wagner , University of Michigan, Ann Arbor, USA
Valeria Bertacco , University of Michigan, Ann Arbor, USA
pp. 405-416

Criticality-based optimizations for efficient load processing

Samantika Subramaniam , Georgia Institute of Technology, College of Computing, Atlanta, USA
Anne Bracy , Intel Corporation, Microarchitecture Research Laboratory, Santa Clara, CA, USA
Hong Wang , Intel Corporation, Microarchitecture Research Laboratory, Santa Clara, CA, USA
Gabriel H. Loh , Georgia Institute of Technology, College of Computing, Atlanta, USA
pp. 419-430

iCFP: Tolerating all-level cache misses in in-order processors

Andrew Hilton , Department of Computer and Information Science, University of Pennsylvania, USA
Santosh Nagarakatte , Department of Computer and Information Science, University of Pennsylvania, USA
Amir Roth , Department of Computer and Information Science, University of Pennsylvania, USA
pp. 431-442

Feedback mechanisms for improving probabilistic memory prefetching

Ibrahim Hur , IBM Corporation, Systems and Technology Group, Austin, TX, USA
Calvin Lin , The University of Texas at Austin, Department of Computer Sciences, USA
pp. 443-454