2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2009)
New York, NY
Dec. 12, 2009 to Dec. 16, 2009
ISSN: 1072-4451
ISBN: 978-1-60558-798-1
TABLE OF CONTENTS

Characterizing and mitigating the impact of process variations on phase change based memory systems

Wangyuan Zhang , Intelligent Design of Efficient Architecture Lab (IDEAL), Department of Electrical and Computer Engineering, University of Florida, USA
Tao Li , Intelligent Design of Efficient Architecture Lab (IDEAL), Department of Electrical and Computer Engineering, University of Florida, USA
pp. 2-13

Enhancing lifetime and security of PCM-based Main Memory with Start-Gap Wear Leveling

Moinuddin K. Qureshi , IBM Research, T. J. Watson Research Center, Yorktown Heights NY 10598, USA
John Karidis , IBM Research, T. J. Watson Research Center, Yorktown Heights NY 10598, USA
Michele Franceschini , IBM Research, T. J. Watson Research Center, Yorktown Heights NY 10598, USA
Vijayalakshmi Srinivasan , IBM Research, T. J. Watson Research Center, Yorktown Heights NY 10598, USA
Luis Lastras , IBM Research, T. J. Watson Research Center, Yorktown Heights NY 10598, USA
Bulent Abali , IBM Research, T. J. Watson Research Center, Yorktown Heights NY 10598, USA
pp. 14-23

Characterizing flash memory: Anomalies, observations, and applications

Laura M. Grupp , The Department of Computer Science and Engineering, University of California, San Diego, USA
Adrian M. Caulfield , The Department of Computer Science and Engineering, University of California, San Diego, USA
Joel Coburn , The Department of Computer Science and Engineering, University of California, San Diego, USA
Steven Swanson , The Department of Computer Science and Engineering, University of California, San Diego, USA
Eitan Yaakobi , The Center for Magnetic Recording Research, University of California, San Diego, USA
Paul H. Siegel , The Center for Magnetic Recording Research, University of California, San Diego, USA
Jack K. Wolf , The Center for Magnetic Recording Research, University of California, San Diego, USA
pp. 24-33

Complexity effective memory access scheduling for many-core accelerator architectures

George L. Yuan , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, CANADA
Ali Bakhoda , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, CANADA
Tor M. Aamodt , Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, CANADA
pp. 34-44

Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping

Chi-Keung Luk , Software Pathfinding and Innovations, Software and Services Group, Intel Corporation, Hudson, MA 01749, USA
Sunpyo Hong , Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, 30332, USA
Hyesoon Kim , College of Computing, School of Computer Science, Georgia Institute of Technology, Atlanta, 30332, USA
pp. 45-55

DDT: Design and evaluation of a dynamic program analysis for optimizing data structure usage

Changhee Jung , College of Computing, Georgia Institute of Technology, USA
Nathan Clark , College of Computing, Georgia Institute of Technology, USA
pp. 56-66

Tree register allocation

Hongbo Rong , Microsoft Corporation, USA
pp. 67-77

Portable compiler optimisation across embedded programs and microarchitectures using machine learning

Christophe Dubach , HiPEAC, School of Informatics, University of Edinburgh, UK
Timothy M. Jones , HiPEAC, School of Informatics, University of Edinburgh, UK
Edwin V. Bonilla , HiPEAC, School of Informatics, University of Edinburgh, UK
Grigori Fursin , HiPEAC, INRIA Saclay, France
Michael F.P. O'Boyle , HiPEAC, School of Informatics, University of Edinburgh, UK
pp. 78-88

Improving cache lifetime reliability at ultra-low voltages

Zeshan Chishti , Oregon Microarchitecture Research, Intel Labs, USA
Alaa R. Alameldeen , Oregon Microarchitecture Research, Intel Labs, USA
Chris Wilkerson , Oregon Microarchitecture Research, Intel Labs, USA
Wei Wu , Oregon Microarchitecture Research, Intel Labs, USA
Shih-Lien Lu , Oregon Microarchitecture Research, Intel Labs, USA
pp. 89-99

ZerehCache: Armoring cache architectures in high defect density technologies

Amin Ansari , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
Shantanu Gupta , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
Shuguang Feng , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
Scott Mahlke , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
pp. 100-110

Low Vccmin fault-tolerant cache with highly predictable performance

Jaume Abella , Intel Barcelona Research Center, Intel Labs Barcelona - UPC (Spain)
Javier Carretero , Intel Barcelona Research Center, Intel Labs Barcelona - UPC (Spain)
Pedro Chaparro , Intel Barcelona Research Center, Intel Labs Barcelona - UPC (Spain)
Xavier Vera , Intel Barcelona Research Center, Intel Labs Barcelona - UPC (Spain)
Antonio Gonzalez , Intel Barcelona Research Center, Intel Labs Barcelona - UPC (Spain)
pp. 111-121

mSWAT: Low-cost hardware fault detection and diagnosis for multicore systems

Siva Kumar Sastry Hari , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Man-Lap Li , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Pradeep Ramachandran , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Byn Choi , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Sarita V. Adve , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
pp. 122-132

BulkCompiler: High-performance Sequential Consistency through cooperative compiler and hardware support

W. Ahn , University of Illinois at Urbana-Champaign, USA
S. Qi , University of Illinois at Urbana-Champaign, USA
M. Nicolaides , University of Illinois at Urbana-Champaign, USA
J. Torrellas , University of Illinois at Urbana-Champaign, USA
J.-W. Lee , Purdue University, USA
X. Fang , Purdue University, USA
S. Midkiff , Purdue University, USA
David Wong , Intel Corporation, USA
pp. 133-144

EazyHTM: EAger-LaZY hardware Transactional Memory

Sasa Tomic , BSC-Microsoft Research Centre, Spain
Cristian Perfumo , BSC-Microsoft Research Centre, Spain
Chinmay Kulkarni , BSC-Microsoft Research Centre, Spain
Adria Armejach , BSC-Microsoft Research Centre, Spain
Adrian Cristal , BSC-Microsoft Research Centre, Spain
Osman Unsal , BSC-Microsoft Research Centre, Spain
Tim Harris , Microsoft Research Cambridge, UK
Mateo Valero , BSC-Microsoft Research Centre, Spain
pp. 145-155

Proactive transaction scheduling for contention management

Geoffrey Blake , University of Michigan, Ann Arbor, USA
Ronald G. Dreslinski , University of Michigan, Ann Arbor, USA
Trevor Mudge , University of Michigan, Ann Arbor, USA
pp. 156-167

Into the wild: Studying real user activity patterns to guide power optimizations for mobile architectures

Alex Shye , Northwestern University, Electrical Engineering and Computer Science Department, USA
Benjamin Scholbrock , Northwestern University, Electrical Engineering and Computer Science Department, USA
Gokhan Memik , Northwestern University, Electrical Engineering and Computer Science Department, USA
pp. 168-178

A microarchitecture-based framework for pre- and post-silicon power delivery analysis

Mahesh Ketkar , Strategic CAD Labs, Intel Corporation, USA
Eli Chiprout , Strategic CAD Labs, Intel Corporation, USA
pp. 179-188

Reducing peak power with a table-driven adaptive processor core

Vasileios Kontorinis , University of California, San Diego, USA
Amirali Shayan , University of California, San Diego, USA
Dean M. Tullsen , University of California, San Diego, USA
Rakesh Kumar , University of Illinois, Urbana-Champaign, USA
pp. 189-200

Extending the effectiveness of 3D-stacked DRAM caches with an adaptive multi-queue policy

Gabriel H. Loh , Georgia Institute of Technology, College of Computing, Atlanta, USA
pp. 201-212

An hybrid eDRAM/SRAM macrocell to implement first-level data caches

Alejandro Valero , Dpto. de Informática de Sistemas y Computadores, Universidad Politécnica de Valencia, Spain
Julio Sahuquillo , Dpto. de Informática de Sistemas y Computadores, Universidad Politécnica de Valencia, Spain
Salvador Petit , Dpto. de Informática de Sistemas y Computadores, Universidad Politécnica de Valencia, Spain
Vicente Lorente , Dpto. de Informática de Sistemas y Computadores, Universidad Politécnica de Valencia, Spain
Ramon Canal , Department of Computer Architecture, Universitat Politècnica de Catalunya, Barcelona, Spain
Pedro Lopez , Dpto. de Informática de Sistemas y Computadores, Universidad Politécnica de Valencia, Spain
Jose Duato , Dpto. de Informática de Sistemas y Computadores, Universidad Politécnica de Valencia, Spain
pp. 213-221

Variation-tolerant non-uniform 3D cache management in die stacked multicore processor

Bo Zhao , Electrical and Computer Engineering Department, University of Pittsburgh, PA 15261, USA
Yu Du , Department of Computer Science, University of Pittsburgh, PA 15261, USA
Youtao Zhang , Department of Computer Science, University of Pittsburgh, PA 15261, USA
Jun Yang , Electrical and Computer Engineering Department, University of Pittsburgh, PA 15261, USA
pp. 222-231

In-Network Coherence Filtering: Snoopy coherence without broadcasts

Niket Agarwal , Department of Electrical Engineering, Princeton University, USA
Li-Shiuan Peh , Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA
Niraj K. Jha , Department of Electrical Engineering, Princeton University, USA
pp. 232-243

SCARAB: A single cycle adaptive routing and bufferless network

Mitchell Hayenga , University of Wisconsin-Madison, USA
Natalie Enright Jerger , University of Toronto, USA
Mikko Lipasti , University of Wisconsin-Madison, USA
pp. 244-254

Low-cost router microarchitecture for on-chip networks

John Kim , KAIST, Department of Computer Science, Daejeon, Korea
pp. 255-266

Preemptive Virtual Clock: A flexible, efficient, and cost-effective QOS scheme for networks-on-chip

Boris Grot , Department of Computer Sciences, The University of Texas at Austin, USA
Stephen W. Keckler , Department of Computer Sciences, The University of Texas at Austin, USA
Onur Mutlu , Computer Architecture Laboratory (CALCM), Carnegie Mellon University, USA
pp. 268-279

Application-aware prioritization mechanisms for on-chip networks

Reetuparna Das , Pennsylvania State University, USA
Onur Mutlu , Carnegie Mellon University, USA
Thomas Moscibroda , Microsoft Research, USA
Chita R. Das , Pennsylvania State University, USA
pp. 280-291

A case for dynamic frequency tuning in on-chip networks

Asit K. Mishra , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Reetuparna Das , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Soumya Eachempati , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Ravi Iyer , Integrated Platforms Lab, Intel Corporation, Hillsboro, OR 97124, USA
N. Vijaykrishnan , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
Chita R. Das , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16801, USA
pp. 292-303

Light speed arbitration and flow control for nanophotonic interconnects

Dana Vantrease , Univ of Wisconsin - Madison, USA
Nathan Binkert , HP Laboratories, USA
Robert Schreiber , HP Laboratories, USA
Mikko H. Lipasti , Univ of Wisconsin - Madison, USA
pp. 304-315

Coordinated control of multiple prefetchers in multi-core systems

Eiman Ebrahimi , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
Onur Mutlu , Computer Architecture Laboratory (CALCM), Carnegie Mellon University, USA
Chang Joo Lee , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
Yale N. Patt , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
pp. 316-326

Improving memory Bank-Level Parallelism in the presence of prefetching

Chang Joo Lee , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
Veynu Narasiman , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
Onur Mutlu , Computer Architecture Laboratory (CALCM), Carnegie Mellon University, USA
Yale N. Patt , Department of Electrical and Computer Engineering, The University of Texas at Austin, USA
pp. 327-336

ESKIMO - energy savings using semantic knowledge of inconsequential memory occupancy for DRAM subsystem

Ciji Isen , ECE Department, University of Texas at Austin, USA
Lizy John , ECE Department, University of Texas at Austin, USA
pp. 337-346

Flip-N-Write: A simple deterministic technique to improve PRAM write performance, energy and endurance

Sangyeun Cho , Computer Science Department, University of Pittsburgh, USA
Hyunjin Lee , Computer Science Department, University of Pittsburgh, USA
pp. 347-357

Using a configurable processor generator for computer architecture prototyping

Alex Solomatnikov , Hicamp Systems, Inc., USA
Amin Firoozshahian , Hicamp Systems, Inc., USA
Ofer Shacham , Stanford University, USA
Zain Asgar , Stanford University, USA
Megan Wachs , Stanford University, USA
Wajahat Qadeer , Stanford University, USA
Stephen Richardson , Stanford University, USA
Mark Horowitz , Stanford University, USA
pp. 358-369

Polymorphic Pipeline Array: A flexible multicore accelerator with virtualized execution for mobile multimedia applications

Hyunchul Park , Texas Instruments, Inc., USA
Yongjun Park , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
Scott Mahlke , Advanced Computer Architecture Laboratory, University of Michigan, Ann Arbor, 48109, USA
pp. 370-380

Ordering decoupled metadata accesses in multiprocessors

Hari Kannan , Computer Systems Laboratory, Stanford University, USA
pp. 381-390

Control flow obfuscation with information flow tracking

Haibo Chen , Parallel Processing Institute, Fudan University, China
Liwei Yuan , Parallel Processing Institute, Fudan University, China
Xi Wu , Parallel Processing Institute, Fudan University, China
Binyu Zang , Parallel Processing Institute, Fudan University, China
Bo Huang , Intel China Software Center, China
Pen-chung Yew , Department of Computer Science and Engineering, University of Minnesota, USA
pp. 391-400

Pseudo-LIFO: The foundation of a new family of replacement policies for last-level caches

Mainak Chaudhuri , Indian Institute of Technology, Kanpur 208016, INDIA
pp. 401-412

Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems

Daniel Hackenberg , Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, 01062, Germany
Daniel Molka , Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, 01062, Germany
Wolfgang E. Nagel , Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, 01062, Germany
pp. 413-422

A Tagless Coherence Directory

Jason Zebchuk , Dept. of Electrical and Computer Engineering, University of Toronto, Canada
Moinuddin K. Qureshi , T.J. Watson Research Center, IBM, USA
Vijayalakshmi Srinivasan , T.J. Watson Research Center, IBM, USA
Andreas Moshovos , Dept. of Electrical and Computer Engineering, University of Toronto, Canada
pp. 423-434

Tribeca: Design for PVT variations with local recovery and fine-grained adaptation

Meeta S. Gupta , School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
Jude A. Rivers , IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
Pradip Bose , IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
Gu-Yeon Wei , School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
David Brooks , School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
pp. 435-446

The BubbleWrap many-core: Popping cores for sequential acceleration

Ulya R. Karpuzcu , University of Illinois at Urbana-Champaign, USA
Brian Greskamp , University of Illinois at Urbana-Champaign, USA
Josep Torrellas , University of Illinois at Urbana-Champaign, USA
pp. 447-458

Multiple clock and Voltage Domains for chip multi processors

Efraim Rotem , Intel Corporation, Israel
Avi Mendelson , Microsoft R&D, Israel
Ran Ginosar , Technion, Israel Institute of Technology, Israel
Uri Weiser , Technion, Israel Institute of Technology, Israel
pp. 459-468

McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures

Sheng Li , University of Notre Dame, USA
Jung Ho Ahn , Seoul National University, Korea
Richard D. Strong , University of California, San Diego, USA
Jay B. Brockman , University of Notre Dame, USA
Dean M. Tullsen , University of California, San Diego, USA
Norman P. Jouppi , Hewlett-Packard Labs, USA
pp. 469-480

Characterizing the resource-sharing levels in the UltraSPARC T2 processor

Vladimir Cakarevic , Barcelona Supercomputing Center (BSC), Spain
Petar Radojkovic , Barcelona Supercomputing Center (BSC), Spain
Javier Verdu , Universitat Politecnica de Catalunya (UPC), Spain
Alex Pajuelo , Universitat Politecnica de Catalunya (UPC), Spain
Francisco J. Cazorla , Barcelona Supercomputing Center (BSC), Spain
Mario Nemirovsky , Barcelona Supercomputing Center (BSC), Spain
Mateo Valero , Barcelona Supercomputing Center (BSC), Spain
pp. 481-492

Execution leases: A hardware-supported mechanism for enforcing strong non-interference

Mohit Tiwari , Department of Computer Science, University of California, Santa Barbara, USA
Xun Li , Department of Computer Science, University of California, Santa Barbara, USA
Hassan M G Wassel , Department of Computer Science, University of California, Santa Barbara, USA
Frederic T Chong , Department of Computer Science, University of California, Santa Barbara, USA
Timothy Sherwood , Department of Computer Science, University of California, Santa Barbara, USA
pp. 493-504

Optimizing shared cache behavior of chip multiprocessors

Mahmut Kandemir , Pennsylvania State University, USA
Sai Prashanth Muralidhara , Pennsylvania State University, USA
Sri Hari Krishna Narayanan , Pennsylvania State University, USA
Yuanrui Zhang , Pennsylvania State University, USA
Ozcan Ozturk , Bilkent University, Turkey
pp. 505-516

SHARP control: Controlled shared cache management in chip multiprocessors

Shekhar Srikantaiah , Department of CSE, The Pennsylvania State University, University Park, 16802, USA
Mahmut Kandemir , Department of CSE, The Pennsylvania State University, University Park, 16802, USA
Qian Wang , Department of MNE, The Pennsylvania State University, University Park, 16802, USA
pp. 517-528

Adaptive line placement with the set balancing cache

Dyer Rolan , Depto. de Electron. e Sist., Univ. da Coruña, A Coruña, Spain
Basilio B. Fraguela , Depto. de Electron. e Sist., Univ. da Coruña, A Coruña, Spain
Ramon Doallo , Depto. de Electron. e Sist., Univ. da Coruña, A Coruña, Spain
pp. 529-540

Light64: Lightweight hardware support for data race detection during Systematic Testing of parallel programs

Adrian Nistor , University of Illinois at Urbana-Champaign, USA
Darko Marinov , University of Illinois at Urbana-Champaign, USA
Josep Torrellas , University of Illinois at Urbana-Champaign, USA
pp. 541-552

Finding concurrency bugs with context-aware communication graphs

Brandon Lucia , University of Washington, USA
Luis Ceze , University of Washington, USA
pp. 553-563

Offline symbolic analysis for multi-processor execution replay

Dongyoon Lee , University of Michigan, Ann Arbor, USA
Mahmoud Said , Western Michigan University, USA
Satish Narayanasamy , University of Michigan, Ann Arbor, USA
Zijiang Yang , Western Michigan University, USA
Cristiano Pereira , Intel, USA
pp. 564-575

Architecting a chunk-based memory race recorder in Modern CMPs

Gilles Pokam , Intel Corporation, USA
Cristiano Pereira , Intel Corporation, USA
Klaus Danne , Intel Corporation, USA
Rolf Kassa , Intel Corporation, USA
Ali-Reza Adl-Tabatabai , Intel Corporation, USA
pp. 576-586