Proceedings of MICRO'95: 28th Annual IEEE/ACM International Symposium on Microarchitecture (1995)
Ann Arbor, MI, USA
Nov. 29, 1995 to Dec. 1, 1995
ISSN: 1072-4451
ISBN: 0-8186-7349-4
TABLE OF CONTENTS

[Front matter]

pp. 3-14

Dynamic path-based branch correlation

Ravi Nair, IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY
pp. 15-23

The predictability of branches in libraries

Brad Calder, Department of Computer Science, University of Colorado, Campus Box 430, Boulder, CO
Dirk Grunwald, Department of Computer Science, University of Colorado, Campus Box 430, Boulder, CO
Amitabh Srivastava, Digital Equipment Corporation, Western Research Laboratory, WRL-2, 250 University Avenue, Palo Alto, CA
pp. 24-34

The performance impact of incomplete bypassing in processor pipelines

Pritpal S. Ahuja, Department of Computer Science, Princeton University, 35 Olden Street, Princeton, New Jersey
Douglas W. Clark, Department of Computer Science, Princeton University, 35 Olden Street, Princeton, New Jersey
Anne Rogers, Department of Computer Science, Princeton University, 35 Olden Street, Princeton, New Jersey
pp. 36-45

Efficient instruction scheduling using finite state automata

Vasanth Bala, Hewlett Packard Labs
Norman Rubin, Digital Equipment Corp.
pp. 46-56

Critical path reduction for scalar programs

Michael Schlansker, Hewlett-Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA
Vinod Kathail, Hewlett-Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA
pp. 57-69

A limit study of local memory requirements using value reuse profiles

Andrew S. Huang, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
John P. Shen, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
pp. 71-81

Zero-cycle loads: microarchitecture support for reducing load latency

Todd M. Austin, University of Wisconsin-Madison, 1210 W. Dayton Street, Madison, WI
Gurindar S. Sohi, University of Wisconsin-Madison, 1210 W. Dayton Street, Madison, WI
pp. 82-92

A modified approach to data cache management

Gary Tyson, Department of Computer Science, University of California, Riverside, Riverside, CA
Matthew Farrens, Computer Science Department, University of California, Davis, Davis, CA
John Matthews, Computer Science Department, University of California, Davis, Davis, CA
Andrew R. Pleszkun, Department of Electrical and Computer Engineering, University of Colorado-Boulder, Boulder, CO
pp. 93-103

Petri net versus modulo scheduling for software pipelining

Vicki H. Allan, Department of Computer Science, Utah State University, Logan, Utah
U. R. Shah, Department of Computer Science, Utah State University, Logan, Utah
K. M. Reddy, Department of Computer Science, Utah State University, Logan, Utah
pp. 105-110

Modulo scheduling with multiple initiation intervals

Nancy J. Warter-Perez, Department of Electrical and Computer Engineering, California State University, Los Angeles
Noubar Partamian, Department of Electrical and Computer Engineering, California State University, Los Angeles
pp. 111-119

Spill-free parallel scheduling of basic blocks

B. Natarajan, Hewlett Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA
M. Schlansker, Hewlett Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA
pp. 119-124

Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation

Jack W. Davidson, Department of Computer Science, Thornton Hall, University of Virginia, Charlottesville, VA
Sanjay Jinturkar, Department of Computer Science, Thornton Hall, University of Virginia, Charlottesville, VA
pp. 125-132

Self-regulation of workload in the Manchester Data-Flow computer

John R. Gurd, Centre for Novel Computing, Department of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
David F. Snelling, Centre for Novel Computing, Department of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
pp. 135-145

The M-Machine multicomputer

Marco Fillo, Artificial Intelligence Laboratory, Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA
Stephen W. Keckler, Artificial Intelligence Laboratory, Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA
William J. Dally, Artificial Intelligence Laboratory, Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA
Nicholas P. Carter, Artificial Intelligence Laboratory, Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA
Andrew Chang, Artificial Intelligence Laboratory, Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA
Yevgeny Gurevich, Artificial Intelligence Laboratory, Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA
Whay S. Lee, Artificial Intelligence Laboratory, Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA
pp. 146-156

Region-based compilation: an introduction and motivation

Richard E. Hank, Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL
Wen-Mei W. Hwu, Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL
B. Ramakrishna Rau, Hewlett Packard Laboratories, Palo Alto, CA
pp. 158-168

An experimental study of several cooperative register allocation and instruction scheduling strategies

Cindy Norris, Mathematical Sciences, Appalachian State University, Boone, NC
Lori L. Pollock, Computer and Information Sciences, University of Delaware, Newark, DE
pp. 169-179

Register allocation for predicated code

Alexandre E. Eichenberger, Advanced Computer Architecture Laboratory, EECS Department, University of Michigan, Ann Arbor, MI
Edward S. Davidson, Advanced Computer Architecture Laboratory, EECS Department, University of Michigan, Ann Arbor, MI
pp. 180-191

Partial resolution in branch target buffers

Barry Fagin, Dept of Computer Science, US Air Force Academy
Kathryn Russell, Lockheed Sanders Corporation
pp. 193-198

A system level perspective on branch architecture performance

Brad Calder, Department of Computer Science, University of Colorado, Campus Box 430, Boulder, CO
Dirk Grunwald, Department of Computer Science, University of Colorado, Campus Box 430, Boulder, CO
Joel Emer, Digital Semiconductor, 77 Reed Road (HLO2-3/J3), Hudson, MA
pp. 199-206

Dynamic rescheduling: a technique for object code compatibility in VLIW architectures

Thomas M. Conte, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina
Sumedh W. Sathaye, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina
pp. 208-218

Improving CISC instruction decoding performance using a fill unit

Mark Smotherman, Dept. of Computer Science, Clemson University, Clemson, SC
Manoj Franklin, Dept. of Elect. and Computer Eng., Clemson University, Clemson, SC
pp. 219-229

SPAID: software prefetching in pointer- and call-intensive environments

Mikko H. Lipasti, Carnegie Mellon University and IBM Corporation, 3705 Highway 52 North, Rochester, MN
William J. Schmidt, IBM Corporation, 3705 Highway 52 North, Rochester, MN
Steven R. Kunkel, IBM Corporation, 3705 Highway 52 North, Rochester, MN
Robert R. Roediger, IBM Corporation, 3705 Highway 52 North, Rochester, MN
pp. 231-236

An effective programmable prefetch engine for on-chip caches

Tien-Fu Chen, Department of Computer Science, National Chung Cheng University, Chiayi, Taiwan 621, ROC
pp. 237-242

Cache miss heuristics and preloading techniques for general-purpose programs

Toshihiro Ozawa, Fujitsu Laboratories Ltd., 1015, Kamikodanaka, Nakahara-ku, Kawasaki 211, Japan
Yasunori Kimura, Fujitsu Laboratories Ltd., 1015, Kamikodanaka, Nakahara-ku, Kawasaki 211, Japan
Shin'ichiro Nishizaki, Fujitsu Social Science Laboratory Ltd., 1-403, Kosugi-cho, Nakahara-ku, Kawasaki 211, Japan
pp. 243-248

Alternative implementations of hybrid branch predictors

Po-Ying Chang, Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, Michigan
Eric Hao, Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, Michigan
Yale N. Patt, Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, Michigan
pp. 252-257

Control flow prediction with tree-like subgraphs for superscalar processors

Simonjit Dutta, Texas Instruments, Semiconductor Group, P.O. Box 655303, M.S. 8316, Dallas, TX
Manoj Franklin, Dept. of Electrical and Computer Engineering, Clemson University, 221-C Riggs Hall, Clemson, SC
pp. 258-263

The role of adaptivity in two-level adaptive branch prediction

Stuart Sechrest, EECS Department, University of Michigan, 1301 Beal Ave., Ann Arbor, Michigan
Chih-Chieh Lee, EECS Department, University of Michigan, 1301 Beal Ave., Ann Arbor, Michigan
Trevor Mudge, EECS Department, University of Michigan, 1301 Beal Ave., Ann Arbor, Michigan
pp. 264-269

Design of storage hierarchy in multithreaded architectures

Lucas Roh, Math. & Computer Science Division, Argonne National Laboratory, Argonne, IL
Walid A. Najjar, Department of Computer Science, Colorado State University, Fort Collins, CO
pp. 271-278

An investigation of the performance of various instruction-issue buffer topologies

Stéphan Jourdan, Institut de Recherche en Informatique de Toulouse, Université Toulouse III, 118 route de Narbonne, 31062 Toulouse, France
Pascal Sainrat, Institut de Recherche en Informatique de Toulouse, Université Toulouse III, 118 route de Narbonne, 31062 Toulouse, France
Daniel Litaize, Institut de Recherche en Informatique de Toulouse, Université Toulouse III, 118 route de Narbonne, 31062 Toulouse, France
pp. 279-284

Decoupling integer execution in superscalar processors

Subbarao Palacharla, Computer Sciences Department, University of Wisconsin-Madison, Madison, WI
J. E. Smith, Department of Electrical and Computer Engineering, University of Wisconsin-Madison, Madison, WI
pp. 285-290

Exploiting short-lived variables in superscalar processors

Luis A. Lozano, Hewlett-Packard, California Language Laboratory and School of Computer Science, McGill University, Montreal, Quebec, Canada H3A 2A7
Guang R. Gao, School of Computer Science, McGill University, Montreal, Quebec, Canada H3A 2A7
pp. 292-302

Partitioned register file for TTAs

Johan Janssen, Department of Electrical Engineering, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
Henk Corporaal, Department of Electrical Engineering, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
pp. 303-312

Disjoint eager execution: an optimal form of speculative execution

Augustus K. Uht, Department of Electrical and Computer Engineering, University of Rhode Island, Kingston, RI
Vijay Sindagi, Department of Electrical and Computer Engineering, University of Rhode Island, Kingston, RI
Kelley Hall, University of Rhode Island, Kingston, RI
pp. 313-325

Unrolling-based optimizations for modulo scheduling

Daniel M. Lavery, Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL
Wen-Mei W. Hwu, Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL
pp. 327-337

Stage scheduling: a technique to reduce the register requirements of a modulo schedule

Alexandre E. Eichenberger, Advanced Computer Architecture Laboratory, EECS Department, University of Michigan, Ann Arbor, MI
Edward S. Davidson, Advanced Computer Architecture Laboratory, EECS Department, University of Michigan, Ann Arbor, MI
pp. 338-349

Hypernode reduction modulo scheduling

Josep Llosa, Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord, Mòdul D6, Gran Capità s/n, 08071, Barcelona, Spain
Mateo Valero, Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord, Mòdul D6, Gran Capità s/n, 08071, Barcelona, Spain
Eduard Ayguadé, Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord, Mòdul D6, Gran Capità s/n, 08071, Barcelona, Spain
Antonio González, Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord, Mòdul D6, Gran Capità s/n, 08071, Barcelona, Spain
pp. 350-360