Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (2007)
Brasov, Romania
Sept. 15, 2007 to Sept. 19, 2007
ISSN: 1089-795X
ISBN: 0-7695-2944-5
TABLE OF CONTENTS
Introduction

Program Committee (PDF)

pp. xiii

Keynotes (PDF)

pp. xiv-xvii

Sponsors (PDF)

pp. xviii
Hardware Track (Session 1): Systems

Architectural Support for the Stream Execution Model on General-Purpose Processors (Abstract)

William J. Dally , Stanford University, USA
Jayanth Gummaraju , Stanford University, USA
Joel Coburn , Stanford University, USA
Mattan Erez , University of Texas at Austin, USA
Mendel Rosenblum , Stanford University, USA
pp. 3-12

A Flexible Heterogeneous Multi-Core Architecture (Abstract)

Miquel Pericas , Universitat Politecnica de Catalunya, Spain; Barcelona Supercomputing Center, Spain
Mateo Valero , Universitat Politecnica de Catalunya, Spain; Barcelona Supercomputing Center, Spain
Adrian Cristal , Barcelona Supercomputing Center, Spain
Francisco J. Cazorla , Barcelona Supercomputing Center, Spain
Daniel A. Jimenez , The University of Texas at San Antonio, USA
Ruben Gonzalez , Universitat Politecnica de Catalunya, Spain
pp. 13-24

Improving Performance Isolation on Chip Multiprocessors via an Operating System Scheduler (Abstract)

Margo Seltzer , Harvard University, USA
Michael D. Smith , Harvard University, USA
Alexandra Fedorova , Simon Fraser University, Canada
pp. 25-38
Software Track (Session 2): Pipelining

Speculative Decoupled Software Pipelining (Abstract)

Ram Rangan , Princeton University, USA
Neil Vachharajani , Princeton University, USA
Easwaran Raman , Princeton University, USA
Matthew J. Bridges , Princeton University, USA
Guilherme Ottoni , Princeton University, USA
David I. August , Princeton University, USA
pp. 49-59

Rotating Register Allocation for Enhanced Pipeline Scheduling (Abstract)

Suhyun Kim , IBM T.J. Watson Research Center, USA
Soo-Mook Moon , Seoul National University, Korea
pp. 60-72
Hardware Track (Session 3): Verification & Security

Unified Architectural Support for Soft-Error Protection or Software Bug Detection (Abstract)

Huiyang Zhou , University of Central Florida, USA
Martin Dimitrov , University of Central Florida, USA
pp. 73-82

Verification-Aware Microprocessor Design (Abstract)

Anita Lungu , Duke University
Daniel J. Sorin , Duke University
pp. 83-93

I2SEMS: Interconnects-Independent Security Enhanced Shared Memory Multiprocessor Systems (Abstract)

Minseon Ahn , Texas A&M University, USA
Manhee Lee , Texas A&M University, USA
Eun Jung Kim , Texas A&M University, USA
pp. 94-103

Error Detection Using Dynamic Dataflow Verification (Abstract)

Daniel J. Sorin , Duke University
Albert Meixner , Duke University
pp. 104-118
Software Track (Session 4): Optimizations

Extending Object-Oriented Optimizations for Concurrent Programs (Abstract)

David Tarditi , Microsoft Corporation
Michael D. Smith , Harvard University
Kelly Heffner , Harvard University
pp. 119-129

Language and Virtual Machine Support for Efficient Fine-Grained Futures in Java (Abstract)

Chandra Krintz , University of California, Santa Barbara, USA
Priya Nagpurkar , University of California, Santa Barbara, USA
Lingli Zhang , University of California, Santa Barbara, USA
pp. 130-139

Call-chain Software Instruction Prefetching in J2EE Server Applications (Abstract)

Mauricio Serrano , IBM T.J. Watson Research Center, USA
Harold W. Cain , IBM T.J. Watson Research Center, USA
Chandra Krintz , University of California, Santa Barbara, USA
Priya Nagpurkar , University of California, Santa Barbara, USA
Jong-Deok Choi , Samsung Electronics, Korea
pp. 140-149

Detecting Change in Program Behavior for Adaptive Optimization (Abstract)

Nitzan Peleg , IBM Haifa Research Lab
Bilha Mendelson , IBM Haifa Research Lab
pp. 150-162
Hardware Track (Session 5): Saving Energy

Reducing Energy Consumption of On-Chip Networks Through a Hybrid Compiler-Runtime Approach (Abstract)

Guangyu Chen , Microsoft Corporation
Mahmut Kandemir , Pennsylvania State University, USA
Feihui Li , Pennsylvania State University, USA
pp. 163-174

An Energy Efficient Parallel Architecture Using Near Threshold Operation (Abstract)

Trevor Mudge , University of Michigan-Ann Arbor, USA
Ronald G. Dreslinski , University of Michigan-Ann Arbor, USA
Bo Zhai , University of Michigan-Ann Arbor, USA
David Blaauw , University of Michigan-Ann Arbor, USA
Dennis Sylvester , University of Michigan-Ann Arbor, USA
pp. 175-188
Software Track (Session 6): Algorithms

AA-Sort: A New Parallel Sorting Algorithm for Multi-Core SIMD Processors (Abstract)

Hiroshi Inoue , IBM Tokyo Research Laboratory, Japan
Toshio Nakatani , IBM Tokyo Research Laboratory, Japan
Hideaki Komatsu , IBM Tokyo Research Laboratory, Japan
Takao Moriyama , IBM Tokyo Research Laboratory, Japan
pp. 189-198

The Fault Tolerant Parallel Algorithm: the Parallel Recomputing Based Failure Recovery (Abstract)

Zhiyuan Wang , National University of Defense Technology, China
Jia Jia , National University of Defense Technology, China
Xuejun Yang , National University of Defense Technology, China
Panfeng Wang , National University of Defense Technology, China
Yunfei Du , National University of Defense Technology, China
Hongyi Fu , National University of Defense Technology, China
Guang Suo , National University of Defense Technology, China
pp. 199-212
Hardware Track (Session 7): Processors

Paceline: Improving Single-Thread Performance in Nanoscale CMPs through Core Overclocking (Abstract)

Brian Greskamp , University of Illinois at Urbana-Champaign, USA
Josep Torrellas , University of Illinois at Urbana-Champaign, USA
pp. 213-224

Early Register Release for Out-of-Order Processors with Register Windows (Abstract)

Eduardo Quinones , Universitat Politecnica de Catalunya, Spain
Antonio Gonzalez , Universitat Politecnica de Catalunya, Spain; Intel Barcelona Research Center, Spain
Joan-Manuel Parcerisa , Universitat Politecnica de Catalunya, Spain
pp. 225-234

L1 Cache Filtering Through Random Selection of Memory References (Abstract)

Yoav Etsion , The Hebrew University of Jerusalem, Israel
Dror G. Feitelson , The Hebrew University of Jerusalem, Israel
pp. 235-244

Effective Management of DRAM Bandwidth in Multicore Processors (Abstract)

Mithuna Thottethodi , Purdue University, USA
Won-Taek Lim , Purdue University, USA
Nauman Rafique , Purdue University, USA
pp. 245-258
Software Track (Session 8): Compilers

A Loop Correlation Technique to Improve Performance Auditing (Abstract)

Brad Calder , University of California, San Diego, USA
Michael Hind , IBM T.J. Watson Research Center, USA
Jeremy Lau , University of California, San Diego, USA
Matthew Arnold , IBM T.J. Watson Research Center, USA
pp. 259-269

Latency Hiding in Multi-Threading and Multi-Processing of Network Applications (Abstract)

Jinquan Dai , Intel China Software Center, China
Xiaofeng Guo , Google Inc
Zhiyuan Lv , Intel China Software Center, China
Prashant R. Chandra , Intel Corporation
Long Li , Intel China Software Center, China
pp. 270-279

Introducing Control Flow into Vectorized Code (Abstract)

Jaewook Shin , Argonne National Laboratory, USA
pp. 280-291

Automatic Correction of Loop Transformations (Abstract)

Albert Cohen , INRIA, Paris-Sud 11 University, France
Nicolas Vasilache , INRIA, Paris-Sud 11 University, France
Louis-Noel Pouchet , INRIA, Paris-Sud 11 University, France
pp. 292-304
Hardware Track (Session 9): Modeling & Measurement

FAME: FAirly MEasuring Multithreaded Architectures (Abstract)

Alex Pajuelo , Universitat Politecnica de Catalunya, Spain
Enrique Fernandez , Universidad de Las Palmas de Gran Canaria, Spain
Mateo Valero , Barcelona Supercomputing Center, Spain; Universitat Politecnica de Catalunya, Spain
Francisco J. Cazorla , Barcelona Supercomputing Center, Spain
Oliverio J. Santana , Universidad de Las Palmas de Gran Canaria, Spain
Javier Vera , Barcelona Supercomputing Center, Spain
pp. 305-316

CIGAR: Application Partitioning for a CPU/Coprocessor Architecture (Abstract)

Mark J. Murphy , University of Illinois at Urbana-Champaign, USA
Nacho Navarro , Universitat Politecnica de Catalunya, Spain
Steve Lumetta , University of Illinois at Urbana-Champaign, USA
Isaac Gelado , Universitat Politecnica de Catalunya, Spain
Wen-mei Hwu , University of Illinois at Urbana-Champaign, USA
John H. Kelm , University of Illinois at Urbana-Champaign, USA
pp. 317-326

Using Predictive Modeling for Cross-Program Design Space Exploration in Multicore Systems (Abstract)

Salman Khan , University of Edinburgh, UK
John Cavazos , University of Edinburgh, UK
Polychronis Xekalakis , University of Edinburgh, UK
Marcelo Cintra , University of Edinburgh, UK
pp. 327-338

CacheScouts: Fine-Grain Monitoring of Shared Caches in CMP Platforms (Abstract)

Don Newell , Intel Corporation, USA
Srihari Makineni , Intel Corporation, USA
Li Zhao , Intel Corporation, USA
Ramesh Illikkal , Intel Corporation, USA
Jaideep Moses , Intel Corporation, USA
Ravi Iyer , Intel Corporation, USA
pp. 339-352
Software Track (Session 10): Transactional Memory & Locks

Component-Based Lock Allocation (Abstract)

Richard L. Halpert , McGill University, Canada
Christopher J.F. Pickett , McGill University, Canada
Clark Verbrugge , McGill University, Canada
pp. 353-364

JudoSTM: A Dynamic Binary-Rewriting Approach to Software Transactional Memory (Abstract)

Marek Olszewski , University of Toronto, Canada
Jeremy Cutler , University of Toronto, Canada
J. Gregory Steffan , University of Toronto, Canada
pp. 365-375

The OpenTM Transactional Application Programming Interface (Abstract)

Martin Trautmann , Stanford University, USA
Chi Cao Minh , Stanford University, USA
Woongki Baek , Stanford University, USA
Christos Kozyrakis , Stanford University, USA
Kunle Olukotun , Stanford University, USA
pp. 376-387

A Study of a Transactional Parallel Routing Algorithm (Abstract)

Chris Kirkham , The University of Manchester, UK
Ian Watson , The University of Manchester, UK
Mikel Luján , The University of Manchester, UK
pp. 388-398
Poster Abstracts

Ring Prediction for Non-Uniform Cache Architectures (PDF)

Mary Jane Irwin , The Pennsylvania State University, USA
Mahmut Kandemir , The Pennsylvania State University, USA
Padma Raghavan , The Pennsylvania State University, USA
Sayaka Akioka , The Pennsylvania State University, USA
Feihui Li , The Pennsylvania State University, USA
pp. 401

Source Level Merging of Independent Programs (PDF)

Yosi Ben Asher , Haifa University, Israel
Moshe Yuda , Haifa University, Israel
pp. 402

Studying the impact of synchronization frequency on scheduling tasks with dependencies in heterogeneous systems (PDF)

A.T. Chronopoulos , University of Texas at San Antonio, USA
T. Andronikos , Ionian University, Greece
F.M. Ciorba , University of Athens, Greece
I. Riakiotakis , University of Athens, Greece
G. Papakonstantinou , University of Athens, Greece
pp. 403

Studying Compiler-Microarchitecture Interactions through Interval Analysis (PDF)

Stijn Eyerman , Ghent University, Belgium
Lieven Eeckhout , Ghent University, Belgium
James E. Smith , University of Wisconsin-Madison, USA
pp. 406

FastForward for Efficient Pipeline Parallelism (PDF)

Tipp Moseley , University of Colorado at Boulder, USA
Manish Vachharajani , University of Colorado at Boulder, USA
John Giacomoni , University of Colorado at Boulder, USA
pp. 407

The Automatic Transformation of Linked List Data Structures (PDF)

Sven Groot , Leiden University, The Netherlands
Erwin M. Bakker , Leiden University, The Netherlands
Harry A.G. Wijshoff , Leiden University, The Netherlands
Harmen L.A. van der Spek , Leiden University, The Netherlands
pp. 408

Trace-based Automatic Padding for Locality Improvement with Correlative Data Visualization Interface (PDF)

Thomas Rauber , University of Bayreuth, Germany
Marco Hobbel , University of Bayreuth, Germany
Carsten Scholtes , University of Bayreuth, Germany
pp. 409

A New Parallel Gauss-Seidel Method by Iteration Space Alternate Tiling (PDF)

Jue Wang , University of Science and Technology Beijing, China
Changjun Hu , University of Science and Technology Beijing, China
Jianjiang Li , University of Science and Technology Beijing, China
Jilin Zhang , University of Science and Technology Beijing, China
Liang Ding , University of Science and Technology Beijing, China
pp. 410

Performance Portable Optimizations for Loops Containing Communication Operations (PDF)

Katherine Yelick , University of California at Berkeley, USA
Costin Iancu , Lawrence Berkeley National Laboratory, USA
Wei Chen , University of California at Berkeley, USA
pp. 411

Exploring the Application Behavior Space Using Parameterized Synthetic Benchmarks (PDF)

Ajay M. Joshi , The University of Texas at Austin, USA
Lizy K. John , The University of Texas at Austin, USA
Lieven Eeckhout , Ghent University, Belgium
pp. 412

Studying Asynchronous Shared Memory Computations (PDF)

Simo Juvaste , University of Joensuu, Finland
pp. 413

Fast Track: Supporting Unsafe Optimizations with Software Speculation (PDF)

Chen Ding , University of Rochester, USA
Kirk Kelsey , University of Rochester, USA
Chengliang Zhang , University of Rochester, USA
pp. 414

Hybrid Specialization: A Trade-off Between Static and Dynamic Specialization (PDF)

Henri-Pierre Charles , University of Versailles-Saint-Quentin-en-Yvelines, France
Denis Barthou , University of Versailles-Saint-Quentin-en-Yvelines, France
Minhaj Ahmad Khan , University of Versailles-Saint-Quentin-en-Yvelines, France
pp. 415

Rate-Driven Control of Resizable Caches for Highly Threaded SMT Processors (PDF)

Steve Dropsho , EPFL, Switzerland
Sonia Lopez , Universidad Complutense de Madrid, Spain
Juan Lanchares , Universidad Complutense de Madrid, Spain
Oscar Garnica , Universidad Complutense de Madrid, Spain
David H. Albonesi , Cornell University, USA
pp. 416

Redesigning Parallel Symbolic Computations Packages (PDF)

Marc Frincu , Institute e-Austria Timisoara, Romania
Dana Petcu , Western University of Timisoara, Romania
Alexandru Carstea , Institute e-Austria Timisoara, Romania
Andrei Eckstein , Western University of Timisoara, Romania
Georgiana Macariu , Institute e-Austria Timisoara, Romania
pp. 417

MLP-Aware Dynamic Cache Partitioning (PDF)

Francisco J. Cazorla , Universitat Politecnica de Catalunya, Spain
Mateo Valero , Universitat Politecnica de Catalunya, Spain; Barcelona Supercomputing Center, Spain
Alex Ramirez , Universitat Politecnica de Catalunya, Spain; Barcelona Supercomputing Center, Spain
Miquel Moreto , Universitat Politecnica de Catalunya, Spain
pp. 418

A Lightweight Model for Software Thread-Level Speculation (TLS) (PDF)

Cosmin E. Oancea , University of Cambridge, UK
Alan Mycroft , University of Cambridge, UK
pp. 419

HelperCore_DB: Exploiting Multicore Technology for Databases (PDF)

Kostas Papadopoulos , University of Cyprus, Cyprus
Pedro Trancoso , University of Cyprus, Cyprus
Kyriakos Stavrou , University of Cyprus, Cyprus
pp. 420

Data Structure Exploration of Dynamic Applications (PDF)

L. Papadopoulos , Democritus University of Thrace, Greece
C. Baloukas , Democritus University of Thrace, Greece
N. Voros , Intracom Telecom Solutions, Greece
K. Potamianos , Intracom Telecom Solutions, Greece
D. Soudris , Democritus University of Thrace, Greece
pp. 421

Dynamic Cache Placement with Two-level Mapping to Reduce Conflict Misses (PDF)

Bharadwaj Amrutur , Indian Institute of Science, India
Kaushik Rajan , Indian Institute of Science, India
R. Govindarajan , Indian Institute of Science, India
pp. 422

Drug Design on the Cell BroadBand Engine (PDF)

Xavier Aguilar , Barcelona Supercomputing Center, Spain
Daniel Jimenez , Universitat Politecnica de Catalunya, Spain
Daniel Cabrera , Universitat Politecnica de Catalunya, Spain
Harald Servat , Barcelona Supercomputing Center, Spain
Cecilia Gonzalez , Universitat Politecnica de Catalunya, Spain
pp. 425

Bridging Inputs and Program Dynamic Behavior (PDF)

Feng Mao , The College of William and Mary, USA
Xipeng Shen , The College of William and Mary, USA
pp. 426

Power-Aware Compiler Controllable Chip Multiprocessor (PDF)

Jun Shirako , Waseda University, Japan
Keiji Kimura , Waseda University, Japan
Hiroaki Shikano , Waseda University, Japan; Hitachi, Ltd., Japan
Hironori Kasahara , Waseda University, Japan
Yasutaka Wada , Waseda University, Japan
pp. 427

RSTM: A Relaxed Consistency Software Transactional Memory for Multicores (PDF)

Tushar Kumar , Georgia Institute of Technology, USA
Jaswanth Sreeram , Georgia Institute of Technology, USA
Santosh Pande , Georgia Institute of Technology, USA
Romain Cledat , Georgia Institute of Technology, USA
pp. 428

VB-MT: Design Issues and Performance of the Validation Buffer Microarchitecture for Multithreaded Processors (PDF)

P. Lopez , Universidad Politecnica de Valencia, Spain
J. Sahuquillo , Universidad Politecnica de Valencia, Spain
S. Petit , Universidad Politecnica de Valencia, Spain
R. Ubal , Universidad Politecnica de Valencia, Spain
J. Duato , Universidad Politecnica de Valencia, Spain
pp. 429

A Scalable Low Power Store Queue for Large Instruction Window Processors (PDF)

Rajesh Vivekanandham , Indian Institute of Science, India
R. Govindarajan , Indian Institute of Science, India
pp. 430

Adapting to Intermittent Faults in Future Multicore Systems (PDF)

Koushik Chakraborty , University of Wisconsin, Madison, USA
Gurindar S. Sohi , University of Wisconsin, Madison, USA
Philip M. Wells , University of Wisconsin, Madison, USA
pp. 431

A Phase-Adaptive Approach to Increasing Cache Performance (PDF)

Lambert Schaelicke , Intel Corporation
Sally A. McKee , Cornell University, USA
Matthew A. Watkins , Cornell University, USA
pp. 432

Compiler Optimizations for Fault Tolerance Software Checking (PDF)

Jing Yu , University of Illinois at Urbana-Champaign, USA
Maria Jesus Garzaran , University of Illinois at Urbana-Champaign, USA
pp. 433

Optimizing Bandwidth Constraint through Register Interconnection for Stream Processors (PDF)

Binyu Zang , Fudan University, China
Chuanqi Zhu , Fudan University, China
Tao Bao , Fudan University, China
Weihua Zhang , Fudan University, China
pp. 434
Author Index

Author Index (PDF)

pp. 435