Parallel Processing Symposium, International (1996)
Honolulu, HI
Apr. 15, 1996 to Apr. 19, 1996
ISSN: 1063-7133
ISBN: 0-8186-7255-2
TABLE OF CONTENTS

Reviewers (PDF)

pp. xxiii
Keynote Address - "Can Multithreaded Programming Save Massively Parallel Computing?"
Session 1 - Compiler Optimization

Commutativity Analysis: A Technique for Automatically Parallelizing Pointer-Based Computations (Abstract)

Pedro Diniz , University of California at Santa Barbara
Martin Rinard , University of California at Santa Barbara
pp. 14

Profiling Dependence Vectors for Loop Parallelization (Abstract)

Chung-Ta King , Department of Computer Science National Tsing Hua University
Chuan-Yi Tang , Department of Computer Science National Tsing Hua University
Shaw-Yen Tseng , Department of Computer Science National Tsing Hua University
pp. 23

The Combined Effectiveness of Unimodular Transformations, Tiling, and Software Prefetching (Abstract)

Jacqueline Chame , Computer Science Department, University of Southern California
Sungdo Moon , Computer Science Department, University of Southern California
Daeyeon Park , Computer Science Department, University of Southern California
Weihua Mao , Computer Science Department, University of Southern California
Rafael H. Saavedra , Computer Science Department, University of Southern California
pp. 39
Session 2 - Scientific/Engineering Applications

Ocean Circulation on the Intel Paragon: Modeling and Implementation (Abstract)

Ka-Cheong Leung , The Hong Kong University of Science and Technology
Hsiao-Ming Hsu , National Center for Atmospheric Research
Ishfaq Ahmad , The Hong Kong University of Science and Technology
pp. 47

Performance Modeling and Composition: A Case Study in Cell Simulation (Abstract)

Jun Yang , Computer Science Division, University of California at Berkeley
Katherine Yelick , Computer Science Division, University of California at Berkeley
Steve G. Steinberg , Computer Science Division, University of California at Berkeley
pp. 68
Session 3 - Distributed Memory Systems

A Study of High Performance Communication Mechanism for Multicomputer Systems (Abstract)

Hideki Murayama , Information Systems R&D Division, Hitachi, Ltd.
Takehisa Hayashi , Information Systems R&D Division, Hitachi, Ltd.
Shooichi Murase , Information Systems R&D Division, Hitachi, Ltd.
Hidenori Inouchi , Information Systems R&D Division, Hitachi, Ltd.
Satoshi Yoshizawa , Information Systems R&D Division, Hitachi, Ltd.
Hiroshi Iwamoto , Information Systems R&D Division, Hitachi, Ltd.
Takeshi Aimoto , Information Systems R&D Division, Hitachi, Ltd.
pp. 76

A TeraFLOP Supercomputer in 1996: the ASCI TFLOP System (Abstract)

Stephen Wheat , Intel Corporation, Enterprise Server Group
David Scott , Intel Corporation, Enterprise Server Group
Timothy G. Mattson , Intel Corporation, Enterprise Server Group
pp. 84

Achieving a Balanced Low-Cost Architecture for Mass Storage Management through Multiple Fast Ethernet Channels on the Beowulf Parallel Workstation (Abstract)

Chance Reschke , Center of Excellence in Space Data and Information Sciences
Michael R. Berry , The George Washington University
Daniel Savarese , University of Maryland
Thomas Sterling , Center of Excellence in Space Data and Information Sciences
Donald J. Becker , Center of Excellence in Space Data and Information Sciences
pp. 104

Exploiting the capabilities of communications co-processors (Abstract)

J.M. Ferguson , Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
P.Z. Kolano , Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
C.J. Scheiman , Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
K.E. Schauser , Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
pp. 109

Effects of Multithreading on Data and Workload Distribution for Distributed-Memory Multiprocessors (Abstract)

Namhoon Yoo , University of Southern California
Jean-Luc Gaudiot , University of Southern California
Andrew Sohn , New Jersey Institute of Technology
Mitsuhisa Sato , Electrotechnical Laboratory
pp. 116
Session 4 - Shared Memory Systems

Dag-Consistent Distributed Shared Memory (Abstract)

Matteo Frigo , MIT Laboratory for Computer Science
Robert D. Blumofe , MIT Laboratory for Computer Science
Christopher F. Joerg , MIT Laboratory for Computer Science
Charles E. Leiserson , MIT Laboratory for Computer Science
Keith H. Randall , MIT Laboratory for Computer Science
pp. 132

Categorizing Network Traffic in Update-Based Protocols on Scalable Multiprocessors (Abstract)

Thomas J. LeBlanc , University of Rochester
Ricardo Bianchini , Universidade Federal do Rio de Janeiro
Jack E. Veenstra , Silicon Graphics
pp. 142

Implementing the Data Diffusion Machine using Crossbar Routers (Abstract)

Paul W.A. Stallard , University of Bristol, UK
David H.D. Warren , University of Bristol, UK
Henk L. Muller , University of Bristol, UK
pp. 152
Session 5 - Algorithms

Approximate Compaction and Padded-Sorting on Exclusive Write PRAMs (Abstract)

Tomasz Wierzbicki , Uniwersytet Wroclawski
Miroslaw Kutylowski , Universitat-GH Paderborn
pp. 174

A parallel solution to the extended set union problem with unlimited backtracking (Abstract)

M.C. Pinotti , Istituto di Elaborazione dell'Inf., CNR, Pisa, Italy
V.A. Crupi , Istituto di Elaborazione dell'Inf., CNR, Pisa, Italy
S.K. Das , Istituto di Elaborazione dell'Inf., CNR, Pisa, Italy
pp. 182

A Parallel Algorithm for Minimization of Finite Automata (Abstract)

X. Xiong , University of Rhode Island
B. Ravikumar , University of Rhode Island
pp. 187

A randomized algorithm for Voronoi diagram of line segments on coarse grained multiprocessors (Abstract)

Xiaotie Deng , Dept. of Comput. Sci., York Univ., North York, Ont., Canada
Binhai Zhu , Dept. of Comput. Sci., York Univ., North York, Ont., Canada
pp. 192

Constructing the Spanners of Graphs in Parallel (Abstract)

Weifa Liang , Australian National University
Richard P. Brent , Australian National University
pp. 206
Session 6 - Programming Languages

Nested Parallel Call Optimization (Abstract)

Enrico Pontelli , New Mexico State University
Gopal Gupta , New Mexico State University
pp. 225

The Parallel Break Construct, or How to Kill an Activity Tree (Abstract)

Yair I. Friedman , Institute of Computer Science
Dror G. Feitelson , Institute of Computer Science
Iaakov Exman , Institute of Computer Science
pp. 230

Support for Extensibility and Reusability in a Concurrent Object-Oriented Programming Language (Abstract)

J.C. Browne , Department of Computer Sciences
Raju Pandey , Computer Science Department
pp. 241
Session 7 - Communication I

Modeling the Communication Performance of the IBM SP2 (Abstract)

Edward S. Davidson , Advanced Computer Architecture Laboratory
Gheith A. Abandah , Advanced Computer Architecture Laboratory
pp. 249

The Effects of Network Contention on Processor Allocation Strategies (Abstract)

Lionel M. Ni , Michigan State University
Sherry Q. Moore , Michigan State University
pp. 268

ServerNet deadlock avoidance and fractahedral topologies (Abstract)

R. Horst , Tandem Comput. Inc., Cupertino, CA, USA
pp. 274

Analysis of memory interference in buffered multiprocessor systems in presence of hot spots and favorite memories (Abstract)

S.K. Sen , Dept. of Comput. Sci., North Texas Univ., Denton, TX, USA
S.K. Das , Dept. of Comput. Sci., North Texas Univ., Denton, TX, USA
pp. 281
Session 8 - Implementation of Primitive Operations

Practical Parallel Algorithms for Dynamic Data Redistribution, Median Finding, and Selection (Abstract)

David A. Bader , Institute for Advanced Computer Studies, and Department of Electrical Engineering
Joseph JaJa , Institute for Advanced Computer Studies, and Department of Electrical Engineering
pp. 292

Practical algorithms for selection on coarse-grained parallel computers (Abstract)

I. Al-Furaih , Sch. of Comput. & Inf. Sci., Syracuse Univ., NY, USA
S. Aluru , Sch. of Comput. & Inf. Sci., Syracuse Univ., NY, USA
S. Goil , Sch. of Comput. & Inf. Sci., Syracuse Univ., NY, USA
S. Ranka , Sch. of Comput. & Inf. Sci., Syracuse Univ., NY, USA
pp. 309

Parallel Multilevel Graph Partitioning (Abstract)

Vipin Kumar , University of Minnesota, Department of Computer Science
George Karypis , University of Minnesota, Department of Computer Science
pp. 314

PACK/UNPACK on Coarse-Grained Distributed Memory Parallel Machines (Abstract)

Seungjo Bae , Syracuse University
Sanjay Ranka , University of Florida
pp. 320
Session 9 - Resource Allocation and Management

Resource Placement in Torus-Based Networks (Abstract)

Bella Bose , Oregon State University
Myung M. Bae , Oregon State University
pp. 327

Simultaneous Compression of Makespan and Number of Processors Using CRP (Abstract)

Yiqun Ge , University of Hawaii at Manoa (UHM)
David Y. Y. Yun , University of Hawaii at Manoa (UHM)
pp. 332

Implementation of Scalable Blocking Locks using an Adaptive Thread Scheduler (Abstract)

Karsten Schwan , Georgia Institute of Technology
Bodhisattwa Mukherjee , T. J. Watson Research Center
pp. 339

Hector: Automated Task Allocation for MPI (Abstract)

Bjorn Heckel , NSF Engineering Research Center for Computational Field Simulation
Jonathan Robinson , NSF Engineering Research Center for Computational Field Simulation
Samuel H. Russ , NSF Engineering Research Center for Computational Field Simulation
Brian Flachs , NSF Engineering Research Center for Computational Field Simulation
pp. 344
Keynote Address - "MPPs versus Clusters"
Session 10 - Communication II

Software support for virtual memory-mapped communication (Abstract)

E.W. Felten , Dept. of Comput. Sci., Princeton Univ., NJ, USA
Kai Li , Dept. of Comput. Sci., Princeton Univ., NJ, USA
L. Iftode , Dept. of Comput. Sci., Princeton Univ., NJ, USA
C. Dubnicki , Dept. of Comput. Sci., Princeton Univ., NJ, USA
pp. 372

A Comparative Study of Methods for Time-Deterministic Message Delivery in a Multiprocessor Architecture (Abstract)

Jan Jonsson , Chalmers University of Technology
Jonas Vasell , Chalmers University of Technology
pp. 392

ECO: Efficient Collective Operations for Communication on Heterogeneous Networks (Abstract)

Adam Beguelin , Carnegie Mellon Computer Science
Bruce Lowekamp , Carnegie Mellon Computer Science
pp. 399

Software techniques for improving MPP bulk-transfer performance (Abstract)

A. Fox , California Univ., Berkeley, CA, USA
E.A. Brewer , California Univ., Berkeley, CA, USA
A. Schuett , California Univ., Berkeley, CA, USA
P. Gauthier , California Univ., Berkeley, CA, USA
pp. 406
Session 11 - Algorithms: Implementation

Parallel Algorithms for Image Enhancement and Segmentation by Region Growing with an Experimental Study (Abstract)

David A. Bader , Institute for Advanced Computer Studies
Larry S. Davis , Institute for Advanced Computer Studies
David Harwood , Institute for Advanced Computer Studies
Joseph JaJa , Institute for Advanced Computer Studies
pp. 414

The chessboard distance transform and the medial axis transform are interchangeable (Abstract)

Yu-Hua Lee , Dept. of Electr. Eng., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
Shi-Jinn Horng , Dept. of Electr. Eng., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
pp. 424

Study of Scalable Declustering Algorithms for Parallel Grid Files (Abstract)

Anurag Acharya , UMIACS and Dept. of Computer Science
Bongki Moon , UMIACS and Dept. of Computer Science
Joel Saltz , UMIACS and Dept. of Computer Science
pp. 434
Session 12 - Performance Evaluation and Prediction

The Relation of Scalability and Execution Time (Abstract)

Xian-He Sun , Louisiana State University
pp. 457

Maximizing Speedup through Self-Tuning of Processor Allocation (Abstract)

John Zahorjan , Department of Computer Science and Engineering
Thu D. Nguyen , Department of Computer Science and Engineering
Raj Vaswani , Department of Computer Science and Engineering
pp. 463

Profiling Optimized Code: a Profiling System for an HPF Compiler (Abstract)

Tatsuya Shindo , High Performance Computing Group
Shaun Kaneshiro , High Performance Computing Group
pp. 469

Toward Symbolic Performance Prediction of Parallel Programs (Abstract)

Thomas Fahringer , Institute for Software Technology and Parallel Systems
pp. 474
Session-I: Parallel Architectures - Implementation, Programming, and Performance

Overview of IBM System/390 Parallel Sysplex - A Commercial Parallel Processing System (Abstract)

Jeffrey M. Nick , IBM System/390 Division
Nicholas S. Bowen , IBM Thomas J. Watson Research Center
Jen-Yao Chung , IBM Thomas J. Watson Research Center
pp. 488
Session-II: Networking and Distributed Computing

Performance Modeling of ServerNet(TM) Topologies (Abstract)

B. Horst , Tandem Computers Incorporated
W. Watson , Tandem Computers Incorporated
C. Cunningham , Texas A&M University
D. Avresky , Texas A&M University
L. Young , Tandem Computers Incorporated
R. Wilkinson , Texas A&M University
D. Jewett , Tandem Computers Incorporated
pp. 518
Session 13 - Synchronization, Virtual Memory, and Runtime System Support

Tulip: A Portable Run-Time System for Object-Parallel Systems (Abstract)

Dennis Gannon , Computer Science Department, Indiana University
Peter Beckman , Computer Science Department, Indiana University
pp. 532

A Virtual Memory Model for Parallel Supercomputers (Abstract)

Veronica L. M. Reis , University of California, Irvine
Isaac D. Scherson , University of California, Irvine
pp. 537
Session 14 - Arrays and Hypercubes

Determining Asynchronous Acyclic Pipeline Execution Times (Abstract)

Jeanne Ferrante , University of California, San Diego
Val Donaldson , University of California, San Diego
pp. 568

On Some Global Operations in Faulty SIMD Hypercubes (Abstract)

C. S. Raghavendra , School of Electrical Engineering and Computer Science
Amit Sengupta , School of Electrical Engineering and Computer Science
pp. 579

An Improved Approximation Algorithm for Scheduling Task Trees on Linear Arrays (Abstract)

Hari Krishna Tadepalli , Department of Computer and Information Sciences
Errol L. Lloyd , Department of Computer and Information Sciences
pp. 584
Session 15 - Mathematical Methods

Jacobi-like Algorithms for Eigenvalue Decomposition of a Real Normal Matrix Using Real Arithmetic (Abstract)

R. P. Brent , The Australian National University
B. B. Zhou , The Australian National University
pp. 593

An Element-Based Concurrent Partitioner for Unstructured Finite Element Meshes (Abstract)

Hong Q. Ding , Jet Propulsion Laboratory, MS 168-522
Robert D. Ferraro , Jet Propulsion Laboratory, MS 168-522
pp. 601

Analysis of the Numerical Effects of Parallelism on a Parallel Genetic Algorithm (Abstract)

R.K. Belew , Sandia National Labs
S. Baden , Sandia National Labs
W.E. Hart , Sandia National Labs
S. Kohn , Sandia National Labs
pp. 606

Compiling MATLAB Programs to ScaLAPACK: Exploiting Task and Data Parallelism (Abstract)

Eugene W. Hodges IV , University of Illinois at Urbana-Champaign
Prithviraj Banerjee , University of Illinois at Urbana-Champaign
Shankar Ramaswamy , University of Illinois at Urbana-Champaign
pp. 613

Mapping Techniques for Parallel Evaluation of Chains of Recurrences (Abstract)

E.V. Zima , Moscow State University, Moscow
Karthi R. Vadivelu , University of Iowa, Iowa City
Thomas L. Casavant , University of Iowa, Iowa City
pp. 620

Performance of Asynchronous Linear Iterations with Random Delays (Abstract)

Michel Dubois , University of Southern California
Adrian C. Moga , University of Southern California
pp. 625
Panel -- For a Massive Number of Massively Parallel Machines
Keynote Address - "Clusters for Commercial Computing: An Invisible Architecture"
Session 16 - Interconnection Networks

Generic Methodologies for Deadlock-Free Routing (Abstract)

Hyunmin Park , Myongji University
Dharma P. Agrawal , North Carolina State University
pp. 638

Partitionability of the multistage interconnection networks (Abstract)

Yeimkuan Chang , Dept. of Inf. Manage., Chun-hua Polytech. Inst., Hsinchu, Taiwan
pp. 644

On embedding various networks into the hypercube using matrix transformations (Abstract)

S.W. Song , Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Kowloon, Hong Kong
M. Hamdi , Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Kowloon, Hong Kong
pp. 650

An Optical Interconnect Model for k-ary n-cube Wormhole Networks (Abstract)

Timothy Mark Pinkston , SMART Interconnects Group
Mongkol Raksapatcharawong , SMART Interconnects Group
pp. 666
Session 17 - Bus-Based Algorithms

Fault-Tolerant Multiple Bus Networks for Fan-in Algorithms (Abstract)

S. Nadella , SoftSol Resources, Inc.
R. Vaidyanathan , Louisiana State University
pp. 674

Parallel Algorithms using Unreliable Broadcasts (Abstract)

John Matthews , University of California at Davis, Davis, CA
Charles Martel , University of California at Davis, Davis, CA
pp. 692

Efficient Algorithms for the Hough Transform on Arrays with Reconfigurable Optical Buses (Abstract)

Sandy Pavel , Department of Computing and Information Science
Selim G. Akl , Department of Computing and Information Science
pp. 697
Session 18 - Image and Radar Processing

Some image processing algorithms on a RAP with wider bus networks (Abstract)

Shi-Jinn Horng , Dept. of Electr. Eng., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
Yu-Hua Lee , Dept. of Electr. Eng., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
Shung-Shing Lee , Dept. of Electr. Eng., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
Horng-Ren Tsai , Dept. of Electr. Eng., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
pp. 708

Parallel Synthetic Aperture Radar Processing on Workstation Networks (Abstract)

Peter G. Meisl , University of British Columbia
Mabo R. Ito , University of British Columbia
Ian G. Cumming , University of British Columbia
pp. 716

Space-Time Adaptive Processing on the Mesh Synchronous Processor (Abstract)

Ken Teitelbaum , MIT Lincoln Laboratory
Janice S. McMahon , MIT Lincoln Laboratory
pp. 734

An Experimental Study of Input/Output Characteristics of NASA Earth and Space Sciences Applications (Abstract)

Michael R. Berry , Department of Electrical Engineering and Computer Science
Tarek A. El-Ghazawi , Department of Electrical Engineering and Computer Science
pp. 741
Session 19 - Special-Purpose Applications

Designing Adaptable Real-Time Fault-Tolerant Parallel Systems (Abstract)

Celio Estevan Moron , Universidade Federal de Sao Carlos
pp. 754
Session 20 - Communication III

Broadcasting Multiple Messages in the Multiport Model (Abstract)

Amotz Bar-Noy , Tel Aviv University
Ching-Tien Ho , IBM Almaden Research Center
pp. 781

The Necessary Conditions for Clos-Type Nonblocking Multicast Networks (Abstract)

Yuanyuan Yang , University of Vermont, Burlington, VT 05405
Gerald M. Masson , Johns Hopkins University, Baltimore, MD 21218
pp. 789

A Class of Interconnection Networks for Multicasting (Abstract)

Yuanyuan Yang , University of Vermont, Burlington, VT 05405
pp. 796

Performance Prediction of PVM Programs (Abstract)

Michael R. Steed , Brigham Young University
Mark J. Clement , Brigham Young University
pp. 803

Algorithms for All-to-All Personalized Exchange in 2D and 3D Tori (Abstract)

Young-Joo Suh , Georgia Institute of Technology
Sudhakar Yalamanchili , Georgia Institute of Technology
pp. 808

Generalized theory for deadlock-free adaptive wormhole routing and its application to Disha Concurrent (Abstract)

J. Duato , Pyramid Technol. Corp., San Jose, CA, USA
T.M. Pinkston , Pyramid Technol. Corp., San Jose, CA, USA
K.V. Anjan , Pyramid Technol. Corp., San Jose, CA, USA
pp. 815
Session 21 - Clusters and Domain Decomposition

Efficient Run-time Support for Irregular Task Computations with Mixed Granularities (Abstract)

Tao Yang , University of California
Cong Fu , University of California
pp. 823

Native ATM application programmer interface testbed for cluster-based computing (Abstract)

T.M. Carrozzi , Dept. of Electr. & Comput. Eng., State Univ. of New York, Buffalo, NY, USA
A. Xin Chen , Dept. of Electr. & Comput. Eng., State Univ. of New York, Buffalo, NY, USA
F.A. Pellegrino , Dept. of Electr. & Comput. Eng., State Univ. of New York, Buffalo, NY, USA
P.W. Dowd , Dept. of Electr. & Comput. Eng., State Univ. of New York, Buffalo, NY, USA
pp. 843

SWEB: Towards a Scalable World Wide Web Server on Multicomputers (Abstract)

Daniel Andresen , University of California, Santa Barbara, CA, USA
Vegard Holmedahl , University of California, Santa Barbara, CA, USA
Tao Yang , University of California, Santa Barbara, CA, USA
Oscar Ibarra , University of California, Santa Barbara, CA, USA
pp. 850

Parallel Implementations of Irregular Problems using High-level Actor Language (Abstract)

R. B. Panwar , IBM Santa Teresa Labs
G. A. Agha , University of Illinois at Urbana-Champaign
W. Kim , IBM Santa Teresa Labs
pp. 857

Author Index (PDF)

pp. 899