A Parallel Software Infrastructure for Structured Adaptive Mesh Methods
Found in: SC Conference
By Scott R. Kohn, Scott B. Baden
Issue Date:December 1995
pp. 36
Structured adaptive mesh algorithms dynamically allocate computational resources to accurately resolve interesting portions of a numerical calculation. Such methods are difficult to implement and parallelize because they rely on dynamic, irregular data str...
SCALLOP: A Highly Scalable Parallel Poisson Solver in Three Dimensions
Found in: SC Conference
By Gregory T. Balls, Scott B. Baden, Phillip Colella
Issue Date:November 2003
pp. 23
SCALLOP is a highly scalable solver and library for elliptic partial differential equations on regular block-structured domains. SCALLOP avoids high communication overheads algorithmically by taking advantage of the locality properties inherent to solution...
A Large Scale Monte Carlo Simulator for Cellular Microphysiology
Found in: Parallel and Distributed Processing Symposium, International
By Gregory T. Balls, Scott B. Baden, Tilman Kispersky, Thomas M. Bartol, Terrence J. Sejnowski
Issue Date:April 2004
pp. 42a
Biological structures are extremely complex at the cellular level. The MCell project has been highly successful in simulating the microphysiology of systems of modest size, but many larger problems require too much storage and computation time to be simula...
A Programming Methodology for Dual-Tier Multicomputers
Found in: IEEE Transactions on Software Engineering
By Scott B. Baden, Stephen J. Fink
Issue Date:March 2000
pp. 212-226
Hierarchically organized ensembles of shared memory multiprocessors possess a richer and more complex model of locality than previous generation multicomputers with single processor nodes. These dual-tier comp...
Latency Hiding and Performance Tuning with Graph-Based Execution
Found in: Data-Flow Execution Models for Extreme Scale Computing, Workshop on
By Pietro Cicotti, Scott B. Baden
Issue Date:October 2011
pp. 28-37
In current practice, scientific programmers and HPC users are required to develop code that exposes a high degree of parallelism, exhibits high locality, dynamically adapts to the available resources, and hides communication latency. Hiding communication l...
A Scalable Parallel Poisson Solver in Three Dimensions with Infinite-Domain Boundary Conditions
Found in: Parallel Processing Workshops, International Conference on
By Peter McCorquodale, Phillip Colella, Gregory T. Balls, Scott B. Baden
Issue Date:June 2005
pp. 163-172
We present an elliptic free space solver that offers vastly improved performance over a previous variant of the algorithm. We currently scale up to 1024 processors of an IBM SP system, and we are planning to port the solver to Blue Gene/L. The solver emplo...
Dynamic Partitioning of Non-Uniform Structured Workloads with Spacefilling Curves
Found in: IEEE Transactions on Parallel and Distributed Systems
By John R. Pilkington, Scott B. Baden
Issue Date:March 1996
pp. 288-300
We discuss Inverse Spacefilling Partitioning (ISP), a partitioning strategy for non-uniform scientific computations running on distributed memory MIMD parallel computers. We consider the case of a dynamic workload distr...
Parallel Cluster Identification for Multidimensional Lattices
Found in: IEEE Transactions on Parallel and Distributed Systems
By Stephen J. Fink, Craig Huston, Scott B. Baden, Karl Jansen
Issue Date:November 1997
pp. 1089-1097
The cluster identification problem is a variant of connected component labeling that arises in cluster algorithms for spin models in statistical physics. We present a multidimensional version of Belkhale and Banerjee's ...
Accelerating a 3D Finite-Difference Earthquake Simulation with a C-to-CUDA Translator
Found in: Computing in Science & Engineering
By Didem Unat, Jun Zhou, Yifeng Cui, Scott B. Baden, Xing Cai
Issue Date:May 2012
pp. 48-59
GPUs provide impressive computing power, but GPU programming can be challenging. Here, an experience in porting real-world earthquake code to Nvidia GPUs is described. Specifically, an annotation-based programming model, called Mint, and its accompanying s...
Redefining the Role of the CPU in the Era of CPU-GPU Integration
Found in: IEEE Micro
By Manish Arora, Siddhartha Nath, Subhra Mazumdar, Scott B. Baden, Dean M. Tullsen
Issue Date:November 2012
pp. 4-16
In an integrated CPU-GPU system, the CPU executes code that is profoundly different than in past CPU-only environments. This new code's characteristics should drive future CPU design and architecture. Post-GPU code has lower instruction-level parallelism, ...
The Saaz Framework for Turbulent Flow Queries
Found in: eScience, IEEE International Conference on
By Alden King, Eric Arobone, Scott B. Baden, Sutanu Sarkar
Issue Date:December 2011
pp. 325-331
In many respects, numerical simulations involving solutions to partial differential equations have replaced physical experimentation. However, few tools are available to sift through the deluge of data. We present Saaz, a query framework to analyze the sim...
Accelerating Viola-Jones Face Detection to FPGA-Level Using GPUs
Found in: Field-Programmable Custom Computing Machines, Annual IEEE Symposium on
By Daniel Hefenbrock, Jason Oberg, Nhat Tan Nguyen Thanh, Ryan Kastner, Scott B. Baden
Issue Date:May 2010
pp. 11-18
Face detection is an important aspect for biometrics, video surveillance and human computer interaction. We present a multi-GPU implementation of the Viola-Jones face detection algorithm that meets the performance of the fastest known FPGA implementation. ...
An Adaptive Sub-sampling Method for In-memory Compression of Scientific Data
Found in: Data Compression Conference
By Didem Unat, Theodore Hromadka III, Scott B. Baden
Issue Date:March 2009
pp. 262-271
Advancing supercomputer performance through interconnection topology synthesis
Found in: Computer-Aided Design, International Conference on
By Yi Zhu, Michael Taylor, Scott B. Baden, Chung-Kuan Cheng
Issue Date:November 2008
pp. 555-558
In today’s many-core era, the interconnection networks have been the key factor that dominates the performance of a computer system. In this paper, we propose a design flow to discover the best topology in terms of the communication latency and physical co...
Communication overlap in multi-tier parallel algorithms
Found in: SC Conference
By Scott B. Baden, Stephen J. Fink
Issue Date:November 1998
pp. 33
Hierarchically organized multicomputers such as SMP clusters offer new opportunities and new challenges for high-performance computation, but realizing their full potential remains a formidable task. We present a hierarchical model of communication targete...
Bamboo -- Translating MPI applications to a latency-tolerant, data-driven form
Found in: 2012 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
By Tan Nguyen, Pietro Cicotti, Eric Bylaska, Dan Quinlan, Scott B. Baden
Issue Date:November 2012
pp. 1-11
We present Bamboo, a custom source-to-source translator that transforms MPI C source into a data-driven form that automatically overlaps communication with available computation. Running on up to 98304 processors of NERSC's Hopper system, we observe that B...
A parallel software infrastructure for structured adaptive mesh methods
Found in: Proceedings of the 1995 ACM/IEEE conference on Supercomputing (CDROM) (Supercomputing '95)
By Scott B. Baden, Scott R. Kohn
Issue Date:December 1995
pp. 36-es
Structured adaptive mesh algorithms dynamically allocate computational resources to accurately resolve interesting portions of a numerical calculation. Such methods are difficult to implement and parallelize because they rely on dynamic, irregular data str...
Programming language requirements for the next millennium
Found in: ACM Computing Surveys (CSUR)
By Richard Wolski, Scott B. Baden, Scott R. Kohn, Stephen J. Fink, William G. Griswold
Issue Date:December 1996
pp. 194-es
Floating-point divide and square-root operations are essential to many scientific and engineering applications, and are required in all computer systems that support the IEEE floating-point standard. Yet many current microprocessors provide only weak supp...
Mint: realizing CUDA performance in 3D stencil methods with annotated C
Found in: Proceedings of the international conference on Supercomputing (ICS '11)
By Didem Unat, Scott B. Baden, Xing Cai
Issue Date:May 2011
pp. 214-224
We present Mint, a programming model that enables the non-expert to enjoy the performance benefits of hand coded CUDA without becoming entangled in the details. Mint targets stencil methods, which are an important class of scientific applications. We have ...