Program (PDF)
Author index (PDF)
A multi-GPU algorithm for communication in neuronal network simulations (Abstract)
Comparing archival policies for Blue Waters (Abstract)
Hybrid algorithms for list ranking and graph connected components (Abstract)
Parallel multiple precision division by a single precision divisor (Abstract)
Scalable clustering using multiple GPUs (Abstract)
Hybrid implementation of error diffusion dithering (Abstract)
Porting irregular reductions on heterogeneous CPU-GPU configurations (Abstract)
Building algorithmically nonstop fault tolerant MPI programs (Abstract)
High-level template for the task-based parallel wavefront pattern (Abstract)
Enabling CUDA acceleration within virtual machines using rCUDA (Abstract)
Parallel implementation of MOPSO on GPU using OpenCL and CUDA (Abstract)
Coordination mechanisms for selfish multi-organization scheduling (Abstract)
Maximizing throughput of jobs with multiple resource requirements (Abstract)
Weighted locality-sensitive scheduling for mitigating noise on multi-core clusters (Abstract)
Scheduling diverse high performance computing systems with the goal of maximizing utilization (Abstract)
A dynamic scheduling framework for emerging heterogeneous systems (Abstract)
GVT algorithms and discrete event dynamics on 129K+ processor cores (Abstract)
Improving graph coloring on distributed-memory parallel computers (Abstract)
Modelling and analyzing the authorization and execution of video workflows (Abstract)
Multi-model prediction for enhancing content locality in elastic server infrastructures (Abstract)
Highly scalable barriers for future high-performance computing clusters (Abstract)
Spectral evolution simulation on leading multi-socket, multicore platforms (Abstract)
Dynamic hosting management of web based applications over clouds (Abstract)
A fast centralized computation routing algorithm for self-configuring NoC systems (Abstract)
Partial globalization of partitioned address spaces for zero-copy communication with shared memory (Abstract)
Increasing the energy efficiency of TLS systems using intermediate checkpointing (Abstract)
A machine learning-based approach for thread mapping on transactional memory applications (Abstract)
Robust thread-level speculation (Abstract)
Implementing a hybrid SRAM / eDRAM NUCA architecture (Abstract)
High performance cache block replication using re-reference probability in CMPs (Abstract)
Adaptive memory power management techniques for HPC workloads (Abstract)
Compute & memory optimizations for high-quality speech recognition on low-end GPU processors (Abstract)
Dynamic selection of tile sizes (Abstract)
The impact of hyper-threading on processor resource utilization in production applications (Abstract)
Optimizing multicore performance with message driven execution: A case study (Abstract)
Reliable and randomized data distribution strategies for large scale storage systems (Abstract)
Supporting computational data model representation with high-performance I/O in parallel netCDF (Abstract)
A multiresolution data model for improving simulation I/O performance (Abstract)