Displaying 1-3 out of 3 total
“Cool” Load Balancing for High Performance Computing Data Centers
Found in: IEEE Transactions on Computers
By Osman Sarood, Phil Miller, Ehsan Totoni, Laxmikant V. Kalé
Issue Date: December 2012
pp. 1752-1764
As we move to exascale machines, both peak power demand and total energy consumption have become prominent challenges. A significant portion of that power and energy consumption is devoted to cooling, which we strive to minimize in this work. We propose a ...
ACR: automatic checkpoint/restart for soft and hard error protection
Found in: Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis (SC '13)
By Nikhil Jain, Xiang Ni, Esteban Meneses, Laxmikant V. Kalé
Issue Date: November 2013
pp. 1-12
As machines increase in scale, many researchers have predicted that failure rates will correspondingly increase. Soft errors do not inhibit execution, but may silently generate incorrect results. Recent trends have shown that soft error rates are increasin...
G-Charm: an adaptive runtime system for message-driven parallel applications on hybrid systems
Found in: Proceedings of the 27th International ACM Conference on Supercomputing (ICS '13)
By Laxmikant V. Kalé, R. Vasudevan, Sathish S. Vadhiyar
Issue Date: June 2013
pp. 349-358
The effective use of GPUs for accelerating applications depends on a number of factors including effective asynchronous use of heterogeneous resources, reducing memory transfer between CPU and GPU, increasing occupancy of GPU kernels, overlapping data tran...