Toward Runtime Power Management of Exascale Networks by on/off Control of Links
Found in: 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW)
By Ehsan Totoni, Nikhil Jain, Laxmikant V. Kale
Issue Date: May 2013
pp. 915-922
Higher radix networks, such as high-dimensional tori and multi-level directly connected networks, are being used for supercomputers as they become larger but need lower diameter. These networks have more resources (e.g. links) in order to provide good perf...
 
“Cool” Load Balancing for High Performance Computing Data Centers
Found in: IEEE Transactions on Computers
By Osman Sarood, Phil Miller, Ehsan Totoni, Laxmikant V. Kalé
Issue Date: December 2012
pp. 1752-1764
As we move to exascale machines, both peak power demand and total energy consumption have become prominent challenges. A significant portion of that power and energy consumption is devoted to cooling, which we strive to minimize in this work. We propose a ...
 
Comparing the power and performance of Intel's SCC to state-of-the-art CPUs and GPUs
Found in: Performance Analysis of Systems and Software, IEEE International Symposium on
By Ehsan Totoni, Babak Behzad, Swapnil Ghike, Josep Torrellas
Issue Date: April 2012
pp. 78-87
Power dissipation and energy consumption are becoming increasingly important architectural design constraints in different types of computers, from embedded systems to large-scale supercomputers. To continue the scaling of performance, it is essential that...
 
Simulation-Based Performance Analysis and Tuning for a Two-Level Directly Connected System
Found in: Parallel and Distributed Systems, International Conference on
By Ehsan Totoni, Abhinav Bhatele, Eric J. Bohm, Nikhil Jain, Celso L. Mendes, Ryan M. Mokos, Gengbin Zheng, Laxmikant V. Kale
Issue Date: December 2011
pp. 340-347
Hardware and software co-design is becoming increasingly important due to complexities in supercomputing architectures. Simulating applications before there is access to the real hardware can assist machine architects in making better design decisions that...
 
Easy, fast, and energy-efficient object detection on heterogeneous on-chip architectures
Found in: ACM Transactions on Architecture and Code Optimization (TACO)
By Ehsan Totoni, María Jesús Garzarán, Mert Dikmen
Issue Date: December 2013
pp. 1-25
We optimize a visual object detection application (that uses Vision Video Library kernels) and show that OpenCL is a unified programming paradigm that can provide high performance when running on the Ivy Bridge heterogeneous on-chip architecture. We evalua...
     
ACM SRC poster: optimizing all-to-all algorithm for PERCS network using simulation
Found in: Proceedings of the 2011 companion on High Performance Computing Networking, Storage and Analysis Companion (SC '11 Companion)
By Ehsan Totoni, Laxmikant V. Kale
Issue Date: November 2011
pp. 123-124
Communication algorithms play a crucial role in the performance of large-scale parallel systems. They are implemented in runtime systems and used in most parallel applications as a critical component. As vendors are willing to design new custom networks wi...
     