2014 21st International Conference on High Performance Computing (HiPC) (2014)
Goa, India
Dec. 17, 2014 to Dec. 20, 2014
ISBN: 978-1-4799-5975-4
TABLE OF CONTENTS

Optimizing the performance of parallel applications on a 5D torus via task mapping (Abstract)

Abhinav Bhatele , Lawrence Livermore National Laboratory, Livermore, California 94551 USA
Nikhil Jain , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801 USA
Katherine E. Isaacs , Department of Computer Science, University of California, Davis, California 95616 USA
Ronak Buch , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801 USA
Todd Gamblin , Lawrence Livermore National Laboratory, Livermore, California 94551 USA
Steven H. Langer , Lawrence Livermore National Laboratory, Livermore, California 94551 USA
Laxmikant V. Kale , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801 USA
pp. 1-10

Balancing context switch penalty and response time with elastic time slicing (Abstract)

Nagakishore Jammula , Georgia Institute of Technology, Atlanta, Georgia, USA
Moinuddin Qureshi , Georgia Institute of Technology, Atlanta, Georgia, USA
Ada Gavrilovska , Georgia Institute of Technology, Atlanta, Georgia, USA
Jongman Kim , Georgia Institute of Technology, Atlanta, Georgia, USA
pp. 1-10

Scaling graph community detection on the Tilera many-core architecture (Abstract)

Daniel Chavarria-Miranda , High Performance Computing, Pacific Northwest National Laboratory, Richland, WA
Mahantesh Halappanavar , High Performance Computing, Pacific Northwest National Laboratory, Richland, WA
Ananth Kalyanaraman , School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA
pp. 1-11

Matrix-matrix multiplication on a large register file architecture with indirection (Abstract)

Dheeraj Sreedhar , IBM Research, Bangalore, India
J. H. Derby , IBM Research, Raleigh, NC
R. K. Montoye , IBM Research, Yorktown Heights, NY
C. L. Johnson , IBM Research, Yorktown Heights, NY
pp. 1-10

TriKon: A hypervisor aware manycore processor (Abstract)

Rohan Bhalla , Computer Science Department, Indian Institute of Technology, Hauz Khas, New Delhi, India
Prathmesh Kallurkar , Computer Science Department, Indian Institute of Technology, Hauz Khas, New Delhi, India
Nitin Gupta , Amazon Development Centre, Bangalore, India
Smruti R. Sarangi , Computer Science Department, Indian Institute of Technology, Hauz Khas, New Delhi, India
pp. 1-10

Optical overlay NUCA: A high speed substrate for shared L2 caches (Abstract)

Eldhose Peter , Computer Science and Engineering, Indian Institute of Technology Delhi, New Delhi, India
Anuj Arora , Computer Science and Engineering, Indian Institute of Technology Delhi, New Delhi, India
Akriti Bagaria , Computer Science and Engineering, Indian Institute of Technology Delhi, New Delhi, India
Smruti R Sarangi , Computer Science and Engineering, Indian Institute of Technology Delhi, New Delhi, India
pp. 1-10

On the suitability of MPI as a PGAS runtime (Abstract)

Jeff Daily , Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
Abhinav Vishnu , Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
Bruce Palmer , Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
Hubertus van Dam , Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
Darren Kerbyson , Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
pp. 1-10

Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters (Abstract)

Rong Shi , Department of Computer Science and Engineering, The Ohio State University
Sreeram Potluri , Department of Computer Science and Engineering, The Ohio State University
Khaled Hamidouche , Department of Computer Science and Engineering, The Ohio State University
Jonathan Perkins , Department of Computer Science and Engineering, The Ohio State University
Mingzhe Li , Department of Computer Science and Engineering, The Ohio State University
Davide Rossetti , NVIDIA Corporation
Dhabaleswar K. Panda , Department of Computer Science and Engineering, The Ohio State University
pp. 1-10

Combining HoL-blocking avoidance and differentiated services in high-speed interconnects (Abstract)

Pedro Yebenes , Dept. of Computing Systems, University of Castilla-La Mancha, Spain
Jesus Escudero-Sahuquillo , Dept. of Computing Systems, University of Castilla-La Mancha, Spain
Crispin Gomez , Dept. of Computing Systems, University of Castilla-La Mancha, Spain
Pedro J. Garcia , Dept. of Computing Systems, University of Castilla-La Mancha, Spain
Francisco J. Alfaro , Dept. of Computing Systems, University of Castilla-La Mancha, Spain
Francisco J. Quiles , Dept. of Computing Systems, University of Castilla-La Mancha, Spain
Jose Duato , Dept. of Computer Engineering, Universitat Politècnica de València, Spain
pp. 1-10

A high performance broadcast design with hardware multicast and GPUDirect RDMA for streaming applications on InfiniBand clusters (Abstract)

A. Venkatesh , Department of Computer Science and Engineering, The Ohio State University
H. Subramoni , Department of Computer Science and Engineering, The Ohio State University
K. Hamidouche , Department of Computer Science and Engineering, The Ohio State University
Dhabaleswar K. Panda , Department of Computer Science and Engineering, The Ohio State University
pp. 1-10

High performance MPI library over SR-IOV enabled InfiniBand clusters (Abstract)

Jie Zhang , Department of Computer Science and Engineering, The Ohio State University
Xiaoyi Lu , Department of Computer Science and Engineering, The Ohio State University
Jithin Jose , Department of Computer Science and Engineering, The Ohio State University
Mingzhe Li , Department of Computer Science and Engineering, The Ohio State University
Rong Shi , Department of Computer Science and Engineering, The Ohio State University
Dhabaleswar K. Panda , Department of Computer Science and Engineering, The Ohio State University
pp. 1-10

DRIVE: Using implicit caching hints to achieve disk I/O reduction in virtualized environments (Abstract)

Sujesha Sudevalayam , Department of Computer Science and Engineering, Indian Institute of Technology Bombay
Purushottam Kulkarni , Department of Computer Science and Engineering, Indian Institute of Technology Bombay
Rahul Balani , IBM India Research Lab
Akshat Verma , IBM India Research Lab
pp. 1-10

An improved recursive graph bipartitioning algorithm for well balanced domain decomposition (Abstract)

Astrid Casadei , University of Bordeaux Inria, CNRS (LaBRI UMR 5800), Bordeaux, France
Pierre Ramet , University of Bordeaux Inria, CNRS (LaBRI UMR 5800), Bordeaux, France
Jean Roman , Inria, Bordeaux Institute of Technology (IPB), CNRS (LaBRI UMR 5800), Bordeaux, France
pp. 1-10

Coupling-aware graph partitioning algorithms: Preliminary study (Abstract)

Maria Predari , Univ. Bordeaux, LaBRI, UMR 5800, F-33400 Talence, France
Aurelien Esnard , Univ. Bordeaux, LaBRI, UMR 5800, F-33400 Talence, France
pp. 1-10

Reducing elimination tree height for parallel LU factorization of sparse unsymmetric matrices (Abstract)

Enver Kayaaslan , INRIA and LIP (UMR 5668: ENS Lyon, CNRS, UCBL, Université de Lyon, INRIA), 46 allee d'Italie, 69364, Lyon, France
Bora Ucar , CNRS and LIP (UMR 5668: ENS Lyon, CNRS, UCBL, Université de Lyon, INRIA), 46 allee d'Italie, 69364, Lyon, France
pp. 1-10

Analysis and tuning of libtensor framework on multicore architectures (Abstract)

Khaled Z. Ibrahim , Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Samuel W. Williams , Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Evgeny Epifanovsky , Department of Chemistry, University of Southern California, Los Angeles, CA, USA
Anna I. Krylov , Department of Chemistry, University of Southern California, Los Angeles, CA, USA
pp. 1-10

A multilevel compressed sparse row format for efficient sparse computations on multicore processors (Abstract)

Humayun Kabir , Department of Computer Science & Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802
Joshua Dennis Booth , Department of Computer Science & Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802
Padma Raghavan , Department of Computer Science & Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802
pp. 1-10

Optimization of scan algorithms on multi- and many-core processors (Abstract)

Qiao Sun , Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Chao Yang , Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
pp. 1-10

Performance evaluation of multi core systems for high throughput medical applications involving model predictive control (Abstract)

Madhurima Pore , Impact Laboratory, Arizona State University, Tempe, AZ
Ayan Banerjee , Impact Laboratory, Arizona State University, Tempe, AZ
Sandeep K. S. Gupta , Impact Laboratory, Arizona State University, Tempe, AZ
pp. 1-10

Interface for heterogeneous kernels: A framework to enable hybrid OS designs targeting high performance computing on manycore architectures (Abstract)

Taku Shimosawa , Hitachi, Ltd.
Balazs Gerofi , Graduate School of Information Science and Technology, The University of Tokyo
Masamichi Takagi , RIKEN Advanced Institute for Computational Science
Gou Nakamura , Hitachi Solutions, Ltd.
Tomoki Shirasawa , Hitachi Solutions East Japan, Ltd.
Yuji Saeki , Hitachi, Ltd.
Masaaki Shimizu , Hitachi, Ltd.
Atsushi Hori , RIKEN Advanced Institute for Computational Science
Yutaka Ishikawa , Graduate School of Information Science and Technology, The University of Tokyo
pp. 1-10

Premonition of storage response class using Skyline ranked Ensemble method (Abstract)

Kumar Dheenadayalan , Qualcomm India Pvt Ltd. Bangalore, India
V N Muralidhara , International Institute of Information Technology-Bangalore, India
Pushpa Datla , Qualcomm India Pvt Ltd. Bangalore, India
G Srinivasaraghavan , International Institute of Information Technology-Bangalore, India
Maulik Shah , Qualcomm India Pvt Ltd. Bangalore, India
pp. 1-10

Queueing-based storage performance modeling and placement in OpenStack environments (Abstract)

Yang Song , IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Rakesh Jain , IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Ramani Routray , IBM Almaden Research Center, 650 Harry Road, San Jose, CA
pp. 1-10

A fast implementation of MLR-MCL algorithm on multi-core processors (Abstract)

Qingpeng Niu , Department of Computer Science and Engineering, The Ohio State University
Pai-Wei Lai , Department of Computer Science and Engineering, The Ohio State University
S M Faisal , Department of Computer Science and Engineering, The Ohio State University
Srinivasan Parthasarathy , Department of Computer Science and Engineering, The Ohio State University
P. Sadayappan , Department of Computer Science and Engineering, The Ohio State University
pp. 1-10

Optimizing shared data accesses in distributed-memory X10 systems (Abstract)

Jeeva Paudel , Computing Sc., University of Alberta
Olivier Tardieu , IBM T. J. Watson Research Center
Jose Nelson Amaral , Computing Sc., University of Alberta
pp. 1-10

A proactive approach for coping with uncertain resource availabilities on desktop grids (Abstract)

Louis-Claude Canon , Université de Franche-Comté, DISC, FEMTO-ST 16, route de Gray, 25000 Besancon, France
Adel Essafi , LaTICE research Laboratory, University of Tunis, Bab Mnara, Tunis, Tunisia
Denis Trystram , Univ. Grenoble Alpes, 655 avenue de l'Europe, 38334 St Ismier, France, Institut Universitaire de France
pp. 1-9

Algorithms for power-aware resource activation (Abstract)

Sonika Arora , University of Delhi, New Delhi, India
Archita Agarwal , IBM Research Lab, New Delhi, India
Venkatesan T. Chakaravarthy , IBM Research Lab, New Delhi, India
Yogish Sabharwal , IBM Research Lab, New Delhi, India
pp. 1-10

A flexible scheduling framework for heterogeneous CPU-GPU clusters (Abstract)

Kittisak Sajjapongse , Department of Electrical and Computer Engineering, University of Missouri - Columbia
Tejaswi Agarwal , Department of Electrical and Computer Engineering, University of Missouri - Columbia
Michela Becchi , Department of Electrical and Computer Engineering, University of Missouri - Columbia
pp. 1-11

Cache-conscious scheduling of streaming pipelines on parallel machines with private caches (Abstract)

Kunal Agrawal , Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
Jordyn Maglalang , Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
Jeremy T. Fineman , Department of Computer Science, Georgetown University, Washington, District of Columbia, USA
pp. 1-12

Efficient and robust allocation algorithms in clouds under memory constraints (Abstract)

Olivier Beaumont , Inria Bordeaux, France
Juan-Angel Lorenzo , Inria Bordeaux, France
Lionel Eyraud-Dubois , Inria Bordeaux, France
Paul Renaud-Goud , Inria Bordeaux, France
pp. 1-10

Saving energy by exploiting residual imbalances on iterative applications (Abstract)

Edson L. Padoin , Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS) - Porto Alegre, RS - Brazil
Marcio Castro , Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS) - Porto Alegre, RS - Brazil
Laercio L. Pilla , Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS) - Porto Alegre, RS - Brazil
Philippe O. A. Navaux , Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS) - Porto Alegre, RS - Brazil
Jean-Francois Mehaut , Laboratoire d'Informatique de Grenoble (LIG), Grenoble University - Grenoble - France
pp. 1-10

GPU parallelization of the stochastic on-time arrival problem (Abstract)

Maleen Abeydeera , Electronic and Telecommunication Engineering, University of Moratuwa
Samitha Samaranayake , Systems Engineering, University of California Berkeley
pp. 1-8

GpuTejas: A parallel simulator for GPU architectures (Abstract)

Geetika Malhotra , Department of Computer Science, Indian Institute of Technology, Hauz Khas, New Delhi, India
Seep Goel , Department of Computer Science, Indian Institute of Technology, Hauz Khas, New Delhi, India
Smruti R. Sarangi , Department of Computer Science, Indian Institute of Technology, Hauz Khas, New Delhi, India
pp. 1-10

Mixed-precision models for calculation of high-order virial coefficients on GPUs (Abstract)

Chao Feng , Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, NY 14260
Andrew Schultz , Department of Chemical and Biological Engineering, University at Buffalo, The State University of New York, Buffalo, NY 14260
Vipin Chaudhary , Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, NY 14260
David Kofke , Department of Chemical and Biological Engineering, University at Buffalo, The State University of New York, Buffalo, NY 14260
pp. 1-10

Parallel AMG solver for three dimensional unstructured grids using GPU (Abstract)

K Ravi Tej , Dept. of Computer Science and Engineering, Indian Institute of Technology Hyderabad, Hyderabad, India
Naveen Sivadasan , Dept. of Computer Science and Engineering, Indian Institute of Technology Hyderabad, Hyderabad, India
Vatsalya Sharma , Dept. of Mechanical Engineering, Indian Institute of Technology Hyderabad, Hyderabad, India
Raja Banerjee , Dept. of Mechanical Engineering, Indian Institute of Technology Hyderabad, Hyderabad, India
pp. 1-10

Particle advection performance over varied architectures and workloads (Abstract)

Hank Childs , University of Oregon
Scott Biersdorff , University of Oregon
David Poliakoff , University of Oregon
David Camp , Lawrence Berkeley National Laboratory
Allen D. Malony , University of Oregon
pp. 1-10

Relax-Miracle: GPU parallelization of semi-analytic Fourier-domain solvers for earthquake modeling (Abstract)

Sagar Shrishailappa Masuti , Earth Observatory of Singapore, Nanyang Technological University, Singapore
Sylvain Barbot , Earth Observatory of Singapore, Nanyang Technological University, Singapore
Nachiket Kapre , School of Computer Engineering, Nanyang Technological University, Singapore
pp. 1-10

Xevolver: An XML-based code translation framework for supporting HPC application migration (Abstract)

Hiroyuki Takizawa , Tohoku University, Sendai, Miyagi 980-8579, Japan
Shoichi Hirasawa , Tohoku University, Sendai, Miyagi 980-8579, Japan
Yasuharu Hayashi , NEC Corporation
Ryusuke Egawa , Tohoku University, Sendai, Miyagi 980-8579, Japan
Hiroaki Kobayashi , Tohoku University, Sendai, Miyagi 980-8579, Japan
pp. 1-11

Online failure prediction for HPC resources using decentralized clustering (Abstract)

Alejandro Pelaez , Rutgers, The State University of New Jersey, RDI2
Andres Quiroz , Xerox Research, Webster, NY
James C. Browne , University of Texas at Austin
Edward Chuah , University of Texas at Austin
Manish Parashar , Rutgers, The State University of New Jersey, RDI2
pp. 1-9

CQA: A code quality analyzer tool at binary level (Abstract)

Andres S. Charif-Rubial , Exascale Computing Research Laboratory, FR
Emmanuel Oseret , Exascale Computing Research Laboratory, FR
Jose Noudohouenou , Exascale Computing Research Laboratory, FR
William Jalby , Exascale Computing Research Laboratory, FR
Ghislain Lartigue , Normandie Universite, FR
pp. 1-10

Towards realizing the potential of malleable jobs (Abstract)

Abhishek Gupta , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Bilge Acun , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Osman Sarood , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Laxmikant V. Kale , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
pp. 1-10

Improving multi-dimensional query processing with data migration in distributed cache infrastructure (Abstract)

Youngmoon Eom , Dept. of Computer Science Engineering, Ulsan National Institute of Science and Technology (UNIST), Republic of Korea
Jinwoong Kim , Dept. of Computer Science Engineering, Ulsan National Institute of Science and Technology (UNIST), Republic of Korea
Deukyeon Hwang , Dept. of Computer Science Engineering, Ulsan National Institute of Science and Technology (UNIST), Republic of Korea
Jaewon Kwak , Dept. of Computer Science Engineering, Ulsan National Institute of Science and Technology (UNIST), Republic of Korea
Minho Shin , Dept. of Computer Engineering, Myongji University, Republic of Korea
Beomseok Nam , Dept. of Computer Science Engineering, Ulsan National Institute of Science and Technology (UNIST), Republic of Korea
pp. 1-10

An early experience of regional ocean modelling on Intel Many Integrated Core architecture (Abstract)

Srikanth Yalavarthi , Computational Earth Sciences Group, Centre for Development of Advanced Computing, Pune, India
Akshara Kaginalkar , Computational Earth Sciences Group, Centre for Development of Advanced Computing, Pune, India
pp. 1-6

RADIR: Lock-free and wait-free bandwidth allocation models for solid state drives (Abstract)

Pooja Aggarwal , Department of Computer Science & Engineering, Indian Institute of Technology, New Delhi, India
Giridhar Yasa , NetApp India Pvt. Ltd, EGL Software Park, Domlur, Bangalore, India
Smruti R. Sarangi , Department of Computer Science & Engineering, Indian Institute of Technology, New Delhi, India
pp. 1-10

Design and evaluation of parallel hashing over large-scale data (Abstract)

Long Cheng , National University of Ireland Maynooth, Ireland
Spyros Kotoulas , IBM Research, Ireland
Tomas E Ward , National University of Ireland Maynooth, Ireland
Georgios Theodoropoulos , Durham University, UK
pp. 1-10

Smart multi-task scheduling for OpenCL programs on CPU/GPU heterogeneous platforms (Abstract)

Yuan Wen , School of Informatics, The University of Edinburgh
Zheng Wang , School of Computing and Communications, Lancaster University
Michael F. P. O'Boyle , School of Informatics, The University of Edinburgh
pp. 1-10

Software based ultrasound B-mode/beamforming optimization on GPU and its performance prediction (Abstract)

Thi Yen Phuong , Department of Computer Engineering, Hallym University, South Korea
Jeong-Gun Lee , Department of Computer Engineering, Hallym University, South Korea
pp. 1-10

Fine-grained GPU parallelization of pairwise local sequence alignment (Abstract)

Chirag Jain , Department of Computer Science and Engineering, IIT Delhi
Subodh Kumar , Department of Computer Science and Engineering, IIT Delhi
pp. 1-10

Distance threshold similarity searches on spatiotemporal trajectories using GPGPU (Abstract)

Michael Gowanlock , Dept. of Information and Computer Sciences and NASA Astrobiology Institute, University of Hawai'i, Honolulu, HI, U.S.A.
Henri Casanova , Dept. of Information and Computer Sciences, University of Hawai'i, Honolulu, HI, U.S.A.
pp. 1-10

Simple parallel biconnectivity algorithms for multicore platforms (Abstract)

George M. Slota , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
Kamesh Madduri , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
pp. 1-10