1st IEEE Computer Society International Workshop on Cluster Computing
Design and Analysis of the Alliance/University of New Mexico Roadrunner Linux SMP SuperCluster
Melbourne, Australia
December 2-3, 1999
ISBN: 0-7695-0343-8
This paper discusses high performance clustering through a series of critical topics: architectural design, system software infrastructure, and programming environment. It does so through an overview of a large-scale, high performance SuperCluster (named Roadrunner) in production at the University of New Mexico (UNM) Albuquerque High Performance Computing Center (AHPCC). This SuperCluster, sponsored by the U.S. National Science Foundation (NSF) and the National Computational Science Alliance (NCSA), is based almost entirely on freely available, vendor-independent software, for example its operating system (Linux), job scheduler (PBS), compilers (GNU/EGCS), and parallel programming libraries (MPI). The Globus toolkit, also available for this platform, allows high performance distributed computing applications to use geographically distributed resources such as this SuperCluster. In addition to describing the design and analysis of the Roadrunner SuperCluster, we provide experimental analyses from grand challenge applications and discuss future directions for SuperClusters.
Index Terms:
cluster hardware, high performance communications networks and interfaces, lightweight communication protocols, issues in building scalable services, job and resource management, message passing systems for clusters, tools for operating and managing clusters, algorithms for solving problems on clusters, non-local clusters, symmetric multiprocessors
Citation:
David A. Bader, Arthur B. Maccabe, Jason R. Mastaler, John K. McIver III, Patricia A. Kovatch, "Design and Analysis of the Alliance/University of New Mexico Roadrunner Linux SMP SuperCluster," iwcc, pp.9, 1st IEEE Computer Society International Workshop on Cluster Computing, 1999