2003 Proceedings IEEE International Conference on Cluster Computing (2003)
Dec. 1, 2003 to Dec. 4, 2003
Federico D. Sacerdoti, San Diego Supercomputer Center
Mason J. Katz, San Diego Supercomputer Center
Matthew L. Massie, University of California at Berkeley
David E. Culler, University of California at Berkeley
In this paper, we present a structure for monitoring a large set of computational clusters. We illustrate methods for scaling a monitor network composed of many clusters while keeping processing requirements low. A design for presenting high-level web-based summaries of the monitor network is provided, along with a generalization to a distributed, multiple-resolution monitoring tree. Emphasis is placed on scalability, fast query response, fault tolerance, and grid compatibility. Experimental evidence is presented that demonstrates the performance of our design.
M. L. Massie, M. J. Katz, F. D. Sacerdoti and D. E. Culler, "Wide Area Cluster Monitoring with Ganglia," 2003 Proceedings IEEE International Conference on Cluster Computing (CLUSTER), Hong Kong, 2003, pp. 289.