Parallel and Distributed Systems, International Conference on (2005)
July 20, 2005 to July 22, 2005
Walt Truszkowski, NASA Goddard Space Flight Center, Information Systems Division, Greenbelt, MD, USA
Mike Hinchey, NASA Goddard Space Flight Center, Information Systems Division, Greenbelt, MD, USA
Roy Sterritt, University of Ulster, School of Computing and Mathematics, Jordanstown Campus, Northern Ireland
<p>Cluster computing, in which a large number of simple processors or nodes are combined so that they appear to function as a single powerful computer, has emerged as a research area in its own right. The approach offers a relatively inexpensive means of providing a fault-tolerant environment and achieving significant computational capabilities for high-performance computing applications. However, manually managing and configuring a cluster quickly becomes daunting as the cluster grows in size. Autonomic computing, with its vision of self-management, can potentially solve many of the problems inherent in cluster management. We describe the development of a prototype Autonomic Cluster Management System (ACMS) that exploits autonomic properties to automate cluster management, and its evolution to include reflex reactions via pulse monitoring.</p>
W. Truszkowski, R. Sterritt and M. Hinchey, "Towards an Autonomic Cluster Management System (ACMS) with Reflex Autonomicity," Parallel and Distributed Systems, International Conference on (ICPADS), Fukuoka, Japan, 2005, pp. 478-482.