First International Conference on Availability, Reliability and Security (ARES'06)
Availability Modeling and Analysis on High Performance Cluster Computing Systems
Vienna, Austria
April 20-April 22
ISBN: 0-7695-2567-9
Hertong Song, Louisiana Tech University
Chokchai "box" Leangsuksun, Louisiana Tech University
Raja Nassar, Louisiana Tech University
Cluster computing has been attracting growing attention from both industry and academia for its enormous computing power, cost effectiveness, and scalability. Availability is a key system attribute that must be considered at the system design stage and must also reflect the system's actual behavior. System monitoring and logging make it possible to identify unplanned events, so that the model reflects the system's actual availability. This paper proposes a single framework that coordinates event monitoring, filtering, data analysis, and dynamic availability modeling. The availability model is abstracted and categorized based on functionality. We describe the proposed architecture and present a sample analysis of real-time event logs from a 512-node cluster at Lawrence Livermore National Laboratory.
Citation:
Hertong Song, Chokchai "box" Leangsuksun, Raja Nassar, "Availability Modeling and Analysis on High Performance Cluster Computing Systems," First International Conference on Availability, Reliability and Security (ARES'06), pp. 305-313, 2006