2009 International Conference on Availability, Reliability and Security
Blue Gene/L Log Analysis and Time to Interrupt Estimation
Fukuoka Institute of Technology, Fukuoka, Japan
March 16-March 19
ISBN: 978-0-7695-3564-7
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ARES.2009.105
System- and application-level failures can be characterized by analyzing the relevant log files. The resulting data can then support studies of, and future developments for, mission-critical and large-scale computational architectures, in fields such as failure prediction, reliability modeling, performance modeling and power awareness. In this paper, system logs covering a six-month period of the Blue Gene/L supercomputer were obtained and analyzed. Temporal filtering was applied to remove duplicated log messages. Optimistic and pessimistic perspectives were then applied to the filtered log information to observe failure behavior within the system. Further, various time-to-repair factors were applied to obtain the application time to interrupt, which will be exploited in further resilience modeling research.
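The temporal filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout (timestamp, location, message), the five-minute window, and the function names `temporal_filter` and `mean_gap_seconds` are all assumptions made for illustration, and the abstract does not state the threshold or the matching key actually used. The sketch keeps a message only if the same message from the same location has not been seen within the window, then reports the mean gap between the surviving events as a rough stand-in for an inter-event time.

```python
from datetime import datetime, timedelta


def temporal_filter(events, window=timedelta(minutes=5)):
    """Drop repeats of the same (location, message) seen within `window`.

    `events` must be sorted by timestamp. The record layout and the default
    window are assumptions, not values taken from the paper.
    """
    last_seen = {}  # (location, message) -> timestamp of most recent occurrence
    kept = []
    for ts, location, message in events:
        key = (location, message)
        prev = last_seen.get(key)
        if prev is None or ts - prev > window:
            kept.append((ts, location, message))
        # Update on every occurrence, so a long burst of repeats stays suppressed.
        last_seen[key] = ts
    return kept


def mean_gap_seconds(events):
    """Mean time in seconds between consecutive events, or None if fewer than two."""
    times = [ts for ts, _, _ in events]
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps) if gaps else None


if __name__ == "__main__":
    t0 = datetime(2005, 1, 1)  # hypothetical start time
    log = [
        (t0, "node-A", "fatal error"),
        (t0 + timedelta(minutes=1), "node-A", "fatal error"),
        (t0 + timedelta(minutes=3), "node-B", "fatal error"),
        (t0 + timedelta(minutes=10), "node-A", "fatal error"),
    ]
    filtered = temporal_filter(log)
    print(len(filtered), mean_gap_seconds(filtered))
```

Because the last-seen time is refreshed on every occurrence, a sustained burst is collapsed into its first message; comparing against the last kept message instead would be an equally plausible variant.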
Citation:
Narate Taerat, Nichamon Naksinehaboon, Clayton Chandler, James Elliott, Chokchai Leangsuksun, George Ostrouchov, Stephen L. Scott, Christian Engelmann, "Blue Gene/L Log Analysis and Time to Interrupt Estimation," 2009 International Conference on Availability, Reliability and Security (ARES), pp. 173-180, IEEE Computer Society, 2009.
