2004 International Symposium on Applications and the Internet Workshops. 2004 Workshops. (2004)
Jan. 26, 2004 to Jan. 30, 2004
Ken'ichiro Shirose, Tokyo Institute of Technology
Satoshi Matsuoka, Tokyo Institute of Technology and National Institute of Informatics
Hidemoto Nakada, Tokyo Institute of Technology and National Institute of Advanced Industrial Science and Technology
Hirotaka Ogawa, National Institute of Advanced Industrial Science and Technology
The problem with practical, large-scale deployment of a Grid monitoring system is that it takes considerable management cost and skill to maintain the level of quality required for production use, since the monitoring system is fundamentally distributed, needs to run continuously, and is itself likely to be affected by the various faults and dynamic reconfigurations of the Grid. Although automated management of such systems would be desirable, there are several difficulties: distributed faults and reconfigurations, component interdependencies, and scaling to maintain performance while minimizing the probing effect. With the goal of developing a generalized autonomous management framework for Grid monitoring, we have built a prototype on top of the Network Weather Service (NWS) that automatically configures its "clique" groups and copes with single-node faults without user intervention. An experimental deployment on the Tokyo Institute of Technology's Campus Grid (the Titech Grid), consisting of over 15 sites and 800 processors, has shown the system to be robust in handling faults and reconfigurations, automatically deriving an ideal clique configuration for the head login nodes of each PC cluster in less than two minutes.
K. Shirose, S. Matsuoka, H. Ogawa and H. Nakada, "Autonomous Configuration of Grid Monitoring Systems," 2004 International Symposium on Applications and the Internet Workshops. 2004 Workshops. (SAINT-W), Tokyo, Japan, 2004, pp. 651.