Cluster Computing and the Grid, IEEE International Symposium on (2008)
May 19, 2008 to May 22, 2008
Optical network based distributed computing systems have been regarded as a promising technology for supporting large-scale, data-intensive distributed applications. In a system that involves so many heterogeneous resources and middleware components, faults are practically inevitable. For applications that must finish before a given deadline, a fault in the system leads to the failure of the application. A fault-tolerant policy is therefore necessary to improve the performance of the system when faults can occur. In this paper, we address the fault-tolerance problem for optical network based distributed computing systems. We first propose an overlay approach, which applies the existing fault-tolerant policies for distributed computing and for optical networks. We then present a joint fault-tolerant policy, which takes into account fault tolerance for computing resources and network resources at the same time. We compare the performance of the different policies by simulation. The simulation results show that the joint fault-tolerant policy achieves much better performance than the overlay approaches.
Optical Network, Distributed Computing, Fault-tolerance
Y. Jin, W. Guo, W. Sun, Z. Sun and W. Hu, "Fault-Tolerant Policy for Optical Network Based Distributed Computing System," 2008 8th International Symposium on Cluster Computing and the Grid (CCGRID '08), Lyon, 2008, pp. 704-709.