Parallel and Distributed Systems, International Conference on (2007)
Hsinchu, Taiwan
Dec. 5, 2007 to Dec. 7, 2007
ISBN: 978-1-4244-1889-3
pp: 1-8
Agustin Caminero, Department of Computing Systems, The University of Castilla-La Mancha, Spain
Carmen Carrion, Department of Computing Systems, The University of Castilla-La Mancha, Spain
Anthony Sulistio, Dept. of Computer Sc. & Software Eng., The University of Melbourne, Australia
Rajkumar Buyya, Dept. of Computer Sc. & Software Eng., The University of Melbourne, Australia
Blanca Caminero, Department of Computing Systems, The University of Castilla-La Mancha, Spain
ABSTRACT
Grid technologies are emerging as the next generation of distributed computing, allowing the aggregation of resources that are geographically distributed across different locations. However, these resources are independent and managed separately by various organizations with different policies. This has a major impact on users who submit their jobs to the Grid, as they have to deal with issues such as policy heterogeneity, security and fault tolerance. Moreover, changes in Grid conditions, such as resources becoming unavailable for a period of time due to maintenance or failures, can significantly affect whether users' Quality of Service (QoS) requirements are met. Therefore, it is essential for users to take into account the effects of resource failures during job execution. In this paper, we present our work on introducing resource failures and failure detection into the GridSim simulation toolkit. Because we need to conduct repeatable and controlled experiments, simulation is an easier means of studying such complex scenarios. We also give a detailed description of the overall design and a use case scenario demonstrating how the conditions of resources vary over time.
CITATION
Agustin Caminero, Carmen Carrion, Anthony Sulistio, Rajkumar Buyya, Blanca Caminero, "Extending GridSim with an architecture for failure detection", Parallel and Distributed Systems, International Conference on, vol. 01, pp. 1-8, 2007, doi:10.1109/ICPADS.2007.4447756