Analysis and optimization of service availability in a HA cluster with load-dependent machine availability
Issue No. 09 - September (2007 vol. 18)
Calculations of service availability of a High-Availability (HA) cluster are usually based on the assumption of load-independent machine availabilities. In this paper, we study the issues and show how the service availabilities can be calculated under the assumption that machine availabilities are load-dependent. We present a Markov chain analysis to derive the steady-state service availabilities of a load-dependent machine-availability HA cluster. We show that, with load-dependent machine availability, the attained service availability is now policy-dependent. After formulating the problem as a Markov Decision Process, we proceed to determine the optimal policy to achieve the maximum service availabilities using the method of policy iteration. Two greedy assignment algorithms are studied: least-load and FDL-based, where least-load corresponds to some load-balancing algorithms. We carry out analysis and simulations on two cases of load profiles: in the first profile, a single machine has the capacity to host all services in the HA cluster; in the second profile, a single machine does not have enough capacity to host all services. We show that the service availabilities achieved under the first load profile are the same, while the service availabilities achieved under the second load profile are different. Since the service availabilities achieved are different in the second load profile, we proceed to investigate how the distribution of service availabilities across the services can be controlled by adjusting the rewards vector.
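To make the idea of load-dependent machine availability concrete, the following is a minimal sketch of a steady-state calculation for a toy two-machine cluster. It is not the model from the paper: the birth-death structure, the independent-repair assumption, and all names and rates (`lam_shared`, `lam_full`, `mu`) are hypothetical choices made only for illustration. The state is the number of machines that are up. A machine fails at rate `lam_shared` while the load is split across two machines and at rate `lam_full` while it carries the whole load alone.

```python
def steady_state_birth_death(down_rates, up_rates):
    """Stationary distribution of a birth-death continuous-time Markov chain.

    States are 0..n. up_rates[k] is the rate from state k to k+1 and
    down_rates[k] is the rate from state k+1 to k. Detailed balance gives
    pi[k+1] = pi[k] * up_rates[k] / down_rates[k].
    """
    weights = [1.0]
    for up, down in zip(up_rates, down_rates):
        weights.append(weights[-1] * up / down)
    total = sum(weights)
    return [w / total for w in weights]


def service_availability(lam_shared, lam_full, mu):
    """Probability that at least one of two machines is up (toy model).

    Assumptions (hypothetical, not taken from the paper):
      - lam_shared: per-machine failure rate while both machines share the load
      - lam_full:   failure rate of the surviving machine carrying the full load
      - mu:         per-machine repair rate, with repairs proceeding independently
    """
    down_rates = [lam_full, 2.0 * lam_shared]  # 1 -> 0 and 2 -> 1
    up_rates = [2.0 * mu, mu]                  # 0 -> 1 and 1 -> 2
    pi = steady_state_birth_death(down_rates, up_rates)
    return 1.0 - pi[0]


if __name__ == "__main__":
    # Load-independent baseline: the failure rate does not change with load.
    print(service_availability(lam_shared=1.0, lam_full=1.0, mu=9.0))
    # Load-dependent case: the surviving machine fails twice as fast.
    print(service_availability(lam_shared=1.0, lam_full=2.0, mu=9.0))
```

With equal failure rates the sketch reduces to two independent machines, so the result matches the familiar 1 - (1 - a)^2 with a = mu / (lam + mu). Raising `lam_full` lowers the computed availability, which is the kind of effect a load-independent calculation would miss.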
High Availability, cluster computing, Markov chains, Markov decision processes, dynamic programming, neuro-dynamic programming
Chee-Wei Ang, Chen-Khong Tham, "Analysis and optimization of service availability in a HA cluster with load-dependent machine availability", IEEE Transactions on Parallel & Distributed Systems, vol. 18, no. 9, pp. 1307-1319, September 2007, doi:10.1109/TPDS.2007.1071