Chee-Wei Ang and Chen-Khong Tham, "Analysis and optimization of service availability in a HA cluster with load-dependent machine availability," IEEE Transactions on Parallel and Distributed Systems, vol. 18, no. 9, pp. 1307-1319, September 2007. ISSN 1045-9219. DOI: http://doi.ieeecomputersociety.org/10.1109/TPDS.2007.1071. IEEE Computer Society, Los Alamitos, CA, USA.
Keywords: high availability, cluster computing, Markov chains, Markov decision processes, dynamic programming, neuro-dynamic programming.

[1] D. Scott, “NSM: Often the Weakest Link in Business Availability,” http://www.gartner.com/DisplayDocument?id=334197, July 2001.
[2] M. Loney, “The Magic That Makes Google Tick,” http://www.zdnet.com.au/insight/software/0,39023769,39168647,00.htm, Dec. 2004.
[3] The Grid: Blueprint for a New Computing Infrastructure, I. Foster and C. Kesselman, eds. Morgan Kaufmann, July 1999.
[4] Y.S. Dai and G. Levitin, “Reliability and Performance of Tree-Structured Grid Services,” IEEE Trans. Reliability, vol. 55, pp. 337-349, June 2006.
[5] K. Trivedi, Probability and Statistics with Reliability, Queuing and Computer Science Applications. John Wiley & Sons, 2001.
[6] A. Sathaye, S. Ramani, and K. Trivedi, “Availability Models in Practice,” Proc. Int'l Workshop Fault-Tolerant Control and Computing (FTCC-1), May 2000.
[7] Y.S. Dai, M. Xie, K.L. Poh, and G.Q. Liu, “A Study of Service Reliability and Availability for Distributed Systems,” Reliability Eng. and System Safety, vol. 79, pp. 103-112, Jan. 2003.
[8] G. Ciardo, K.S. Trivedi, and J.K. Muppala, “SPNP: Stochastic Petri Net Package,” Proc. Third Int'l Workshop Petri Nets and Performance Models (PNPM '89), Dec. 1989.
[9] K. Trivedi and C. Hirel, “Sharpe—Symbolic Hierarchical Automated Reliability and Performance Evaluator,” http://amod.ee.duke.edu/software_packages.htm, Dec. 2004.
[10] K. Iyer, E. Butner, and E.J. McCluskey, “An Exponential Failure/Load Relationship: Results of a Multi-Computer Statistical Study,” Technical Report CSL-TR-81-214, Computer Systems Laboratory, Stanford Univ., July 1981.
[11] B. Schroeder and G.A. Gibson, “A Large-Scale Study of Failures in High-Performance Computing Systems,” Proc. Int'l Conf. Dependable Systems and Networks (DSN '06), June 2006.
[12] D. Heimann, N. Mittal, and K.S. Trivedi, “Availability and Reliability Modeling for Computer Systems,” Advances in Computers, M. Yovitts, ed., vol. 31, pp. 175-233. Academic Press, 1990.
[13] R. Robinson and A. Polozoff, “IBM WebSphere Developer Technical J.: Planning for Availability in the Enterprise,” http://www-128.ibm.com/developerworks/websphere/techjournal/0312_polozoff/polozoff.html, Oct. 2003.
[14] J. Tian, S. Rudraraju, and Z. Li, “Evaluating Web Software Reliability Based on Workload and Failure Data Extracted from Server Logs,” IEEE Trans. Software Eng., vol. 30, no. 11, pp. 754-769, Nov. 2004.
[15] R.K. Iyer and D.J. Rossetti, “A Statistical Load Dependency Model for CPU Errors at SLAC,” Proc. 12th Int'l Symp. Fault-Tolerant Computing (FTCS-12), pp. 363-372, June 1982.
[16] K. Vaidyanathan and K.S. Trivedi, “A Measurement-Based Model for Estimation of Resource Exhaustion in Operational Software Systems,” Proc. 10th Int'l Symp. Software Reliability Eng. (ISSRE '99), 1999.
[17] “IBM DB2 V7 Administration Guide Part 12 Chapter 35: DB2 and High Availability on SUN Cluster 2.2,” http://publib.boulder.ibm.com/infocenter/db2v7luw/topic/com.ibm.db2v7.doc/db2d0/db2d0273.htm, 2001.
[18] “Linux-HA Heartbeat Program,” http://www.linux-ha.org/HeartbeatProgram, 1999.
[19] D.A. Patterson, G.A. Gibson, and R.H. Katz, “A Case for Redundant Arrays of Inexpensive Disks (RAID),” Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '88), June 1988.
[20] A. Heddaya and A. Helal, “Reliability, Availability, Dependability and Performability: A User-Centered View,” technical report, Boston Univ., 1997.
[21] K. Nagaraja, G. Gama, R. Bianchini, R.P. Martin, W. Meira Jr., and T.D. Nguyen, “Quantifying the Performability of Cluster-Based Services,” IEEE Trans. Parallel and Distributed Systems, vol. 16, no. 5, pp. 456-467, May 2005.
[22] D. Bertsekas and R. Gallager, Data Networks, second ed. Prentice Hall, 1992.
[23] W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes in C: The Art of Scientific Computing. Cambridge Univ. Press, 2002.
[24] R.S. Sutton and A.G. Barto, Reinforcement Learning—An Introduction. MIT Press, 1998.
[25] M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, 1994.
[26] B. Van Roy, “Neuro-Dynamic Programming: Overview and Recent Trends,” Handbook of Markov Decision Processes: Methods and Applications, E. Feinberg and A. Shwartz, eds. Kluwer Academic Publishers, 2001.
[27] G. Tesauro, N.K. Jong, R. Das, and M.N. Bennani, “A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation,” Proc. Third Int'l Conf. Autonomic Computing (ICAC '06), pp. 65-73, June 2006.
[28] J. Guo and L.N. Bhuyan, “Load Balancing in a Cluster-Based Web Server for Multimedia Applications,” IEEE Trans. Parallel and Distributed Systems, vol. 17, no. 11, pp. 1321-1334, Nov. 2006.
[29] M. Adler, S. Chakrabarti, M. Mitzenmacher, and L. Rasmussen, “Parallel Randomized Load Balancing,” Proc. 27th Ann. ACM Symp. Theory of Computing (STOC '95), pp. 238-247, 1995.
[30] B.A. Shirazi, A.R. Hurson, and K.M. Kavi, Scheduling and Load Balancing in Parallel and Distributed Systems. Wiley–IEEE CS Press, May 1995.
[31] Q. Zhang, A. Riska, W. Sun, E. Smirni, and G. Ciardo, “Workload-Aware Load Balancing for Clustered Web Servers,” IEEE Trans. Parallel and Distributed Systems, vol. 16, no. 3, pp. 219-233, Mar. 2005.
[32] D.P. Bertsekas and J.N. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.
[33] L.P. Kaelbling, M.L. Littman, and A.P. Moore, “Reinforcement Learning: A Survey,” J. Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
[34] Service Availability Forum, http://www.saforum.org, 2006.
[35] S. Floyd and V. Jacobson, “Random Early Detection Gateways for Congestion Avoidance,” IEEE/ACM Trans. Networking, vol. 1, pp. 397-413, Aug. 1993.
[36] M. MacDougall, Simulating Computer Systems. MIT Press, 1987.

