Jorge E. Pezoa, Sagar Dhakal, Majeed M. Hayat, "Maximizing Service Reliability in Distributed Computing Systems with Random Node Failures: Theory and Implementation," IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 10, pp. 1531-1544, October 2010.
BibTeX:
@article{ 10.1109/TPDS.2010.34,
  author = {Jorge E. Pezoa and Sagar Dhakal and Majeed M. Hayat},
  title = {Maximizing Service Reliability in Distributed Computing Systems with Random Node Failures: Theory and Implementation},
  journal = {IEEE Transactions on Parallel and Distributed Systems},
  volume = {21},
  number = {10},
  issn = {1045-9219},
  year = {2010},
  pages = {1531-1544},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPDS.2010.34},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
Keywords: renewal theory, queuing theory, reliability, distributed computing, load balancing.
[1] R. Shah, B. Veeravalli, and M. Misra, "On the Design of Adaptive and Decentralized Load Balancing Algorithms with Load Estimation for Computational Grid Environments," IEEE Trans. Parallel and Distributed Systems, vol. 18, no. 12, pp. 1675-1686, Dec. 2007.
[2] L. Tassiulas and A. Ephremides, "Stability Properties of Constrained Queuing Systems and Scheduling Policies for Maximum Throughput in Multihop Radio Networks," IEEE Trans. Automatic Control, vol. 37, no. 12, pp. 1936-1948, Dec. 1992.
[3] M. Neely, E. Modiano, and C. Rohrs, "Dynamic Power Allocation and Routing for Time Varying Wireless Networks," Proc. IEEE INFOCOM, 2003.
[4] G. Koole, P. Sparaggis, and D. Towsley, "Minimizing Response Times and Queue Lengths in Systems of Parallel Queues," J. Applied Probability, vol. 36, pp. 1185-1193, 1999.
[5] L. Golubchik, J. Lui, and R. Muntz, "Chained Declustering: Load Balancing and Robustness to Skew and Failures," Proc. Workshop Research Issues on Data Eng., pp. 88-95, 1992.
[6] A. Brandt and M. Brandt, "On a Two-Queue Priority System with Impatience and Its Application to a Call Center," Methodology and Computing in Applied Probability, vol. 1, pp. 191-210, 1999.
[7] M. Hayat, S. Dhakal, C. Abdallah, J. Birdwell, and J. Chiasson, "Dynamic Time Delay Models for Load Balancing. Part II: Stochastic Analysis of the Effect of Delay Uncertainty," Advances in Time Delay Systems, pp. 355-368, Springer-Verlag, 2004.
[8] S. Dhakal, M. Hayat, J. Pezoa, C. Yang, and D. Bader, "Dynamic Load Balancing in Distributed Systems in the Presence of Delays: A Regeneration-Theory Approach," IEEE Trans. Parallel and Distributed Systems, vol. 18, no. 4, pp. 485-497, Apr. 2007.
[9] S. Dhakal, M. Hayat, J. Pezoa, C. Abdallah, J. Birdwell, and J. Chiasson, "Load Balancing in the Presence of Random Node Failure and Recovery," Proc. IEEE Int'l Parallel and Distributed Processing Symp. (IPDPS), 2006.
[10] Y.-S. Dai and G. Levitin, "Optimal Resource Allocation for Maximizing Performance and Reliability in Tree-Structured Grid Services," IEEE Trans. Reliability, vol. 56, no. 3, pp. 444-453, Sept. 2007.
[11] Y.-S. Dai, G. Levitin, and K. Trivedi, "Performance and Reliability of Tree-Structured Grid Services Considering Data Dependence and Failure Correlation," IEEE Trans. Computers, vol. 56, no. 7, pp. 925-936, July 2007.
[12] G. Attiya and Y. Hamam, "Reliability Oriented Task Allocation in Heterogeneous Distributed Computing Systems," Proc. Ninth Int'l Symp. Computers and Comm., pp. 68-73, 2004.
[13] C.-I. Chen, "Task Allocation and Reallocation for Fault Tolerance in Multicomputer Systems," IEEE Trans. Aerospace and Electronic Systems, vol. 30, pp. 1094-1104, 1994.
[14] S. Dhakal, "Load Balancing in Communication Constrained Distributed Systems: A Probabilistic Approach," PhD dissertation, Univ. of New Mexico, 2006.
[15] J. Pezoa, S. Dhakal, and M. Hayat, "Decentralized Load Balancing for Improving Reliability in Heterogeneous Distributed Systems," Proc. Int'l Conf. Parallel Processing (ICPP), 2009.
[16] V. Shestak, J. Smith, A. Maciejewski, and H. Siegel, "Stochastic Robustness Metric and Its Use for Static Resource Allocations," J. Parallel and Distributed Computing, vol. 68, pp. 1157-1173, 2008.
[17] M. Trehel, C. Balayer, and A. Alloui, "Modeling Load Balancing Inside Groups Using Queuing Theory," Proc. 10th Int'l Conf. Parallel and Distributed Computing Systems, 1997.
[18] C. Hui and S. Chanson, "Hydrodynamic Load Balancing," IEEE Trans. Parallel and Distributed Systems, vol. 10, no. 11, pp. 1118-1137, Nov. 1999.
[19] Z. Lan, V. Taylor, and G. Bryan, "Dynamic Load Balancing for Adaptive Mesh Refinement Application," Proc. Int'l Conf. Parallel Processing (ICPP), 2001.
[20] S. Dhakal, B. Paskaleva, M. Hayat, E. Schamiloglu, and C. Abdallah, "Dynamical Discrete-Time Load Balancing in Distributed Systems in the Presence of Time Delays," Proc. IEEE Conf. Decision and Control (CDC), 2003.
[21] H. Lee, S. Chin, J. Lee, D. Lee, K. Chung, S. Jung, and H. Yu, "A Resource Manager for Optimal Resource Selection and Fault Tolerance Service in Grids," Proc. IEEE Int'l Symp. Cluster Computing and the Grid (ISCCG), 2004.
[22] M. Litzkow, M. Livny, and M. Mutka, "Condor - A Hunter of Idle Workstations," Proc. Int'l Conf. Distributed Computing Systems (ICDCS), pp. 104-111, 1988.
[23] R. Sheahan, L. Lipsky, and P. Fiorini, "The Effect of Different Failure Recovery Procedures on the Distribution of Task Completion Times," Proc. Workshop Dependable Parallel Distributed and Network-Centric Systems (DPDNS), 2005.
[24] J. Palmer and I. Mitrani, "Empirical and Analytical Evaluation of Systems with Multiple Unreliable Servers," Proc. Int'l Conf. Dependable Systems and Networks, pp. 517-525, 2006.
[25] S. Shatz and J.-P. Wang, "Models and Algorithms for Reliability-Oriented Task-Allocation in Redundant Distributed-Computer Systems," IEEE Trans. Reliability, vol. 38, no. 1, pp. 16-27, Apr. 1989.
[26] V. Ravi, B. Murty, and J. Reddy, "Nonequilibrium Simulated-Annealing Algorithm Applied to Reliability Optimization of Complex Systems," IEEE Trans. Reliability, vol. 46, no. 2, pp. 233-239, June 1997.
[27] S. Srinivasan and N. Jha, "Safety and Reliability Driven Task Allocation in Distributed Systems," IEEE Trans. Parallel and Distributed Systems, vol. 10, no. 3, pp. 238-251, Mar. 1999.
[28] D. Vidyarthi and A. Tripathi, "Maximizing Reliability of a Distributed Computing System with Task Allocation Using Simple Genetic Algorithm," J. Systems Architecture, vol. 47, pp. 549-554, 2001.
[29] G. Attiya and Y. Hamam, "Task Allocation for Maximizing Reliability of Distributed Systems: A Simulated Annealing Approach," J. Parallel and Distributed Computing, vol. 66, pp. 1259-1266, 2006.
[30] Y. Hamam and K. Hindi, "Assignment of Program Tasks to Processors: A Simulated Annealing Approach," European J. Operational Research, vol. 122, pp. 509-513, 2000.

