Task Allocation for Maximizing Reliability of Distributed Computer Systems
September 1992 (vol. 41 no. 9)
pp. 1156-1168

For distributed systems, system reliability is defined as the probability that the system can run an entire task successfully. When the system's hardware configuration is fixed, the system reliability is mainly dependent on the software design. The task allocation problem is addressed with the goal of maximizing the system reliability. A quantitative problem model, algorithms for optimal and suboptimal solutions, and simulation results are provided and discussed.
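
The abstract does not spell out the quantitative model, so the following is only a minimal sketch of what reliability-oriented task allocation can look like, not the paper's formulation. It assumes constant (exponential) failure rates for processors and links, and that a component only has to survive while it is executing a module or carrying a message. The names `system_reliability`, `best_allocation`, and all parameters are hypothetical, and exhaustive search stands in for the optimal algorithm.

```python
import itertools
import math


def system_reliability(assign, exec_time, proc_fail_rate, comm_time, link_fail_rate):
    """Probability that every processor and link survives while it is in use.

    assign[i]          -> processor that runs module i
    exec_time[i][p]    -> time module i needs on processor p (assumed known)
    proc_fail_rate[p]  -> constant failure rate of processor p (assumption)
    comm_time[(i, j)]  -> time modules i and j spend communicating
    link_fail_rate[(p, q)] with p < q -> constant failure rate of that link
    """
    exposure = 0.0
    # Each processor is exposed to failure for as long as it executes modules.
    for module, proc in enumerate(assign):
        exposure += proc_fail_rate[proc] * exec_time[module][proc]
    # A link is exposed only when the two modules sit on different processors.
    for (i, j), t in comm_time.items():
        p, q = assign[i], assign[j]
        if p != q:
            exposure += link_fail_rate[(min(p, q), max(p, q))] * t
    return math.exp(-exposure)


def best_allocation(n_procs, exec_time, proc_fail_rate, comm_time, link_fail_rate):
    """Exhaustively try every module-to-processor assignment (small inputs only)."""
    n_modules = len(exec_time)
    candidates = itertools.product(range(n_procs), repeat=n_modules)
    best = max(
        candidates,
        key=lambda a: system_reliability(
            a, exec_time, proc_fail_rate, comm_time, link_fail_rate
        ),
    )
    return best, system_reliability(
        best, exec_time, proc_fail_rate, comm_time, link_fail_rate
    )


# Hypothetical example: two modules, two processors, one link.
example_exec_time = [[1.0, 1.0], [1.0, 1.0]]
example_proc_fail_rate = [0.0, 0.5]
example_comm_time = {(0, 1): 1.0}
example_link_fail_rate = {(0, 1): 0.1}

allocation, reliability = best_allocation(
    2,
    example_exec_time,
    example_proc_fail_rate,
    example_comm_time,
    example_link_fail_rate,
)
```

The search visits every one of the `n_procs ** n_modules` assignments, so it is practical only for very small instances; the suboptimal algorithms the abstract mentions would address exactly that cost.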

Index Terms:
distributed computer systems; system reliability; software design; task allocation; quantitative problem model; distributed processing; performance evaluation; reliability theory; storage allocation.
Citation:
S.M. Shatz, J.-P. Wang, M. Goto, "Task Allocation for Maximizing Reliability of Distributed Computer Systems," IEEE Transactions on Computers, vol. 41, no. 9, pp. 1156-1168, Sept. 1992, doi:10.1109/12.165396