
Y.B. Shieh, D. Ghosal, P.R. Chintamaneni, S.K. Tripathi, "Modeling of Hierarchical Distributed Systems with Fault-Tolerance," IEEE Transactions on Software Engineering, vol. 16, no. 4, pp. 444-457, April 1990, doi: 10.1109/32.54296.
Keywords: hierarchical distributed systems modelling; fault-tolerance; stochastic Petri net; parameterized subnet primitives; centralized; checkpointing strategies; arbitrary checkpointing strategy; planned strategy; distributed processing; fault tolerant computing; Petri nets.
Since each level in a hierarchical system can have different characteristics, different fault-tolerant schemes may be appropriate at different levels. A stochastic Petri net (SPN) is used to investigate various fault-tolerant schemes in this context. The basic SPN is augmented by parameterized subnet primitives to model the fault-tolerant schemes. Both centralized and distributed fault-tolerant schemes are considered. The two schemes are investigated by considering the individual levels in a hierarchical system independently. In the case of distributed fault tolerance, two different checkpointing strategies are considered. The first is called the arbitrary checkpointing strategy. Each process in this scheme does its checkpointing independently; thus, the domino effect may occur. The second is called the planned strategy. Here, process checkpointing is constrained to ensure that no domino effect occurs. The results show that, under certain conditions, an arbitrary checkpointing strategy can perform better than a planned strategy. The effect of integration on the fault-tolerant strategies of the various levels of a hierarchy is also studied.
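The domino effect mentioned above can be illustrated with a small rollback-propagation sketch. This is not the paper's SPN model; it is a minimal illustration under stated assumptions: the function name `recovery_line`, the global integer timestamps, and the example checkpoint and message data are all hypothetical, every process is assumed to have an initial checkpoint at time 0, and only orphan messages (received but, after rollback, never sent) are considered.

```python
from bisect import bisect_left, bisect_right


def recovery_line(checkpoints, messages, failed, fail_time):
    """Return the checkpoint time each process restarts from.

    checkpoints: dict mapping process name -> sorted list of checkpoint
                 times; every list is assumed to start with 0 (initial state).
    messages:    list of (sender, send_time, receiver, recv_time) tuples,
                 with all event times greater than 0.
    Processes that never roll back keep the value float("inf").
    """
    restored = {p: float("inf") for p in checkpoints}

    # The failed process restarts from its latest checkpoint at or before the failure.
    cps = checkpoints[failed]
    restored[failed] = cps[bisect_right(cps, fail_time) - 1]

    # Propagate rollbacks until no orphan message remains.
    changed = True
    while changed:
        changed = False
        for sender, send_t, receiver, recv_t in messages:
            send_undone = send_t > restored[sender]
            recv_kept = recv_t <= restored[receiver]
            if send_undone and recv_kept:
                # Orphan message: the receiver must roll back to its latest
                # checkpoint taken strictly before the receipt.
                cps = checkpoints[receiver]
                restored[receiver] = cps[bisect_left(cps, recv_t) - 1]
                changed = True
    return restored


# Hypothetical message exchange between two processes A and B.
messages = [
    ("B", 20, "A", 25),
    ("A", 40, "B", 45),
    ("B", 60, "A", 65),
    ("A", 80, "B", 85),
]

# Independent ("arbitrary") checkpoints: each rollback orphans an earlier message.
arbitrary = {"A": [0, 30, 70], "B": [0, 50, 90]}
print(recovery_line(arbitrary, messages, "A", 100))

# Aligned checkpoints with no message crossing them (one assumed way to plan).
planned = {"A": [0, 30, 70], "B": [0, 30, 70]}
print(recovery_line(planned, messages, "A", 100))
```

With the independent checkpoints, the failure of A cascades until both processes restart from time 0, while the aligned checkpoints confine the rollback to time 70 for both. The sketch says nothing about checkpointing overhead, which is the other side of the trade-off the abstract reports.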