Marco Serafini, Andrea Bondavalli, Neeraj Suri, "Online Diagnosis and Recovery: On the Choice and Impact of Tuning Parameters," IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 4, pp. 295-312, October-December 2007. ISSN 1545-5971. doi: 10.1109/TDSC.2007.70210. IEEE Computer Society, Los Alamitos, CA, USA.

[1] P. Agrawal, “Fault Tolerance in Multiprocessor Systems without Dedicated Redundancy,” IEEE Trans. Computers, vol. 37, no. 3, pp. 358-362, Mar. 1988.
[2] M. Barborak, M. Malek, and A. Dahbura, “The Consensus Problem in Fault-Tolerant Computing,” ACM Computing Surveys, vol. 25, no. 2, pp. 171-220, June 1993.
[3] K. Birman and T. Joseph, “Exploiting Virtual Synchrony in Distributed Systems,” Proc. 11th Symp. Operating Systems Principles (SOSP '87), pp. 123-138, 1987.
[4] D.M. Blough and H.W. Brown, “The Broadcast Comparison Model for On-Line Fault Diagnosis in Multicomputer Systems: Theory and Implementation,” IEEE Trans. Computers, vol. 48, no. 5, pp. 470-493, May 1999.
[5] M. Blount, “Probabilistic Treatment of Diagnosis in Digital Systems,” Proc. Seventh Ann. Int'l Symp. Fault-Tolerant Computing (FTCS '77), pp. 72-77, 1977.
[6] A. Bondavalli, S. Chiaradonna, F. Di Giandomenico, and F. Grandoni, “Discriminating Fault Rate and Persistency to Improve Fault Treatment,” Proc. 27th Ann. Int'l Symp. Fault-Tolerant Computing (FTCS '97), pp. 354-362, 1997.
[7] A. Bondavalli, S. Chiaradonna, F. Di Giandomenico, and F. Grandoni, “Threshold-Based Mechanisms to Discriminate Transient from Intermittent Faults,” IEEE Trans. Computers, vol. 49, no. 3, pp. 230-245, Mar. 2000.
[8] T. Chandra and S. Toueg, “Unreliable Failure Detectors for Reliable Distributed Systems,” J. ACM, vol. 43, no. 2, pp. 225-267, Mar. 1996.
[9] C. Constantinescu, “Impact of Deep Submicron Technology on Dependability of VLSI Circuits,” Proc. IEEE Int'l Conf. Dependable Systems and Networks (DSN '02), pp. 205-209, 2002.
[10] F. Cristian, “Reaching Agreement on Processor-Group Membership in Synchronous Distributed Systems,” Distributed Computing, vol. 4, no. 4, pp. 175-187, Dec. 1991.
[11] F. Cristian and C. Fetzer, “The Timed Asynchronous Distributed System Model,” IEEE Trans. Parallel and Distributed Systems, vol. 10, no. 6, pp. 642-657, June 1999.
[12] D.D. Deavours, G. Clark, T. Courtney, D. Daly, S. Derisavi, J.M. Doyle, and W.H. Sanders, “The Möbius Framework and Its Implementation,” IEEE Trans. Software Eng., vol. 28, no. 10, pp. 956-969, Oct. 2002.
[13] L. Gong, P. Lincoln, and J. Rushby, “Byzantine Agreement with Authentication: Observations and Applications in Tolerating Hybrid and Link Faults,” Proc. Fifth Conf. Dependable Computing for Critical Applications (DCCA '95), pp. 139-157, 1995.
[14] “Road Vehicles—Electrical Disturbances from Conduction and Coupling,” ISO 7637, Int'l Organization for Standardization, 1997.
[15] R. Iyer, L.T. Young, and P.V.K. Iyer, “Automatic Recognition of Intermittent Failures: An Experimental Study of Field Data,” IEEE Trans. Computers, vol. 39, no. 4, pp. 525-537, Apr. 1990.
[16] H. Kopetz and G. Grunsteidl, “TTP—A Protocol for Fault-Tolerant Real-Time Systems,” Computer, vol. 27, no. 1, pp. 14-23, Jan. 1994.
[17] J. Kuhl and S. Reddy, “Fault Diagnosis in Fully Distributed Systems,” Proc. 11th Ann. Int'l Symp. Fault-Tolerant Computing (FTCS '81), pp. 100-105, 1981.
[18] J. Lala and L. Alger, “Hardware and Software Fault Tolerance: A Unified Architectural Approach,” Proc. 18th Ann. Int'l Symp. Fault-Tolerant Computing (FTCS '88), pp. 240-245, 1988.
[19] J.-C. Laprie, “Dependable Computing and Fault Tolerance: Concepts and Terminology,” Proc. 25th Ann. Int'l Symp. Fault-Tolerant Computing (FTCS '95), pp. 2-11, 1995.
[20] E. Latronico and P. Koopman, “Design Time Reliability Analysis of Distributed Fault Tolerance Algorithms,” Proc. IEEE Int'l Conf. Dependable Systems and Networks (DSN), pp. 486-495, 2005.
[21] T. Lin and D. Siewiorek, “Error Log Analysis: Statistical Modeling and Heuristic Trend Analysis,” IEEE Trans. Reliability, vol. 39, no. 4, pp. 419-432, Oct. 1990.
[22] P. Lincoln and J. Rushby, “A Formally Verified Algorithm for Interactive Consistency under a Hybrid Fault Model,” Proc. 23rd Ann. Int'l Symp. Fault-Tolerant Computing (FTCS '93), pp. 402-411, 1993.
[23] S. Mallela and G. Masson, “Diagnosis without Repair for Hybrid Fault Situations,” IEEE Trans. Computers, vol. 29, no. 6, pp. 461-470, June 1980.
[24] M. Malek, “A Comparison Connection Assignment for Diagnosis of Multiprocessor Systems,” Proc. Seventh Ann. Symp. Computer Architecture, pp. 31-36, 1980.
[25] D. Powell, J. Arlat, L. Beus-Dukic, A. Bondavalli, P. Coppola, A. Fantechi, E. Jenn, C. Rabéjac, and A. Wellings, “GUARDS: A Generic Upgradable Architecture for Real-Time Dependable Systems,” IEEE Trans. Parallel and Distributed Systems, vol. 10, no. 6, pp. 580-599, June 1999.
[26] F.P. Preparata, G. Metze, and R.T. Chien, “On the Connection Assignment Problem of Diagnosable Systems,” IEEE Trans. Electronic Computers, vol. 16, no. 12, pp. 848-854, Dec. 1967.
[27] U. Schmid, “How to Model Link Failures: A Perception-Based Fault Model,” Proc. Int'l Conf. Dependable Systems and Networks (DSN '01), pp. 57-66, 2001.
[28] A. Sengupta and A. Dahbura, “On Self-Diagnosable Multiprocessor Systems: Diagnosis by the Comparison Approach,” IEEE Trans. Computers, vol. 41, no. 11, pp. 1386-1396, Nov. 1992.
[29] M. Serafini, N. Suri, J. Vinter, A. Ademaj, W. Brandstätter, F. Tagliabò, and J. Koch, “A Tunable Add-On Diagnostic Protocol for Time-Triggered Systems,” Proc. Int'l Conf. Dependable Systems and Networks (DSN '07), pp. 164-174, 2007.
[30] K. Shin and P. Ramanathan, “Diagnosis of Processors with Byzantine Faults in a Distributed Computing System,” Proc. 17th Ann. Int'l Symp. Fault-Tolerant Computing (FTCS '87), pp. 55-60, 1987.
[31] D.P. Siewiorek and R.R. Swarz, Reliable Computer Systems: Design and Evaluation. AK Peters, 1998.
[32] C. Walter, M.M. Hugue, and N. Suri, “Continual On-Line Diagnosis of Hybrid Faults,” Proc. Fourth Conf. Dependable Computing for Critical Applications (DCCA '94), pp. 150-166, 1994.
[33] C. Walter, P. Lincoln, and N. Suri, “Formally Verified On-Line Diagnosis,” IEEE Trans. Software Eng., vol. 23, no. 11, pp. 684-721, Nov. 1997.

