Gunjan Khanna, Padma Varadharajan, Saurabh Bagchi, "Automated Online Monitoring of Distributed Applications through External Monitors," IEEE Transactions on Dependable and Secure Computing, vol. 3, no. 2, pp. 115-129, April-June, 2006.
BibTeX
@article{10.1109/TDSC.2006.17,
  author = {Gunjan Khanna and Padma Varadharajan and Saurabh Bagchi},
  title = {Automated Online Monitoring of Distributed Applications through External Monitors},
  journal = {IEEE Transactions on Dependable and Secure Computing},
  volume = {3},
  number = {2},
  issn = {1545-5971},
  year = {2006},
  pages = {115-129},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TDSC.2006.17},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote
TY - JOUR
JO - IEEE Transactions on Dependable and Secure Computing
TI - Automated Online Monitoring of Distributed Applications through External Monitors
IS - 2
SN - 1545-5971
SP - 115
EP - 129
EPD - 115-129
A1 - Gunjan Khanna
A1 - Padma Varadharajan
A1 - Saurabh Bagchi
PY - 2006
KW - Error detection
KW - blackbox detection
KW - monitor system
KW - temporal and combinatorial rules
KW - reliable multicast
VL - 3
JA - IEEE Transactions on Dependable and Secure Computing
ER -
[1] A.S. Danthine, “Protocol Representation with Finite State Models,” IEEE Trans. Comm., vol. 28, no. 4, pp. 632-643, Apr. 1980.
[2] L. Lamport, “The Temporal Logic of Actions,” ACM Trans. Programming Languages and Systems, vol. 16, no. 3, pp. 872-923, 1994.
[3] Z. Liu and M. Joseph, “Specification and Verification of Fault-Tolerance, Timing, and Scheduling,” ACM Trans. Programming Languages and Systems, vol. 21, no. 1, pp. 46-89, 1999.
[4] B. Berthomieu and M. Diaz, “Modeling and Verification of Time Dependent Systems using Time Petri Nets,” IEEE Trans. Software Eng., vol. 17, no. 3, pp. 259-273, Mar. 1991.
[5] W. Peng, “Deadlock Detection in Communicating Finite State Machines by Even Reachability Analysis,” Proc. IEEE Conf. Computer Comm. and Networks (ICCCN), pp. 656-662, Sept. 1995.
[6] A. Agarwal and J.W. Atwood, “A Unified Approach to Fault-Tolerance in Communication Protocols Based on Recovery Procedures,” IEEE/ACM Trans. Networking, vol. 4, no. 5, pp. 785-795, Oct. 1996.
[7] L.-B. Chen and I-C. Wu, “Detection of Summative Global Predicates,” Proc. IEEE Conf. Parallel and Distributed Systems (ICPADS '97), pp. 466-473, Dec. 1997.
[8] M. Zulkernine and R.E. Seviora, “A Compositional Approach to Monitoring Distributed Systems,” Proc. IEEE Int'l Conf. Dependable Systems and Networks (DSN '02), pp. 763-772, June 2002.
[9] C. Wang and M. Schwartz, “Fault Detection with Multiple Observers,” IEEE/ACM Trans. Networking, vol. 1, no. 1, pp. 48-55, Feb. 1993.
[10] G. Khanna, J.S. Rogers, and S. Bagchi, “Failure Handling in a Reliable Multicast Protocol for Improving Buffer Utilization and Accommodating Heterogeneous Receivers,” Proc. IEEE Pacific Rim Dependable Computing Conf. (PRDC '04), pp. 15-24, Mar. 2004.
[11] D.M. Chiu, S. Hurst, M. Kadansky, and J. Wesley, “TRAM: A Tree-Based Reliable Multicast Protocol,” Sun Technical Report TR 98-66, July 1998.
[12] D.M. Chiu, M. Kadansky, J. Provino, J. Wesley, H. Bischof, and H. Zhu, “A Congestion Control Algorithm for Tree-Based Reliable Multicast Protocols,” Proc. INFOCOM '02, pp. 1209-1217, 2002.
[13] W. Chen, S. Toueg, and M.K. Aguilera, “On the Quality of Service of Failure Detectors,” Proc. IEEE Int'l Conf. Dependable Systems and Networks (DSN '00), pp. 191-201, June 2000.
[14] R. Baldoni, J.-M. Helary, and M. Raynal, “From Crash Fault-Tolerance to Arbitrary-Fault Tolerance: Towards a Modular Approach,” Proc. IEEE Int'l Conf. Dependable Systems and Networks (DSN '00), pp. 273-282, June 2000.
[15] M. Diaz, G. Juanole, and J.-P. Courtiat, “Observer— A Concept for Formal On-Line Validation of Distributed Systems,” IEEE Trans. Software Eng., vol. 20, no. 12, pp. 900-913, Dec. 1994.
[16] S. Krishna, T. Diamond, and V.S.S. Nair, “Hierarchical Object Oriented Approach to Fault Tolerance in Distributed Systems,” Proc. IEEE Int'l Symp. Software Reliability Eng. (ISSRE '93), pp. 168-177, Nov. 1993.
[17] G. Khanna, P. Varadharajan, and S. Bagchi, “Self Checking Network Protocols: A Monitor Based Approach,” Proc. 23rd IEEE Symp. Reliable Distributed Systems (SRDS '04), pp. 18-30, Oct. 2004.
[18] R. Alur, R.K. Brayton, T.A. Henzinger, S. Qadeer, and S.K. Rajamani, “Partial-Order Reduction in Symbolic State-Space Exploration,” J. Formal Methods in System Design, 2001.
[19] K.L. McMillan, Symbolic Model Checking: An Approach to the State-Explosion Problem. Dordrecht: Kluwer Academic Publishers, 1993.
[20] A.W. Mazurkiewicz, “Basic Notions of Trace Theory,” Linear Time, Branching Time, and Partial Order in Logics and Models for Concurrency, J.W. de Bakker, W.-P. de Roever, and G. Rozenberg, eds., pp. 285-363, 1989.
[21] R.V. Renesse, K.P. Birman, and W. Vogels, “Astrolabe: A Robust and Scalable Technology for Distributed System Monitoring, Management, and Data Mining,” ACM Trans. Computer Systems, vol. 21, no. 2, pp. 164-206, May 2003.
[22] M.L. Massie, B.N. Chun, and D.E. Culler, “The Ganglia Distributed Monitoring System: Design, Implementation, and Experience,” Parallel Computing, vol. 30, no. 7, July 2004.
[23] K. Bhargavan, S. Chandra, P.J. McCann, and C.A. Gunter, “What Packets May Come: Automata for Network Monitoring,” ACM SIGPLAN Notices, vol. 36, no. 3, pp. 206-219, 2001.
[24] I. Lee, S. Kannan, M. Kim, O. Sokolsky, and M. Viswanathan, “Runtime Assurance Based on Formal Specifications,” Proc. Int'l Conf. Parallel and Distributed Processing Techniques and Applications, 1999.
[25] V. Paxson, “Automated Packet Trace Analysis of TCP Implementations,” Computer Comm. Rev., vol. 27, no. 4, Oct. 1997.
[26] M.K. Aguilera, J.C. Mogul, J.L. Wiener, P. Reynolds, and A. Muthitacharoen, “Performance Debugging for Distributed Systems of Black Boxes,” Proc. ACM Symp. Operating Systems Principles (SOSP), 2003.
[27] M.Y. Chen, E. Kiciman, E. Fratkin, A. Fox, and E. Brewer, “Pinpoint: Problem Determination in Large, Dynamic Internet Services,” Proc. 2002 Int'l Conf. Dependable Systems and Networks (DSN), pp. 595-604, 2002.
[28] SNMP Research International Inc., “Simple Network Management Protocol,” http://www.snmp.com/protocol/, 2006.
[29] Quest Software, “Big Brother System and Network Monitor,” http://www.bb4.org/, 2006.
[30] P. Mason, “Turning IT Overhead into Business Value by Improving Infrastructure Management,” IDC White Paper, May 2002.
[31] Hewlett Packard, “HP OpenView Management Solutions for Your Adaptive Enterprise,” www.openview.hp.com, 2006.
[32] E. Skoudis, Counter Hack, chapter 2. Prentice-Hall Inc., 2002.
[33] M.Y. Chen, E. Kiciman, E. Fratkin, A. Fox, and E. Brewer, “Pinpoint: Problem Determination in Large, Dynamic Internet Services,” Proc. 2002 Int'l Conf. Dependable Systems and Networks (DSN), pp. 595-604, 2002.
[34] M.K. Aguilera, J.C. Mogul, J.L. Wiener, P. Reynolds, and A. Muthitacharoen, “Performance Debugging for Distributed Systems of Black Boxes,” Proc. 19th ACM Symp. Operating Systems Principles (SOSP), 2003.
[35] I. Rouvellou and G.W. Hart, “Automatic Alarm Correlation for Fault Identification,” Proc. Infocom, pp. 553-561, 1995.
[36] I. Katzela and M. Schwartz, “Schemes for Fault Identification in Communication Networks,” IEEE/ACM Trans. Networking, vol. 3, no. 6, pp. 753-764, Dec. 1995.
[37] S. Bagchi, Y. Liu, K. Whisnant, Z. Kalbarczyk, R.K. Iyer, Y. Levendel, and L.G. Votta, “A Framework for Database Audit and Control Flow Checking for a Wireless Telephone Network Controller,” Proc. IEEE Int'l Conf. Dependable Systems and Networks (DSN '01), pp. 225-234, 2001.
[38] G. Khanna, P. Varadharajan, M. Cheng, and S. Bagchi, “Automated Monitor Based Diagnosis in Distributed Systems,” Purdue ECE Technical Report 05-13, Aug. 2005, also submitted to IEEE Trans. Dependable and Secure Computing.
[39] G. Khanna, M.Y. Cheng, J. Dyaberi, S. Bagchi, M.P. Correia, and P. Vérissimo, “Probabilistic Diagnosis through NonIntrusive Monitoring in Distributed Applications,” Purdue ECE Technical Report 05-19, Nov. 2005.

