IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 9, Sept. 2013

pp: 2078-2090

G. Bellala, Hewlett Packard Labs., Palo Alto, CA, USA

J. Stanley, Citadel Investment Group, Chicago, IL, USA

S. K. Bhavnani, Inst. for Translational Sci., Univ. of Texas Med. Branch, Galveston, TX, USA

C. Scott, Dept. of Electr. Eng. & Comput. Sci., Univ. of Michigan, Ann Arbor, MI, USA

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2013.30

ABSTRACT

The problem of active diagnosis arises in several applications, such as disease diagnosis and fault diagnosis in computer networks, where the goal is to rapidly identify the binary states of a set of objects (e.g., faulty or working) by sequentially selecting binary-valued queries and observing their potentially noisy responses. Previous work in this area chooses queries sequentially based on information gain and infers the object states by maximum a posteriori (MAP) estimation. In this work, rather than performing MAP estimation, we aim to rank objects according to their posterior fault probability. We propose a greedy algorithm that chooses queries sequentially by maximizing the area under the ROC curve (AUC) associated with the ranked list. The proposed algorithm overcomes two limitations of existing work. When multiple faults may be present, it does not rely on belief propagation, making it feasible for large-scale networks with little loss in performance. When a single fault is present, it can be implemented without knowledge of the underlying query noise distribution, making it robust to any misspecification of the noise parameters. We demonstrate the performance of the proposed algorithm through experiments on computer networks, a toxic chemical database, and synthetic datasets.
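
To make the ranking criterion in the abstract concrete, the sketch below computes the area under the ROC curve of a ranked list as the fraction of (faulty, working) pairs in which the faulty object is ranked higher. It is only an illustration of the quality measure, not the query selection algorithm of the paper; the function name `ranking_auc` and the example probabilities and states are hypothetical.

```python
from itertools import product


def ranking_auc(scores, is_faulty):
    """AUC of a ranked list: fraction of (faulty, working) pairs in which
    the faulty object has the higher score; ties count as half a win."""
    faulty = [s for s, f in zip(scores, is_faulty) if f]
    working = [s for s, f in zip(scores, is_faulty) if not f]
    if not faulty or not working:
        raise ValueError("AUC needs at least one faulty and one working object")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(faulty, working)
    )
    return wins / (len(faulty) * len(working))


# Hypothetical posterior fault probabilities and true object states.
posterior = [0.9, 0.8, 0.3, 0.1]
truth = [True, False, True, False]
print(ranking_auc(posterior, truth))  # 0.75
```

In this example three of the four (faulty, working) pairs are ordered correctly, so the AUC is 0.75; a ranking that places every faulty object above every working one scores 1.0.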

INDEX TERMS

Noise, Approximation methods, Diseases, Entropy, Noise measurement, Fault diagnosis, Computer networks, Area under the ROC curve, Active diagnosis, Active learning, Bayesian network, Persistent noise

CITATION

G. Bellala, J. Stanley, S. K. Bhavnani, C. Scott, "A Rank-Based Approach to Active Diagnosis", *IEEE Transactions on Pattern Analysis & Machine Intelligence*, vol. 35, no. 9, pp. 2078-2090, Sept. 2013, doi:10.1109/TPAMI.2013.30