Prospective Infectious Disease Outbreak Detection Using Markov Switching Models
April 2010 (vol. 22 no. 4)
pp. 565-577
Hsin-Min Lu, University of Arizona, Tucson
Daniel Zeng, University of Arizona, Tucson and the Chinese Academy of Sciences
Hsinchun Chen, University of Arizona, Tucson
Accurate and timely detection of infectious disease outbreaks gives public health officials the information they need to respond quickly to major public health threats. However, disease outbreaks are often not directly observable. In surveillance systems used to detect outbreaks, noise caused by routine behavioral patterns and by special events further complicates the detection task. Most existing detection methods apply a time series filtering procedure followed by a statistical surveillance method. The performance of this "two-step" detection method is hampered by the unrealistic assumption that the training data are outbreak-free. Moreover, existing approaches are sensitive to extreme values, which are common in real-world data sets. We considered the problem of identifying outbreak patterns in a syndrome count time series using Markov switching models. The disease outbreak states are modeled as hidden state variables that control the observed time series. A jump component is introduced to absorb sporadic extreme values that may otherwise weaken the ability to detect slow-moving disease outbreaks. Our approach outperformed several state-of-the-art detection methods in terms of detection sensitivity on both simulated and real-world data.
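
To make the hidden-state idea concrete, the following is a minimal sketch of a two-state forward filter over a count series. It is not the authors' model: it assumes Gaussian observations with fixed, hand-picked means and standard deviations, uses a simple recursive filter rather than Gibbs sampling, and omits the jump component entirely. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def filter_outbreak_probability(counts, means, stds, stay_prob, prior_outbreak=0.05):
    """Return P(outbreak state at time t | counts[0..t]) for every t.

    State 0 is "normal", state 1 is "outbreak" (illustrative labels).
    means[i], stds[i]: assumed observation parameters for state i.
    stay_prob[i]: assumed probability of remaining in state i for one step.
    """
    p_out = prior_outbreak
    filtered = []
    for y in counts:
        # Prediction step: push yesterday's belief through the transition matrix.
        pred_out = p_out * stay_prob[1] + (1.0 - p_out) * (1.0 - stay_prob[0])
        pred_norm = 1.0 - pred_out
        # Update step: weight each state by how well it explains today's count.
        w_norm = gaussian_pdf(y, means[0], stds[0]) * pred_norm
        w_out = gaussian_pdf(y, means[1], stds[1]) * pred_out
        total = w_norm + w_out
        # If both likelihoods underflow to zero, fall back to the prediction.
        p_out = w_out / total if total > 0.0 else pred_out
        filtered.append(p_out)
    return filtered


if __name__ == "__main__":
    daily_counts = [10, 11, 9, 10, 18, 19, 20, 21]  # made-up syndrome counts
    probs = filter_outbreak_probability(
        daily_counts, means=(10.0, 20.0), stds=(2.0, 2.0), stay_prob=(0.95, 0.90)
    )
    for day, (count, p) in enumerate(zip(daily_counts, probs)):
        print(f"day {day}: count={count:3d}  P(outbreak)={p:.3f}")
```

In this simplified setting a single extreme count can swing the filtered probability sharply toward the outbreak state, which is the kind of sensitivity the abstract says the jump component is meant to absorb.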

Index Terms:
Markov switching models, syndromic surveillance, Gibbs sampling, outbreak detection.
Hsin-Min Lu, Daniel Zeng, Hsinchun Chen, "Prospective Infectious Disease Outbreak Detection Using Markov Switching Models," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, pp. 565-577, April 2010, doi:10.1109/TKDE.2009.115