ISSN: 1041-4347
Antonio Picariello, Università di Napoli, Napoli
Fabio Persia, Università di Napoli, Napoli
Cristian Molinaro, University of Maryland, College Park
Massimiliano Albanese, George Mason University, Fairfax
V. S. Subrahmanian, University of Maryland, College Park
ABSTRACT
Many applications require discovering unexpected activities in a sequence of time-stamped observation data; for instance, we may want to detect inexplicable events in transactions at a web site or in video surveillance of an airport tarmac. In this paper, we start with a known set A of activities (both innocuous and dangerous) that we wish to monitor. In addition, we wish to identify "unexplained" subsequences of a sequence of observations, i.e., subsequences that are poorly explained by A (e.g., because they may contain occurrences of activities that have never been seen or anticipated before and are therefore not in A). We formally define the probability that a sequence of observations is totally or partially unexplained w.r.t. A. We develop efficient algorithms to identify the top-k Totally and Partially Unexplained Sequences w.r.t. A. These algorithms leverage a set of theorems that enable us to speed up the search for totally/partially unexplained sequences. We describe experiments using real-world datasets in the video and cyber security domains showing that our approach works well in practice in terms of both running time and accuracy.
INDEX TERMS
Knowledge base management, Computing Methodologies, Artificial Intelligence, Knowledge Representation Formalisms and Methods
CITATION
Antonio Picariello, Fabio Persia, Cristian Molinaro, Massimiliano Albanese, V. S. Subrahmanian, "Discovering the Top-k 'Unexplained' Sequences in Time-Stamped Observation Data", IEEE Transactions on Knowledge & Data Engineering, vol. , no. , pp. 0, 5555, doi:10.1109/TKDE.2013.33