loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
22nd International Conference on Data Engineering (ICDE'06)
SketchTree: Approximate Tree Pattern Counts over Streaming Labeled Trees
Atlanta, Georgia
April 03-April 07
ISBN: 0-7695-2570-9
Praveen Rao, University of Arizona
Bongki Moon, University of Arizona
In recent years, there has been a rising interest in developing online approximation algorithms for data streams. Some of the key challenges are posed by the fact that streaming data can be read only once in a fixed order of arrival and only a limited amount of memory is available for storage. In this paper, we address the problem of approximately counting tree patterns over a stream of labeled trees (e.g., XML documents). We propose a new approximation algorithm called SketchTree that computes a synopsis of the stream in a single pass by processing each tree only once. Using a limited amount of memory, SketchTree provides approximate answers for both ordered and unordered tree pattern counts. Furthermore, we discuss a class of count queries that can be handled by SketchTree and their utility. We provide theoretical analyses to show that our algorithm has provably strong guarantees on the error bounds. Experiments on real datasets demonstrate that SketchTree can indeed estimate tree pattern counts within 10-15% relative error with high confidence under various situations.
Citation:
Praveen Rao, Bongki Moon, "SketchTree: Approximate Tree Pattern Counts over Streaming Labeled Trees," icde, pp.80, 22nd International Conference on Data Engineering (ICDE'06), 2006
Usage of this product signifies your acceptance of the Terms of Use.