19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)
Adaptive-Size Reservoir Sampling over Data Streams
Banff, Alberta, Canada
July 09-July 11
ISBN: 0-7695-2868-6
DOI Bookmark:
http://doi.ieeecomputersociety.org/10.1109/SSDBM.2007.29
Reservoir sampling is a well-known technique for sequential random sampling over data streams. Conventional reservoir sampling assumes a fixed-size reservoir. There are situations, however, in which it is necessary and/or advantageous to adaptively adjust the size of a reservoir in the middle of sampling due to changes in data characteristics and/or application behavior. This paper studies adaptivesize reservoir sampling over data streams considering two main factors: reservoir size and sample uniformity. First, the paper conducts a theoretical study on the effects of adjusting the size of a reservoir while sampling is in progress. The theoretical results show that such an adjustment may bring a negative impact on the probability of the sample being uniform (called uniformity confidence herein). Second, the paper presents a novel algorithm for maintaining the reservoir sample after the reservoir size is adjusted such that the resulting uniformity confidence exceeds a given threshold. Third, the paper extends the proposed algorithm to an adaptive multi-reservoir sampling algorithm for a practical application in which samples are collected from memory-limited wireless sensor networks using a mobile sink. Finally, the paper empirically examines the adaptivity of the multi-reservoir sampling algorithm with regard to reservoir size and sample uniformity using real sensor networks data sets.
Citation:
Mohammed Al-Kateb, Byung Suk Lee, X. Sean Wang, "Adaptive-Size Reservoir Sampling over Data Streams," ssdbm, pp.22, 19th International Conference on Scientific and Statistical Database Management (SSDBM 2007), 2007
Usage of this product signifies your acceptance of the
Terms of Use.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||