40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039) (1999)
New York, New York
Oct. 17, 1999 to Oct. 18, 1999
ISSN: 0272-5428
ISBN: 0-7695-0409-4
pp: 501
J. Feigenbaum, AT&T Labs--Research
S. Kannan, AT&T Labs--Research
M. Strauss, AT&T Labs--Research
M. Viswanathan, University of Pennsylvania
ABSTRACT
We give a space-efficient, one-pass algorithm for approximating the L1 difference $\sum_i |a_i - b_i|$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an adversary. Our main technical innovation is a method of constructing families $\{V_j\}$ of limited-independence random variables that are range-summable, by which we mean that the sum $\sum_{j=0}^{c-1} V_j(s)$ is computable in time polylog(c), for all seeds s. These random-variable families may be of interest outside our current application domain, i.e., massive data streams generated by communication networks. Our L1-difference algorithm can be viewed as a "sketching" algorithm, in the sense of [Broder, Charikar, Frieze, and Mitzenmacher, STOC '98, pp. 327-336], and our algorithm performs better than that of Broder et al. when used to approximate the symmetric difference of two sets with small symmetric difference.
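As a rough illustration of why sums of ±1 random variables are useful here, the following minimal Python sketch keeps one counter per seed and estimates the L1 difference from squared counter differences. It is not the construction from the paper: the function names (`sign`, `range_sum`, `sketch`, `estimate_l1`) are hypothetical, a seeded hash stands in for a 4-wise independent family, and `range_sum` adds its terms one by one in time linear in c rather than polylog(c). Fast range summation is exactly what the paper's construction supplies.

```python
import hashlib
import statistics


def sign(seed, i, j):
    """Pseudo-random +1/-1 for position (i, j).

    Assumption: a seeded hash is used as a stand-in; it is not a
    4-wise independent family.
    """
    digest = hashlib.blake2b(f"{seed}:{i}:{j}".encode(), digest_size=1).digest()
    return 1 if digest[0] & 1 else -1


def range_sum(seed, i, c):
    """Sum of the first c signs for index i, computed naively in O(c) time."""
    return sum(sign(seed, i, j) for j in range(c))


def sketch(stream, seeds):
    """One pass over (index, value) pairs in any order; one counter per seed."""
    counters = [0] * len(seeds)
    for i, value in stream:
        for k, seed in enumerate(seeds):
            counters[k] += range_sum(seed, i, value)
    return counters


def estimate_l1(sketch_a, sketch_b, groups=5):
    """Median of group means of squared counter differences.

    For index i, the signs below min(a_i, b_i) cancel, so each counter
    difference is a sum of exactly |a_i - b_i| signs per index. With
    pairwise-uncorrelated signs its square has expectation sum_i |a_i - b_i|.
    Needs at least `groups` counters.
    """
    squares = [(x - y) ** 2 for x, y in zip(sketch_a, sketch_b)]
    size = len(squares) // groups
    means = [statistics.mean(squares[g * size:(g + 1) * size]) for g in range(groups)]
    return statistics.median(means)


if __name__ == "__main__":
    seeds = list(range(50))
    a = [(0, 5), (1, 2), (2, 7)]
    b = [(2, 4), (0, 3), (1, 2)]  # same indices, adversarial order
    true_l1 = 2 + 0 + 3
    print("estimate:", estimate_l1(sketch(a, seeds), sketch(b, seeds)), "true:", true_l1)
```

Because each counter is a plain sum over stream items, the sketch does not depend on the order in which the items arrive, and the two streams can be sketched independently as long as both sides share the same seeds.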
INDEX TERMS
L1-Difference, stream, range-summable random variables, 4-wise independent random variables
CITATION

M. Viswanathan, S. Kannan, J. Feigenbaum and M. Strauss, "An Approximate L1-Difference Algorithm for Massive Data Streams," 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039)(FOCS), New York, New York, 1999, pp. 501.
doi:10.1109/SFFCS.1999.814623