40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039) (1999)

New York, New York

Oct. 17, 1999 to Oct. 18, 1999

ISSN: 0272-5428

ISBN: 0-7695-0409-4

pp: 501

J. Feigenbaum, AT&T Labs--Research

S. Kannan, AT&T Labs--Research

M. Strauss, AT&T Labs--Research

M. Viswanathan, University of Pennsylvania

ABSTRACT

We give a space-efficient, one-pass algorithm for approximating the L1 difference $\sum_i |a_i - b_i|$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an adversary. Our main technical innovation is a method of constructing families $\{V_j\}$ of limited-independence random variables that are *range-summable*, by which we mean that the range sum $\sum_{j=0}^{c-1} V_j(s)$ is computable in time $\mathrm{polylog}(c)$, for all seeds $s$. These random-variable families may be of interest outside our current application domain, i.e., massive data streams generated by communication networks. Our L1-difference algorithm can be viewed as a "sketching" algorithm, in the sense of [Broder, Charikar, Frieze, and Mitzenmacher, STOC '98, pp. 327-336], and our algorithm performs better than that of Broder et al. when used to approximate the symmetric difference of two sets with small symmetric difference.
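The sketching idea in the abstract can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the paper's construction: the ±1 values come from a SHA-256 hash rather than a 4-wise independent family, the range sum is computed naively in time linear in c rather than polylog(c), and each index is assumed to appear once per stream. The names `sign`, `range_sum`, `sketch`, and `estimate_l1` are hypothetical.

```python
import hashlib
from statistics import mean, median


def sign(seed: int, i: int, j: int) -> int:
    """Hypothetical stand-in for a +/-1 random variable V_{i,j}(seed).

    Assumption: a hash bit replaces the limited-independence family.
    """
    digest = hashlib.sha256(f"{seed}:{i}:{j}".encode()).digest()
    return 1 if digest[0] & 1 else -1


def range_sum(seed: int, i: int, c: int) -> int:
    """Sum of V_{i,j}(seed) for j = 0..c-1.

    Computed naively in O(c) time here; a range-summable family
    would allow this sum to be evaluated in time polylog(c).
    """
    return sum(sign(seed, i, j) for j in range(c))


def sketch(stream, seeds):
    """One pass over (index, value) pairs in any order; one counter per seed."""
    counters = [0] * len(seeds)
    for i, value in stream:
        for k, seed in enumerate(seeds):
            counters[k] += range_sum(seed, i, value)
    return counters


def estimate_l1(sketch_a, sketch_b, group_size):
    """Median of group means of squared counter differences.

    For each seed, the counter difference is a signed sum of |a_i - b_i|
    distinct +/-1 values per index, so its square has expectation
    sum_i |a_i - b_i| when the values are pairwise independent.
    """
    squares = [(x - y) ** 2 for x, y in zip(sketch_a, sketch_b)]
    groups = [squares[g:g + group_size] for g in range(0, len(squares), group_size)]
    return median(mean(group) for group in groups)


def exact_l1(a, b):
    """Exact L1 difference of two dicts mapping index -> value, for comparison."""
    return sum(abs(a.get(i, 0) - b.get(i, 0)) for i in set(a) | set(b))
```

Because each counter is a plain sum over stream items, the sketch does not depend on the order in which the items arrive, and two streams can be sketched separately and compared afterwards from the counters alone.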

INDEX TERMS

L1-Difference, stream, range-summable random variables, 4-wise independent random variables

CITATION

J. Feigenbaum, S. Kannan, M. Strauss and M. Viswanathan, "An Approximate L1-Difference Algorithm for Massive Data Streams," *40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039) (FOCS)*, New York, New York, 1999, pp. 501.

doi:10.1109/SFFCS.1999.814623
