Proceedings 17th IEEE Annual Conference on Computational Complexity (2002)

Montreal, Canada

May 21, 2002 to May 24, 2002

ISBN: 0-7695-1468-5

pp: 0017

Tugkan Batu, University of Pennsylvania

Sanjoy Dasgupta, AT&T Labs-Research

Ravi Kumar, IBM Almaden Research Center

Ronitt Rubinfeld, NEC Research Institute

ABSTRACT

We consider the problem of approximating the entropy of a discrete distribution under several models. If the distribution is given explicitly as an array where the $i$-th location is the probability of the $i$-th element, then linear time is both necessary and sufficient for approximating the entropy.

We consider a model in which the algorithm is given access only to independent samples from the distribution. Here, we show that a $\gamma$-multiplicative approximation to the entropy can be obtained in $O\left(n^{(1+\eta)/\gamma^2} \, \mathrm{poly}(\log n)\right)$ time for distributions with entropy $\Omega(\gamma/\eta)$, where $n$ is the size of the domain of the distribution and $\eta$ is an arbitrarily small positive constant. We show that one cannot get a multiplicative approximation to the entropy in general in this model. Even for the class of distributions to which our upper bound applies, we obtain a lower bound of $\Omega\left(n^{\max(1/(2\gamma^2),\, 2/(5\gamma^2-2))}\right)$.

We next consider a hybrid model in which both the explicit distribution as well as independent samples are available. Here, significantly more efficient algorithms can be achieved: a $\gamma$-multiplicative approximation to the entropy can be obtained in $O\left(\frac{\gamma^2 \log^2 n}{h^2 (\gamma-1)^2}\right)$ time for distributions with entropy $\Omega(h)$; we show a lower bound of $\Omega\left(\frac{\log n}{h(\gamma^2-1)}\right)$.

Finally, we consider two special families of distributions: those for which the probability of an element decreases monotonically in the label of the element, and those that are uniform over a subset of the domain. In each case, we give more efficient algorithms for approximating the entropy.
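As a concrete illustration of the first two models, the sketch below computes the entropy of an explicitly given distribution in one linear pass and, for contrast, a naive plug-in estimate from independent samples. It is not the sublinear algorithm of the paper; the function names are hypothetical, and the check `is_gamma_approx` assumes that a $\gamma$-multiplicative approximation means an estimate lying in $[H/\gamma, \gamma H]$, which the abstract does not spell out.

```python
import math
from collections import Counter


def entropy_explicit(p):
    """Shannon entropy in bits of a distribution given as an array.

    One pass over the array, i.e. linear time in the domain size n.
    Zero-probability entries contribute nothing to the sum.
    """
    return -sum(q * math.log2(q) for q in p if q > 0)


def entropy_plugin(samples):
    """Naive plug-in estimate: entropy of the empirical distribution.

    Illustrative baseline only; this is NOT the paper's estimator.
    """
    counts = Counter(samples)
    m = len(samples)
    return -sum((c / m) * math.log2(c / m) for c in counts.values())


def is_gamma_approx(estimate, true_entropy, gamma):
    """Assumed definition: estimate lies in [H / gamma, gamma * H]."""
    return true_entropy / gamma <= estimate <= gamma * true_entropy


if __name__ == "__main__":
    uniform8 = [1 / 8] * 8
    h = entropy_explicit(uniform8)          # uniform over 8 elements
    est = entropy_plugin([0, 1, 2, 3, 0, 1, 2, 3])
    print(h, est, is_gamma_approx(est, h, gamma=2.0))
```

With few samples the plug-in estimate tends to underestimate the entropy, since elements that were never sampled are missing from the empirical distribution; the sample-complexity bounds in the abstract quantify how many samples a multiplicative guarantee requires.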

INDEX TERMS

entropy, entropy approximation, black-box distribution, sample complexity, monotone distribution

CITATION

S. Dasgupta, R. Kumar, R. Rubinfeld and T. Batu, "The Complexity of Approximating the Entropy,"

*Proceedings 17th IEEE Annual Conference on Computational Complexity (CCC)*, Montreal, Canada, 2002, pp. 0017.

doi:10.1109/CCC.2002.1004329
