2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) (2018)
Paris, France
Oct 7, 2018 to Oct 9, 2018
ISSN: 2575-8454
ISBN: 978-1-5386-4230-6
pp: 297-308
ABSTRACT
We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports. For a subset A ⊂ Z+ of non-negative integers, a sum of independent random variables with collective support A (called an "A-sum" in this paper) is a distribution S = X1 + ... + XN where the Xi's are mutually independent (but not necessarily identically distributed) integer random variables all of whose supports are contained in A. We give two main algorithmic results for learning such distributions: 1) For the case |A| = 3, we give an algorithm for learning A-sums to accuracy ε that uses poly(1/ε) samples and runs in time poly(1/ε), independent of N and of the elements of A. 2) For an arbitrary constant k ≥ 4, if A = {a1, ..., ak} with 0 ≤ a1 < ... < ak, we give an algorithm that uses poly(1/ε) · log log ak samples (independent of N) and runs in time poly(1/ε, log ak). We prove an essentially matching lower bound: if |A| = 4, then any algorithm must use Ω(log log a4) samples even for learning to constant accuracy. We also give similar-in-spirit (but quantitatively very different) algorithmic results, and essentially matching lower bounds, for the case in which A is not known to the learner. Our learning algorithms employ new limit theorems which may be of independent interest. Our algorithms and lower bounds together settle the question of how the sample complexity of learning sums of independent integer random variables scales with the elements in the union of their supports, both in the known-support and unknown-support settings. Finally, all our algorithms easily extend to the "semi-agnostic" learning model, in which training data is generated from a distribution that is only c·ε-close to some A-sum for a constant c > 0.
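To make the definition of an A-sum concrete, the following minimal Python sketch draws samples from S = X1 + ... + XN, where each Xi has its own distribution over a small set A. It only illustrates the object being learned and is not one of the paper's algorithms. The function name `sample_a_sum`, the example set A = {0, 3, 7}, the value N = 50, and the randomly chosen per-variable weights are all assumptions made for illustration.

```python
import random


def sample_a_sum(support, weights_per_variable, rng):
    """Draw one sample of S = X1 + ... + XN.

    support: list of non-negative integers (the set A).
    weights_per_variable: one weight list per Xi, aligned with `support`;
        the Xi are independent but need not be identically distributed.
    rng: a random.Random instance, for reproducibility.
    """
    total = 0
    for weights in weights_per_variable:
        # Each Xi takes a value in A according to its own weights.
        total += rng.choices(support, weights=weights, k=1)[0]
    return total


# Hypothetical example: |A| = 3 and N = 50 independent summands.
A = [0, 3, 7]
N = 50
rng = random.Random(0)

# Assumed setup: every Xi gets an arbitrary (strictly positive) weight vector.
weights_per_variable = [[rng.random() + 0.1 for _ in A] for _ in range(N)]

# A learner in the paper's setting would see only draws like these,
# not N or the individual distributions of the Xi.
samples = [sample_a_sum(A, weights_per_variable, rng) for _ in range(1000)]
```

Every draw lies between N·min(A) and N·max(A); the learning problem in the abstract is to output a distribution that is ε-close to S from such draws alone.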
INDEX TERMS
approximation theory, computational complexity, learning (artificial intelligence), mathematics computing, set theory
CITATION

A. De, P. M. Long and R. A. Servedio, "Learning Sums of Independent Random Variables with Sparse Collective Support," 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), Paris, France, 2018, pp. 297-308.
doi:10.1109/FOCS.2018.00036