June 25, 2007 to June 27, 2007
Hua Chen, Tsinghua University
Jian-guang Lou, Microsoft Research Asia, China
Jiang Li, Microsoft Research Asia, China
Learning the underlying model from distributed data is useful in many distributed systems. In this paper, we study the problem of learning a non-parametric model from distributed observations. We propose a gossip-based distributed kernel density estimation algorithm and analyze the convergence and consistency of the estimation process. Furthermore, we extend our algorithm to distributed systems under communication and storage constraints by introducing a fast and efficient data reduction algorithm. Experiments show that our algorithm can estimate the underlying density distribution accurately and robustly with only a small communication and storage overhead.
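To make the idea of gossip-based kernel density estimation concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes 1-D data, a Gaussian kernel with a fixed bandwidth, and a simple push-pull exchange in which two nodes merge their full kernel sets; it omits the data reduction step that the abstract mentions for constrained systems. All names (`gaussian_kde`, `gossip_round`, `run_gossip`) are hypothetical.

```python
import math
import random


def gaussian_kde(samples, x, bandwidth):
    """Evaluate a 1-D Gaussian kernel density estimate at point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )


def gossip_round(node_kernels, rng):
    """One push-pull round: each node merges kernel sets with a random peer.

    Assumption: peers exchange their complete kernel sets. A real system
    under communication limits would send a reduced summary instead.
    """
    n = len(node_kernels)
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])
        merged = node_kernels[i] | node_kernels[j]
        node_kernels[i] = merged
        node_kernels[j] = set(merged)


def run_gossip(local_samples, rounds, seed=0):
    """Run a fixed number of gossip rounds; return each node's sample list."""
    rng = random.Random(seed)
    # Tag each sample with (node id, index) so equal values observed at
    # different nodes remain distinct kernels after merging.
    node_kernels = [
        {(i, k, s) for k, s in enumerate(samples)}
        for i, samples in enumerate(local_samples)
    ]
    for _ in range(rounds):
        gossip_round(node_kernels, rng)
    return [[s for (_, _, s) in sorted(kernels)] for kernels in node_kernels]


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Four nodes, each observing five samples from a different region.
    local = [[data_rng.gauss(mu, 1.0) for _ in range(5)] for mu in (-2, 0, 1, 3)]
    views = run_gossip(local, rounds=50)
    for node_id, samples in enumerate(views):
        print(node_id, len(samples), gaussian_kde(samples, 0.0, bandwidth=0.5))
```

In this simplified setting, once every node has received every kernel, each node's local estimate coincides with the estimate computed from the pooled data. The storage cost grows with the total number of samples, which is the overhead a data reduction step would be meant to limit.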
Kernel Density Estimation, Non-parametric Statistics, Distributed Estimation, Data Reduction, Gossip
Hua Chen, Jian-guang Lou, Jiang Li, "Distributed Density Estimation Using Non-parametric Statistics", 27th International Conference on Distributed Computing Systems (ICDCS '07), 2007, pp. 28, doi:10.1109/ICDCS.2007.100