Issue No. 10 - October (2003 vol. 25)
<p><b>Abstract</b>: Reducing the computational cost of evaluating a point probability density estimate with a Parzen window estimator is a well-known problem. This paper presents the Reduced Set Density Estimator, a kernel-based density estimator that employs a small percentage of the available data sample and is optimal in the L_2 sense. While requiring only O(N^2) optimization routines to estimate the kernel weighting coefficients, the proposed method provides levels of accuracy and sparseness of representation similar to those of Support Vector Machine density estimation, which requires O(N^3) optimization routines and has previously been shown to consistently outperform Gaussian Mixture Models. It is also demonstrated that, for similar levels of data reduction, the proposed estimator consistently provides density estimates superior to those of the recently proposed Density-Based Multiscale Data Condensation algorithm, while having comparable computational scaling. A further advantage is that no extra free parameters, such as regularization terms, bin widths, or condensation ratios, are introduced, making the method a simple and straightforward way to obtain a reduced set density estimator with accuracy comparable to that of the full-sample Parzen density estimator.</p>
Kernel density estimation, Parzen window, data condensation, sparse representation.
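The idea summarized in the abstract can be illustrated with a small one-dimensional sketch. It is not the authors' implementation. It assumes a Gaussian kernel and one common way of writing an L_2 (integrated squared error) criterion as a quadratic program over weights that are non-negative and sum to one. The function names (`parzen`, `reduced_set_weights`, `project_to_simplex`), the bandwidth `h = 0.5`, the synthetic data, and the use of plain projected gradient descent as the solver are all choices made for this illustration, not details taken from the paper.

```python
import math
import random

def gauss(u, var):
    """1-D Gaussian density with variance `var`, evaluated at offset u."""
    return math.exp(-u * u / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def parzen(x, data, h):
    """Full-sample Parzen window estimate: N kernel evaluations per query."""
    return sum(gauss(x - xi, h * h) for xi in data) / len(data)

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        t = (cumulative - 1.0) / k
        if uk - t > 0:
            theta = t
    return [max(vi - theta, 0.0) for vi in v]

def objective(w, C, d):
    """Assumed criterion: 0.5 * w'Cw - w'd (L_2 error up to a constant)."""
    n = len(w)
    quad = sum(w[i] * C[i][j] * w[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(w[i] * d[i] for i in range(n))

def reduced_set_weights(data, h, iters=300):
    """Projected gradient descent on the simplex (illustrative solver)."""
    n = len(data)
    # Integral of the product of two width-h Gaussians: variance 2 * h^2.
    C = [[gauss(a - b, 2.0 * h * h) for b in data] for a in data]
    # Sample-based estimate of the cross term: Parzen values at the data.
    d = [parzen(a, data, h) for a in data]
    # Max row sum bounds the largest eigenvalue of C, giving a safe step.
    step = 1.0 / max(sum(row) for row in C)
    w = [1.0 / n] * n
    for _ in range(iters):
        grad = [sum(C[i][j] * w[j] for j in range(n)) - d[i] for i in range(n)]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(n)])
    return w, C, d

def reduced_density(x, data, w, h):
    """Weighted estimate; only points with nonzero weight cost a kernel call."""
    return sum(wi * gauss(x - xi, h * h) for xi, wi in zip(data, w) if wi > 0.0)

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(40)]  # synthetic sample
h = 0.5                                             # assumed bandwidth
w, C, d = reduced_set_weights(data, h)
kept = sum(1 for wi in w if wi > 0.0)
print(f"points with nonzero weight: {kept} of {len(data)}")
print(f"parzen(0) = {parzen(0.0, data, h):.4f}, "
      f"reduced(0) = {reduced_density(0.0, data, w, h):.4f}")
```

Points whose weight is driven to exactly zero by the projection can be dropped, so evaluating the reduced estimate costs one kernel call per retained point rather than one per sample. How many points are retained depends on the data, the bandwidth, and how long the solver runs.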
Chao He, Mark Girolami, "Probability Density Estimation from Optimally Condensed Data Samples", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 25, no. 10, pp. 1253-1264, October 2003, doi:10.1109/TPAMI.2003.1233899