
D. Cohn, E. A. Riskin, and R. Ladner, "Theory and Practice of Vector Quantizers Trained on Small Training Sets," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 1, pp. 54-65, Jan. 1994.
This paper examines how the performance of a memoryless vector quantizer changes as a function of its training set size. Specifically, the authors study how well the training set distortion predicts test distortion when the training set is a randomly drawn subset of blocks from the test or training image(s). Using the Vapnik-Chervonenkis (VC) dimension, they derive formal bounds on the difference between the test and training distortion of vector quantizer codebooks. They then describe extensive empirical simulations that test these bounds for a variety of codebook sizes and vector dimensions, and give practical suggestions for determining the training set size necessary to achieve good generalization from a codebook. The authors conclude that training sets comprising only a small fraction of the available data can produce results close to those obtained when all available data are used.
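The experiment the abstract describes can be illustrated with a small sketch. This is not the authors' code: the LBG design algorithm is stood in for by a plain Lloyd (k-means) iteration, the image blocks are replaced by synthetic 4-D Gaussian vectors, and all sizes (2000 vectors, a 100-vector training subset, an 8-codeword codebook) are arbitrary choices for illustration. The point shown is the one studied in the paper: a codebook trained on a small random subset is evaluated both on its training set and on the full data set, and the two distortions are compared.

```python
# Sketch (assumed setup, not the authors' experiment): train a vector
# quantizer on a random subset of vectors, then compare training-set
# distortion with distortion on the full data set.
import random

def distortion(codebook, data):
    """Mean squared error of data quantized with the nearest codeword."""
    total = 0.0
    for x in data:
        total += min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook)
    return total / len(data)

def train_vq(data, codebook_size, iters=20, seed=0):
    """Plain Lloyd/k-means iteration, a stand-in for the LBG algorithm."""
    rng = random.Random(seed)
    codebook = rng.sample(data, codebook_size)
    for _ in range(iters):
        # Assign each vector to its nearest codeword.
        cells = [[] for _ in codebook]
        for x in data:
            i = min(range(len(codebook)),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(x, codebook[j])))
            cells[i].append(x)
        # Move each codeword to the centroid of its cell (keep it if empty).
        codebook = [tuple(sum(col) / len(cell) for col in zip(*cell))
                    if cell else c
                    for cell, c in zip(cells, codebook)]
    return codebook

rng = random.Random(1)
full = [tuple(rng.gauss(0, 1) for _ in range(4)) for _ in range(2000)]  # 4-D "blocks"
train = rng.sample(full, 100)          # small randomly drawn training subset
cb = train_vq(train, codebook_size=8)
d_train = distortion(cb, train)
d_test = distortion(cb, full)
print(f"training distortion {d_train:.3f}, full-set distortion {d_test:.3f}")
```

Because the codebook is fitted to the subset, the training distortion is typically an optimistic estimate of the full-set distortion; the paper's VC-dimension bounds quantify how large the subset must be for that gap to be small.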
[1] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth, "Learnability and the Vapnik-Chervonenkis dimension," J. Assoc. Comput. Mach., vol. 36, no. 4, pp. 929-965, Oct. 1989.
[2] D. Cohn and G. Tesauro, "How tight are the Vapnik-Chervonenkis bounds?" Neural Computation, vol. 4, no. 2, pp. 249-269, Mar. 1992.
[3] D. Cohn, "Separating formal bounds from practical performance in learning systems," Ph.D. dissertation, Dept. Comput. Sci. Eng., Univ. of Washington, Seattle, 1992.
[4] P. Cosman, K. Perlmutter, S. Perlmutter, R. A. Olshen, and R. M. Gray, "Training sequence size and vector quantizer performance," in Proc. 25th Asilomar Conf. Signals, Syst., Comput., Asilomar, CA, Nov. 1991, pp. 434-438.
[5] J. Crutchfield and K. Young, Computation at the Onset of Chaos. Reading, MA: Addison-Wesley, 1990, pp. 223-269.
[6] R. Floyd and L. Steinberg, "An adaptive algorithm for spatial grey scale," in SID Int. Symp. Dig. Tech. Papers, 1975, pp. 36-37.
[7] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Boston: Kluwer Academic, 1992.
[8] R. M. Gray, "Vector quantization," IEEE ASSP Mag., vol. 1, pp. 4-29, Apr. 1984.
[9] D. Haussler, M. Kearns, and R. Schapire, "Unifying bounds on the sample complexity of Bayesian learning theory using information theory and the VC dimension," in Proc. 4th Annu. Workshop Computational Learning Theory. San Mateo, CA: Morgan Kaufmann, 1991, pp. 61-74.
[10] F. Itakura and S. Saito, "Analysis synthesis telephony based on the maximum likelihood method," in Proc. 6th Int. Congress Acoustics, Tokyo, Japan. New York: Elsevier, 1968, pp. c17-c20.
[11] J. Lin and J. Vitter, "ε-approximations with minimum constraint violation," in Proc. 24th Annual ACM Symp. Theory of Computing, Victoria, Canada, May 1992, pp. 771-782.
[12] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. COM-28, pp. 84-95, Jan. 1980.
[13] A. N. Netravali and B. G. Haskell, Digital Pictures: Representation and Compression. New York: Plenum, 1988.
[14] D. Pollard, "A central limit theorem for k-means clustering," Annals of Probability, vol. 10, no. 4, pp. 919-926, 1982.
[15] R. Ulichney, Digital Halftoning. Cambridge, MA: MIT Press, 1987.
[16] V. Vapnik, Estimation of Dependences Based on Empirical Data. New York: Springer-Verlag, 1982.
[17] V. Vapnik and A. Chervonenkis, "On the uniform convergence of relative frequencies of events to their probabilities," Theory of Probability and its Applications, vol. 16, no. 2, pp. 264-280, 1971.
[18] S. Weiss and C. Kulikowski, Computer Systems that Learn. San Mateo, CA: Morgan Kaufmann, 1991.
[19] P. Zehna, Probability Distributions and Statistics. Boston: Allyn and Bacon, 1970, pp. 286-289.
[20] T. Linder, G. Lugosi, and K. Zeger, "Rates of convergence in the source coding theorem, in empirical quantizer design, and in universal lossy source coding," IEEE Trans. Inform. Theory, to be published.