
Ajit V. Rao, David J. Miller, Kenneth Rose, Allen Gersho, "A Deterministic Annealing Approach for Parsimonious Design of Piecewise Regression Models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 2, pp. 159-173, February 1999.
BibTeX
@article{ 10.1109/34.748824,
  author = {Ajit V. Rao and David J. Miller and Kenneth Rose and Allen Gersho},
  title = {A Deterministic Annealing Approach for Parsimonious Design of Piecewise Regression Models},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {21},
  number = {2},
  issn = {0162-8828},
  year = {1999},
  pages = {159-173},
  doi = {http://doi.ieeecomputersociety.org/10.1109/34.748824},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
Keywords: statistical regression, piecewise regression, deterministic annealing, parsimonious modeling, generalization, nearest-prototype models.
Abstract: A new learning algorithm is proposed for piecewise regression modeling. It employs the technique of deterministic annealing to design space partition regression functions. While the performance of traditional space partition regression functions such as CART and MARS is limited by a simple tree-structured partition and by a hierarchical approach for design, the deterministic annealing algorithm enables the joint optimization of a more powerful piecewise structure based on a Voronoi partition. The new method is demonstrated to achieve consistent performance improvements over regular CART as well as over its extension to allow arbitrary hyperplane boundaries. Comparison tests, on several benchmark data sets from the regression literature, are provided.
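The abstract does not spell out the model, so the following is only a minimal sketch of what a piecewise regression function over a Voronoi (nearest-prototype) partition can look like, together with the temperature-controlled soft association commonly used in deterministic annealing. It is not the authors' algorithm: the Gibbs form of the association, the linear local models, and all names (`nearest_prototype`, `soft_assignments`, `predict_hard`, `predict_soft`, the example data) are assumptions made for illustration, and the annealing and training loop is omitted entirely.

```python
import math


def squared_distances(x, prototypes):
    """Squared Euclidean distance from x to every prototype."""
    return [sum((xi - pi) ** 2 for xi, pi in zip(x, p)) for p in prototypes]


def nearest_prototype(x, prototypes):
    """Index of the Voronoi cell (nearest prototype) that contains x."""
    dists = squared_distances(x, prototypes)
    return min(range(len(prototypes)), key=dists.__getitem__)


def soft_assignments(x, prototypes, temperature):
    """Assumed Gibbs association: p(j|x) proportional to exp(-d_j / T)."""
    dists = squared_distances(x, prototypes)
    d_min = min(dists)  # shift by the minimum for numerical stability
    weights = [math.exp(-(d - d_min) / temperature) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def local_output(x, model):
    """Evaluate one local linear model, given as (weights, bias)."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_hard(x, prototypes, local_models):
    """Piecewise prediction: use the local model of the cell containing x."""
    return local_output(x, local_models[nearest_prototype(x, prototypes)])


def predict_soft(x, prototypes, local_models, temperature):
    """Probability-weighted mix of local models at a given temperature."""
    probs = soft_assignments(x, prototypes, temperature)
    return sum(p * local_output(x, m) for p, m in zip(probs, local_models))


# Hypothetical 1-D example: two cells, each with its own linear model.
prototypes = [(0.0,), (4.0,)]
local_models = [((1.0,), 0.0), ((-1.0,), 8.0)]

print(predict_hard((1.0,), prototypes, local_models))
print(predict_soft((1.0,), prototypes, local_models, temperature=0.01))
print(predict_soft((1.0,), prototypes, local_models, temperature=1e6))
```

At a low temperature the soft prediction approaches the hard nearest-prototype prediction, while at a high temperature every local model contributes almost equally. Under the assumptions above, this is what would allow the prototypes and local models to be adjusted jointly while the temperature is lowered, rather than grown split by split as in a tree.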