Issue No. 03 - March (2000 vol. 22)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/34.841759
<p><b>Abstract</b>: Principal curves have been defined as “self-consistent” smooth curves which pass through the “middle” of a <i>d</i>-dimensional probability distribution or data cloud. They give a summary of the data and also serve as an efficient feature extraction tool. We take a new approach by defining principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. The new definition makes it possible to theoretically analyze principal curve learning from training data and it also leads to a new practical construction. Our theoretical learning scheme chooses a curve from a class of polygonal lines with $k$ segments and with a given total length to minimize the average squared distance over $n$ training points drawn independently. Convergence properties of this learning scheme are analyzed and a practical version of this theoretical algorithm is implemented. In each iteration of the algorithm, a new vertex is added to the polygonal line and the positions of the vertices are updated so that they minimize a penalized squared distance criterion. Simulation results demonstrate that the new algorithm compares favorably with previous methods, both in terms of performance and computational complexity, and is more robust to varying data models.</p>
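The criterion named in the abstract, the average squared distance from $n$ training points to a polygonal line with $k$ segments, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `point_segment_sq_dist`, `avg_sq_dist`, and `polyline_length` are hypothetical, and the sketch only evaluates the criterion and the curve length for a fixed set of vertices. It does not perform the vertex addition or the penalized optimization step described in the abstract.

```python
import math


def point_segment_sq_dist(p, a, b):
    """Squared Euclidean distance from point p to the segment [a, b] in d dimensions."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    # Projection parameter along the segment, clamped to [0, 1].
    # A zero-length segment is treated as the single point a.
    if denom == 0.0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    return sum((ai + t * ci - pi) ** 2 for ai, ci, pi in zip(a, ab, p))


def avg_sq_dist(points, vertices):
    """Average over the points of the squared distance to the nearest segment
    of the polygonal line through `vertices` (k = len(vertices) - 1 segments)."""
    segments = list(zip(vertices[:-1], vertices[1:]))
    total = sum(
        min(point_segment_sq_dist(p, a, b) for a, b in segments) for p in points
    )
    return total / len(points)


def polyline_length(vertices):
    """Total length of the polygonal line, the quantity that is constrained."""
    return sum(math.dist(a, b) for a, b in zip(vertices[:-1], vertices[1:]))


# Assumed toy data: three points in the plane and a one-segment line.
points = [(0.0, 1.0), (1.0, 1.0), (3.0, 0.0)]
line = [(0.0, 0.0), (2.0, 0.0)]
criterion = avg_sq_dist(points, line)
length = polyline_length(line)
```

A learning scheme of the kind the abstract describes would compare `avg_sq_dist` across candidate vertex sets whose `polyline_length` stays within the allowed total length. How candidates are generated and how the penalty is defined are left open here.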
Index Terms: Learning systems, unsupervised learning, feature extraction, vector quantization, curve fitting, piecewise linear approximation.
Balázs Kégl, Adam Krzyzak, Tamás Linder, Kenneth Zeger, "Learning and Design of Principal Curves", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 22, no. 3, pp. 281-297, March 2000, doi:10.1109/34.841759