IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, Nov. 2013
pp. 2680-2692
S. Melacci , Dept. of Inf. Eng. & Math. Sci., Univ. of Siena, Siena, Italy
M. Gori , Dept. of Inf. Eng. & Math. Sci., Univ. of Siena, Siena, Italy
Supervised examples and prior knowledge on regions of the input space have been profitably integrated into kernel machines to improve the performance of classifiers in different real-world contexts. The proposed solutions, which rely on the unified supervision of points and sets, have mostly been based on specific optimization schemes in which, as usual, the kernel function operates on points only. In this paper, arguments from variational calculus are used to support the choice of a special class of kernels, referred to as box kernels, which emerges directly from the kernel function associated with a regularization operator. It is proven that there is no need to search for kernels that incorporate the structure deriving from the supervision of regions of the input space, because the optimal kernel arises as a consequence of the chosen regularization operator. Although most of the given results hold for general sets, we focus on boxes, whose labeling is associated with their propositional description. Under different assumptions, representer theorems are given that dictate the structure of the solution in terms of a box kernel expansion. Successful results are reported on problems of medical diagnosis and of image and text categorization.
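To make the idea of supervising sets rather than points concrete, the following sketch approximates a similarity between two axis-aligned boxes by Monte Carlo averaging of an ordinary Gaussian (RBF) kernel over points drawn uniformly inside each box. This is an illustration only, in the spirit of multi-instance and distribution-embedding kernels (refs. [19], [21]): it is NOT the closed-form Green's-function construction of the box kernels proposed in the paper, and the function and parameter names are our own.

```python
# Hedged sketch: a set-level kernel between axis-aligned boxes, estimated
# by averaging a Gaussian (RBF) point kernel over uniform samples from each
# box. Illustrates point-vs-region supervision; NOT the paper's box kernel.
import numpy as np

def rbf(x, y, sigma=1.0):
    """Standard Gaussian kernel between two points."""
    d = x - y
    return np.exp(-np.dot(d, d) / (2.0 * sigma ** 2))

def box_kernel_mc(lo1, hi1, lo2, hi2, sigma=1.0, n=2000, seed=0):
    """Monte Carlo estimate of the average RBF value between two boxes
    [lo1, hi1] and [lo2, hi2]; a point is recovered as a degenerate box
    with lo == hi."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(lo1, hi1, size=(n, len(lo1)))  # samples from box 1
    ys = rng.uniform(lo2, hi2, size=(n, len(lo2)))  # samples from box 2
    return float(np.mean([rbf(x, y, sigma) for x, y in zip(xs, ys)]))

# Two labeled regions: a box near the origin and a distant one.
# Intra-box similarity should dominate the cross-box similarity.
k_pp = box_kernel_mc(np.array([0., 0.]), np.array([1., 1.]),
                     np.array([0., 0.]), np.array([1., 1.]))
k_pn = box_kernel_mc(np.array([0., 0.]), np.array([1., 1.]),
                     np.array([5., 5.]), np.array([6., 6.]))
print(k_pp, k_pn)
```

Such averaged kernel values could be plugged into any standard kernel machine to let labeled boxes act as training examples alongside labeled points; the paper's contribution is that, for a chosen regularization operator, the appropriate box kernel follows in closed form rather than by sampling.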
Keywords: box kernels, Green's functions, kernel machines, propositional rules, regularization operators, support vector machines
S. Melacci, M. Gori, "Learning with Box Kernels," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2680-2692, Nov. 2013, doi:10.1109/TPAMI.2013.73
[1] F. Lauer and G. Bloch, "Incorporating Prior Knowledge in Support Vector Machines for Classification: A Review," Neurocomputing, vol. 71, no. 7-9, pp. 1578-1594, 2008.
[2] G. Fung, O. Mangasarian, and J. Shavlik, "Knowledge-Based Support Vector Machine Classifiers," Proc. Advances in Neural Information Processing Systems, pp. 537-544, 2002.
[3] G. Fung, O. Mangasarian, and J. Shavlik, "Knowledge-Based Nonlinear Kernel Classifiers," Proc. Conf. Learning Theory, pp. 102-114, 2003.
[4] Q. Le, A. Smola, and T. Gärtner, "Simpler Knowledge-Based Support Vector Machines," Proc. 23rd Int'l Conf. Machine Learning, pp. 521-528, 2006.
[5] O. Mangasarian and E. Wild, "Nonlinear Knowledge-Based Classification," IEEE Trans. Neural Networks, vol. 19, no. 10, pp. 1826-1832, Oct. 2008.
[6] O. Mangasarian, E. Wild, and G. Fung, "Proximal Knowledge-Based Classification," Statistical Analysis and Data Mining, vol. 1, no. 4, pp. 215-222, 2009.
[7] T. Poggio and F. Girosi, "Networks for Approximation and Learning," Proc. IEEE, vol. 78, no. 9, pp. 1481-1497, Sept. 1990.
[8] A. Smola, B. Schölkopf, and K.-R. Müller, "The Connection between Regularization Operators and Support Vector Kernels," Neural Networks, vol. 11, pp. 637-649, 1998.
[9] T. Evgeniou, M. Pontil, and T. Poggio, "Regularization Networks and Support Vector Machines," Advances in Computational Math., vol. 13, pp. 1-50, 2000.
[10] Z. Chen and S. Haykin, "On Different Facets of Regularization Theory," Neural Computation, vol. 14, pp. 2791-2846, 2002.
[11] G.E. Fasshauer and Q. Ye, "Reproducing Kernels of Generalized Sobolev Spaces via a Green Function Approach with Differential Operators," IIT Technical Report, 2011.
[12] B. Schölkopf and A. Smola, Learning with Kernels. MIT Press, 2002.
[13] F. Girosi, M. Jones, and T. Poggio, "Regularization Theory and Neural Networks Architectures," Neural Computation, vol. 7, pp. 219-269, 1995.
[14] G.E. Fasshauer, "Green's Functions: Taking Another Look at Kernel Approximation, Radial Basis Functions and Splines," Proc. 13th Int'l Conf. Approximation Theory, M. Neamtu and L. Schumaker, eds., 2011.
[15] G.E. Fasshauer and Q. Ye, "Reproducing Kernels of Sobolev Spaces via a Green Kernel Approach with Differential Operators and Boundary Operators," IIT Technical Report, 2011.
[16] M. Taylor, Pseudo-Differential Operators. Princeton Univ. Press, 1981.
[17] T. Poggio and F. Girosi, "A Theory of Networks for Approximation and Learning," technical report, MIT, 1989.
[18] G. Gnecco, M. Gori, and S. Melacci, "Learning with Boundary Conditions," technical report, 2011.
[19] T. Gärtner, P. Flach, A. Kowalczyk, and A. Smola, "Multi-Instance Kernels," Proc. 19th Int'l Conf. Machine Learning, pp. 179-186, 2002.
[20] D. Haussler, "Convolution Kernels on Discrete Structures," technical report, Univ. of California, Santa Cruz, 1999.
[21] A. Smola, A. Gretton, L. Song, and B. Schölkopf, "A Hilbert Space Embedding for Distributions," Proc. 18th Int'l Conf. Algorithmic Learning Theory, pp. 13-31, 2007.
[22] B. Sriperumbudur, A. Gretton, K. Fukumizu, B. Schölkopf, and G. Lanckriet, "Hilbert Space Embeddings and Metrics on Probability Measures," J. Machine Learning Research, vol. 99, pp. 1517-1561, 2010.
[23] O. Chapelle, "Training a Support Vector Machine in the Primal," Neural Computation, vol. 19, no. 5, pp. 1155-1178, 2007.
[24] A. Frank and A. Asuncion, "UCI Machine Learning Repository," http://archive.ics.uci.edu/ml, 2010.
[25] G. Kunapuli, K. Bennett, A. Shabbeer, R. Maclin, and J. Shavlik, "Online Knowledge-Based Support Vector Machines," Proc. European Conf. Machine Learning, pp. 145-161, 2010.
[26] T. Joachims, "Text Categorization with Support Vector Machines: Learning with Many Relevant Features," Proc. 10th European Conf. Machine Learning, pp. 137-142, 1998.