Issue No.02 - February (2010 vol.32)
pp: 193-205
Gianluigi Pillonetto , University of Padova, Padova
Francesco Dinuzzo , University of Pavia, Pavia
Giuseppe De Nicolao , University of Pavia, Pavia
ABSTRACT
Standard single-task kernel methods have recently been extended to the case of multitask learning in the context of regularization theory. There are experimental results, especially in biomedicine, showing the benefit of the multitask approach compared to the single-task one. However, a possible drawback is computational complexity. For instance, when regularization networks are used, complexity scales as the cube of the overall number of training data, which may be large when several tasks are involved. The aim of this paper is to derive an efficient computational scheme for an important class of multitask kernels. More precisely, a quadratic loss is assumed and each task consists of the sum of a common term and a task-specific one. Within a Bayesian setting, a recursive online algorithm is obtained, which updates both estimates and confidence intervals as new data become available. The algorithm is tested on two simulated problems and a real data set concerning xenobiotics administration in human patients.
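As an illustration of the kernel class described above, the following is a minimal sketch of the batch (non-recursive) baseline whose cubic cost motivates the paper; it is not the recursive online algorithm itself. The Gaussian (RBF) base kernel, the weights lam_common and lam_task, the noise variance, and all function names are assumptions chosen for the example and do not come from the paper.

```python
import math

def rbf(x, y, length_scale=1.0):
    """Gaussian (RBF) base kernel; an assumed choice for this sketch."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)

def multitask_kernel(x, i, y, j, lam_common=1.0, lam_task=0.5):
    """Common term shared by all tasks, plus a task-specific term
    that is active only when both inputs belong to the same task."""
    k = lam_common * rbf(x, y)
    if i == j:
        k += lam_task * rbf(x, y)
    return k

def solve(A, b):
    """Gaussian elimination with partial pivoting: O(n^3) in the
    total number of training data, pooled over all tasks."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit(data, noise_var=0.01):
    """data: list of (input, task_index, output) triples.
    Quadratic loss leads to the linear system (K + noise_var * I) c = y."""
    n = len(data)
    K = [[multitask_kernel(data[a][0], data[a][1], data[b][0], data[b][1])
          + (noise_var if a == b else 0.0)
          for b in range(n)] for a in range(n)]
    return solve(K, [d[2] for d in data])

def predict(data, coeffs, x, task):
    """Estimate for one task at input x, as a kernel expansion over all data."""
    return sum(c * multitask_kernel(x, task, xi, ti)
               for c, (xi, ti, _) in zip(coeffs, data))

# Two tasks sampled at different inputs (made-up numbers).
data = [(0.0, 0, 1.0), (1.0, 0, 0.8), (2.0, 0, 0.3),
        (0.5, 1, 1.1), (1.5, 1, 0.6)]
coeffs = fit(data)
estimate = predict(data, coeffs, 1.0, 1)  # task 1 at an input it never observed
```

Because the common term couples all tasks, the estimate for task 1 at an unobserved input also draws on the data of task 0. The single linear solve over the pooled data is the step whose cost grows as the cube of the overall number of training data.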
INDEX TERMS
Collaborative filtering, multitask learning, mixed effects model, kernel methods, regularization, Gaussian processes, Kalman filtering, pharmacokinetic data.
CITATION
Gianluigi Pillonetto, Francesco Dinuzzo, Giuseppe De Nicolao, "Bayesian Online Multitask Learning of Gaussian Processes", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 32, no. 2, pp. 193-205, February 2010, doi:10.1109/TPAMI.2008.297