2009 IEEE International Conference on Data Mining Workshops
Nonsmooth Bilevel Programming for Hyperparameter Selection
Miami, Florida, USA
December 6, 2009
ISBN: 978-0-7695-3902-7
We propose a nonsmooth bilevel programming method for training linear learning models whose hyperparameters are optimized via $T$-fold cross-validation (CV). The algorithm scales well with the sample size and handles loss functions with embedded maxima, such as those in support vector machines. Current practice constructs models over a predefined grid of hyperparameter combinations and selects the best one, an inefficient heuristic. Building on previous bilevel CV approaches, this paper advances the goal of self-tuning supervised data mining and contributes a significant innovation in scalable bilevel programming algorithms. In the bilevel CV formulation, the lower-level problems are treated as unconstrained optimization problems and replaced with their optimality conditions. The resulting nonlinear program is nonsmooth and nonconvex. We develop a novel bilevel programming algorithm to solve this class of problems and apply it to linear least-squares support vector regression with hyperparameters $C$ (tradeoff) and $\epsilon$ (loss insensitivity). The new approach outperforms grid search and prior smooth bilevel CV methods in modeling performance. Its increased speed opens the way to modeling with a larger number of hyperparameters.
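The bilevel CV formulation summarized in the abstract can be sketched as follows. This is an illustrative reading, not the paper's exact statement. The fold index sets $\mathcal{N}_t$ (validation) and $\overline{\mathcal{N}}_t$ (training), the generic validation loss $\ell_{\mathrm{val}}$, and the squared $\epsilon$-insensitive form of the lower-level loss are assumptions introduced here for illustration.

$$
\begin{aligned}
\min_{C,\,\epsilon,\,w^1,\dots,w^T}\quad & \frac{1}{T}\sum_{t=1}^{T}\frac{1}{|\mathcal{N}_t|}\sum_{i\in\mathcal{N}_t}\ell_{\mathrm{val}}\!\left(x_i^\top w^t,\,y_i\right)\\
\text{s.t.}\quad & C\ge 0,\quad \epsilon\ge 0,\\
& w^t\in\arg\min_{w}\ \tfrac{1}{2}\|w\|_2^2+\tfrac{C}{2}\sum_{j\in\overline{\mathcal{N}}_t}\max\!\left(|x_j^\top w-y_j|-\epsilon,\,0\right)^2,\qquad t=1,\dots,T.
\end{aligned}
$$

Under these assumptions each lower-level problem is unconstrained, convex, and differentiable in $w$, so it can be replaced by its stationarity condition, with residual $r_j^t = x_j^\top w^t - y_j$:

$$
w^t + C\sum_{j\in\overline{\mathcal{N}}_t}\max\!\left(|r_j^t|-\epsilon,\,0\right)\operatorname{sign}\!\left(r_j^t\right)x_j = 0,\qquad t=1,\dots,T.
$$

The embedded $\max$ makes this constraint nonsmooth in $(w^t,\epsilon)$, and the product of $C$ with terms depending on $w^t$ makes the resulting single-level program nonconvex, which is consistent with the abstract's description.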
Citation:
Gregory M. Moore, Charles Bergeron, Kristin P. Bennett, "Nonsmooth Bilevel Programming for Hyperparameter Selection," 2009 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 374-381, 2009.