Sixth IEEE International Conference on Data Mining (ICDM'06)
Accelerating Newton Optimization for Log-Linear Models through Feature Redundancy
Hong Kong
December 18-December 22
ISBN: 0-7695-2701-9
Log-linear models are widely used for labeling feature vectors and graphical models, typically to estimate robust conditional distributions in the presence of a large number of potentially redundant features. Limited-memory quasi-Newton methods like LBFGS or BLMVM are optimization workhorses for such applications, and most of the training time is spent computing the objective and gradient for the optimizer. We propose a simple technique to speed up the training optimization by clustering features dynamically, and interleaving the standard optimizer with another, coarse-grained, faster optimizer that uses far fewer variables. Experiments with logistic regression training for text classification and conditional random field (CRF) training for information extraction show promising speed-ups between 2× and 9× without any systematic or significant degradation in the quality of the estimated models.
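The interleaving idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not say how features are clustered or how the coarse problem is built, so the sketch assumes features are grouped by the rank of their current weight and that the coarse optimizer fits one shared additive shift per cluster over pre-aggregated feature columns. Plain gradient descent with backtracking stands in for LBFGS or BLMVM, and all names (`make_objective`, `descend`, `cluster_by_weight`, `train`, `make_data`) are hypothetical.

```python
import math
import random

def make_objective(X, y, base, q, s, const, l2):
    """L2-regularized logistic loss over variables v, with fixed per-row scores `base`.
    The penalty is const + 0.5*l2*sum(q*v^2 + 2*s*v), which covers both the
    fine problem (q=1, s=0) and the tied-weight coarse problem."""
    def f(v):
        loss = const + 0.5 * l2 * sum(qc * vc * vc + 2 * sc * vc for qc, sc, vc in zip(q, s, v))
        grad = [l2 * (qc * vc + sc) for qc, sc, vc in zip(q, s, v)]
        for xi, yi, bi in zip(X, y, base):
            z = bi + sum(vj * xj for vj, xj in zip(v, xi))
            softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))  # log(1 + e^z)
            loss += softplus - yi * z
            p = math.exp(z - softplus)  # sigmoid(z), computed stably
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj
        return loss, grad
    return f

def descend(f, x, steps):
    """Gradient descent with backtracking (assumed stand-in for LBFGS/BLMVM)."""
    fx, g = f(x)
    for _ in range(steps):
        gg, t = sum(v * v for v in g), 1.0
        while t > 1e-12:
            cand = [xi - t * gi for xi, gi in zip(x, g)]
            fc, gc = f(cand)
            if fc <= fx - 1e-4 * t * gg:  # sufficient-decrease test
                x, fx, g = cand, fc, gc
                break
            t *= 0.5
    return x, fx

def cluster_by_weight(w, k):
    """Assumed clustering rule: k equal-sized groups by rank of current weight."""
    order = sorted(range(len(w)), key=lambda j: w[j])
    assign = [0] * len(w)
    for rank, j in enumerate(order):
        assign[j] = rank * k // len(w)
    return assign

def train(X, y, rounds=4, k=2, fine_steps=2, coarse_steps=10, l2=0.1):
    n = len(X[0])
    w = [0.0] * n
    fine = make_objective(X, y, [0.0] * len(X), [1.0] * n, [0.0] * n, 0.0, l2)
    history = []
    for _ in range(rounds):
        w, _ = descend(fine, w, fine_steps)  # a few steps over all n weights
        assign = cluster_by_weight(w, k)     # re-cluster from the current weights
        # Coarse problem: w_j = w_j + d[cluster(j)], so only k variables remain.
        A = [[0.0] * k for _ in X]           # feature columns summed per cluster
        for row, xi in zip(A, X):
            for xj, c in zip(xi, assign):
                row[c] += xj
        base = [sum(wj * xj for wj, xj in zip(w, xi)) for xi in X]
        q, s = [0.0] * k, [0.0] * k
        for wj, c in zip(w, assign):
            q[c] += 1.0
            s[c] += wj
        const = 0.5 * l2 * sum(wj * wj for wj in w)
        coarse = make_objective(A, y, base, q, s, const, l2)
        d, fx = descend(coarse, [0.0] * k, coarse_steps)
        w = [wj + d[c] for wj, c in zip(w, assign)]
        history.append(fx)
    return w, history

def make_data(n=60, copies=3, seed=0):
    """Synthetic data: two latent signals, each duplicated into noisy redundant features."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([a + 0.05 * rng.gauss(0, 1) for _ in range(copies)]
                 + [b + 0.05 * rng.gauss(0, 1) for _ in range(copies)])
        y.append(1 if a - b + 0.3 * rng.gauss(0, 1) > 0 else 0)
    return X, y
```

Calling `train(*make_data())` returns the final weights and the objective value after each coarse phase. In this sketch each coarse evaluation touches only `k` aggregated columns per example instead of all `n` features, which is where a speed-up could come from when features are highly redundant; whether this matches the paper's construction is an assumption.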
Citation:
Arpit Mathur, Soumen Chakrabarti, "Accelerating Newton Optimization for Log-Linear Models through Feature Redundancy," Sixth IEEE International Conference on Data Mining (ICDM'06), pp. 404-413, 2006.