IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 6, June 2012
T. Gass , Comput. Vision Lab., ETH Zurich, Zurich, Switzerland
T. Deselaers , Google Switzerland, Zurich, Switzerland
G. Heigold , Google Inc., Mountain View, CA, USA
H. Ney , Lehrstuhl für Informatik 6, RWTH Aachen, Aachen, Germany
We present latent log-linear models, an extension of log-linear models incorporating latent variables, and we propose two applications thereof: log-linear mixture models and image deformation-aware log-linear models. The resulting models are fully discriminative, can be trained efficiently, and their complexity can be controlled. Log-linear mixture models offer additional flexibility within the log-linear modeling framework. Unlike previous approaches, the image deformation-aware model directly accounts for image deformations and allows for discriminative training of the deformation parameters. Both models are trained using alternating optimization. For certain variants, convergence to a stationary point is guaranteed; in practice, even variants without this guarantee converge and find models that perform well. We tune the methods on the USPS data set and evaluate on the MNIST data set, demonstrating the generalization capabilities of our proposed models. Although they use significantly fewer parameters, our models obtain results competitive with those reported in the literature.
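To illustrate the core idea, the following is a minimal sketch of a log-linear mixture model with a latent component variable, trained discriminatively on toy 2-D data. This is not the authors' implementation: the features, the single-step gradient-descent training (directly on the marginal likelihood rather than the paper's alternating optimization), and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated Gaussian classes (stand-in for image features).
X = np.vstack([rng.normal([0, 0], 0.5, (50, 2)),
               rng.normal([2, 2], 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

C, M, D = 2, 3, 2                     # classes, mixture components, feature dim
W = rng.normal(0, 0.01, (C, M, D))    # one log-linear weight vector per (class, component)
b = np.zeros((C, M))                  # one bias per (class, component)

def scores(X):
    # g[n, c, m] = W[c, m] . x[n] + b[c, m]
    return np.einsum('cmd,nd->ncm', W, X) + b

for _ in range(200):
    g = scores(X)
    # Marginalize the latent component: p(c | x) is proportional to sum_m exp(g[c, m]).
    gmax = g.max(axis=(1, 2), keepdims=True)
    logZ = gmax + np.log(np.exp(g - gmax).sum(axis=(1, 2), keepdims=True))
    p = np.exp(g - logZ)              # joint posterior p(c, m | x_n)
    # Gradient of the negative log marginal likelihood w.r.t. the scores:
    # p(c, m | x_n) minus the empirical target, which spreads the observed
    # class label over components according to p(m | c_n, x_n).
    emp = np.zeros_like(p)
    pm = p[np.arange(len(y)), y]      # p(c_n, m | x_n)
    emp[np.arange(len(y)), y] = pm / pm.sum(axis=1, keepdims=True)
    W -= 0.5 * np.einsum('ncm,nd->cmd', p - emp, X) / len(y)
    b -= 0.5 * (p - emp).mean(axis=0)

# Classify by the class with the largest marginal score sum_m exp(g[c, m]).
g = scores(X)
p = np.exp(g - g.max(axis=(1, 2), keepdims=True))
acc = (p.sum(axis=2).argmax(axis=1) == y).mean()
print(acc)
```

Note the contrast with a generative mixture: all parameters here are updated to improve the class posterior directly, which is what makes the model fully discriminative.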
support vector machines, handwritten character recognition, image classification, regression analysis, discriminative deformation parameter training, latent log-linear mixture models, handwritten digit classification, image deformation-aware log-linear models, stationary point convergence, USPS data set, MNIST data set, training, deformable models, hidden Markov models, kernel, approximation methods, data models, numerical models, log-linear models, latent variables, conditional random fields, OCR
T. Gass, T. Deselaers, G. Heigold, H. Ney, "Latent Log-Linear Models for Handwritten Digit Classification", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 6, pp. 1105-1117, June 2012, doi:10.1109/TPAMI.2011.218