Issue No.02 - Feb. (2013 vol.35)
pp: 367-380
Kui Jia , Adv. Digital Sci. Center, Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Xiaogang Wang , Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Shatin, China
Xiaoou Tang , Dept. of Inf. Eng., Chinese Univ. of Hong Kong, Shatin, China
ABSTRACT
In this paper, we propose a framework for transforming images from a source image space to a target image space, based on learning coupled dictionaries from a training set of paired images. The framework can be used for applications such as image super-resolution and estimation of image intrinsic components (shading and albedo). It is based on a local parametric regression approach, using sparse feature representations over learned coupled dictionaries across the source and target image spaces. After coupled dictionary learning, sparse coefficient vectors of training image patch pairs are partitioned into easily retrievable local clusters. For any test image patch, we can quickly index into its closest local cluster and perform a local parametric regression between the learned sparse feature spaces. The obtained sparse representation (together with the learned target space dictionary) provides multiple constraints for each pixel of the target image to be estimated. The final target image is reconstructed based on these constraints. The contributions of our proposed framework are threefold. 1) We propose a concept of coupled dictionary learning based on coupled sparse coding, which requires the sparse coefficient vectors of a pair of corresponding source and target image patches to have the same support, i.e., the same indices of nonzero elements. 2) We devise a space partitioning scheme to divide the high-dimensional but sparse feature space into local clusters. The partitioning facilitates extremely fast retrieval of the closest local clusters for query patches. 3) Benefiting from sparse feature-based image transformation, our method is more robust to corrupted input data, and can be considered a simultaneous image restoration and transformation process. Experiments on intrinsic image estimation and super-resolution demonstrate the effectiveness and efficiency of our proposed method.
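The shared-support requirement in contribution 1 can be illustrated with a small sketch. It is not the paper's algorithm: it substitutes a greedy, simultaneous orthogonal matching pursuit for whatever solver the authors use. The function name `coupled_omp`, the dictionary sizes, and the sparsity level `k` are all hypothetical choices made for illustration.

```python
import numpy as np

def coupled_omp(x_src, x_tgt, D_src, D_tgt, k):
    """Greedy sketch (an assumption, not the authors' solver): choose k atom
    indices jointly so both coefficient vectors share one support. Needs k >= 1."""
    n_atoms = D_src.shape[1]
    used = np.zeros(n_atoms, dtype=bool)
    support = []
    r_src, r_tgt = x_src.copy(), x_tgt.copy()
    for _ in range(k):
        # Score each atom index by how well it explains BOTH residuals.
        score = (D_src.T @ r_src) ** 2 + (D_tgt.T @ r_tgt) ** 2
        score[used] = -np.inf  # never pick the same index twice
        j = int(np.argmax(score))
        used[j] = True
        support.append(j)
        # Refit each space by least squares restricted to the shared support.
        c_src, *_ = np.linalg.lstsq(D_src[:, support], x_src, rcond=None)
        c_tgt, *_ = np.linalg.lstsq(D_tgt[:, support], x_tgt, rcond=None)
        r_src = x_src - D_src[:, support] @ c_src
        r_tgt = x_tgt - D_tgt[:, support] @ c_tgt
    a_src = np.zeros(n_atoms)
    a_tgt = np.zeros(n_atoms)
    a_src[support] = c_src
    a_tgt[support] = c_tgt
    return a_src, a_tgt, support

# Toy data: random unit-norm dictionaries and a patch pair built on one support.
rng = np.random.default_rng(0)
D_src = rng.standard_normal((16, 32))
D_src /= np.linalg.norm(D_src, axis=0)
D_tgt = rng.standard_normal((25, 32))
D_tgt /= np.linalg.norm(D_tgt, axis=0)
true_support = [3, 11, 20]
x_src = D_src[:, true_support] @ np.array([1.0, -0.8, 0.6])
x_tgt = D_tgt[:, true_support] @ np.array([0.9, 0.7, -1.1])

a_src, a_tgt, support = coupled_omp(x_src, x_tgt, D_src, D_tgt, k=3)
print("shared support:", sorted(support))
```

Because both coefficient vectors are nonzero only on the returned `support`, the pair satisfies the same-support constraint by construction. The coefficient values themselves are free to differ between the source and target spaces.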
INDEX TERMS
Dictionaries, Training, Vectors, Encoding, Image coding, Image resolution, Estimation, super-resolution, Image transformation, image mapping, sparse coding, intrinsic images
CITATION
Kui Jia, Xiaogang Wang, Xiaoou Tang, "Image Transformation Based on Learning Dictionaries across Image Spaces", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 2, pp. 367-380, Feb. 2013, doi:10.1109/TPAMI.2012.95