Issue No.03 - March (2014 vol.36)
pp: 550-563
Dayong Wang, Sch. of Comput. Eng., Nanyang Technol. Univ. Singapore, Singapore, Singapore
Steven C. H. Hoi, Sch. of Comput. Eng., Nanyang Technol. Univ. Singapore, Singapore, Singapore
Ying He, Sch. of Comput. Eng., Nanyang Technol. Univ. Singapore, Singapore, Singapore
Jianke Zhu, Coll. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
Tao Mei, Microsoft Res. Asia, Beijing, China
Jiebo Luo, Dept. of Comput. Sci., Univ. of Rochester, Rochester, NY, USA
ABSTRACT
Auto face annotation, which aims to detect human faces from a facial image and assign them proper human names, is a fundamental research problem and beneficial to many real-world applications. In this work, we address this problem by investigating a retrieval-based annotation scheme of mining massive web facial images that are freely available over the Internet. In particular, given a facial image, we first retrieve the top n similar instances from a large-scale web facial image database using content-based image retrieval techniques, and then use their labels for auto annotation. Such a scheme has two major challenges: 1) how to retrieve the similar facial images that truly match the query, and 2) how to exploit the noisy labels of the top similar facial images, which may be incorrect or incomplete due to the nature of web images. In this paper, we propose an effective Weak Label Regularized Local Coordinate Coding (WLRLCC) technique, which exploits the principle of local coordinate coding by learning sparse features, and employs the idea of graph-based weak label regularization to enhance the weak labels of the similar facial images. An efficient optimization algorithm is proposed to solve the WLRLCC problem. Moreover, an effective sparse reconstruction scheme is developed to perform the face annotation task. We conduct extensive empirical studies on several web facial image databases to evaluate the proposed WLRLCC algorithm from different aspects. The experimental results validate its efficacy. We share the two constructed databases "WDB" (714,454 images of 6,025 people) and "ADB" (126,070 images of 1,200 people) with the public. To further improve the efficiency and scalability, we also propose an offline approximation scheme (AWLRLCC) which generally maintains comparable results but significantly reduces the annotation time.
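The retrieve-then-annotate pipeline described in the abstract can be illustrated with a minimal sketch. This is not the WLRLCC algorithm: it swaps in a plain similarity-weighted vote over the weak labels of the top n retrieved faces, which is only a simple baseline for the same pipeline. The function names, the Euclidean distance, the 1 / (1 + distance) weighting, and the toy feature vectors are all assumptions made for illustration.

```python
import numpy as np


def retrieve_top_n(query, database, n):
    """Return indices and distances of the n database rows closest to the query.

    Euclidean distance stands in for a content-based retrieval step
    (an assumption; the paper's retrieval features are not modeled here).
    """
    dists = np.linalg.norm(database - query, axis=1)
    order = np.argsort(dists)[:n]
    return order, dists[order]


def annotate_by_voting(query, database, weak_labels, n=3):
    """Name the query face by a similarity-weighted vote over retrieved weak labels.

    A baseline sketch only: closer neighbors get larger weights, and the
    name with the highest accumulated weight wins.
    """
    idx, dists = retrieve_top_n(query, database, n)
    weights = 1.0 / (1.0 + dists)  # hypothetical weighting choice
    scores = {}
    for i, w in zip(idx, weights):
        name = weak_labels[i]
        scores[name] = scores.get(name, 0.0) + w
    return max(scores, key=scores.get)


# Toy data: five 2-D "face features" with noisy (weak) name labels.
database = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
weak_labels = ["alice", "alice", "bob", "bob", "bob"]  # third label is noisy

predicted_name = annotate_by_voting(np.array([0.05, 0.0]), database, weak_labels, n=3)
print(predicted_name)
```

In this toy setup the mislabeled third neighbor is outvoted by the two closer, correctly labeled ones. A plain vote like this has no mechanism for repairing the noisy labels themselves, which is the gap the weak label regularization in the abstract is meant to address.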
INDEX TERMS
Face, Encoding, Optimization, Vectors, Sparse matrices, Image databases, Image coding, weak label, Face annotation, content-based image retrieval, machine learning, label refinement, web facial images
CITATION
Dayong Wang, Steven C. H. Hoi, Ying He, Jianke Zhu, Tao Mei, Jiebo Luo, "Retrieval-Based Face Annotation by Weak Label Regularized Local Coordinate Coding", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 36, no. 3, pp. 550-563, March 2014, doi:10.1109/TPAMI.2013.145