Issue No. 05 - May (2012 vol. 34)
Nenghai Yu , Inf. Process. Center, Univ. of Sci. & Technol. of China, Hefei, China
Xian-Sheng Hua , Microsoft Res. Asia, Redmond, WA, USA
Lei Wu , Dept. of Electron. Eng. & Inf. Sci., Univ. of Sci. & Technol. of China, Hefei, China
Wei-Ying Ma , Microsoft Res. Asia, Beijing, China
Shipeng Li , Microsoft Res. Asia, Beijing, China
This paper proposes the Flickr Distance (FD) to measure the visual correlation between concepts. For each concept, a collection of related images is obtained from the Flickr website. We assume that each concept consists of several states, e.g., different views or different semantics, which are treated as latent topics. A latent topic visual language model (LTVLM) is then built to capture these states. The Flickr distance between two concepts is defined as the Jensen-Shannon (J-S) divergence between their LTVLMs. Unlike traditional conceptual distance measurements, which are based on Web textual documents, FD is based on visual information. Compared with the WordNet distance, FD scales easily as the conceptual corpus grows. Compared with the Normalized Google Distance (NGD) and the Tag Concurrence Distance (TCD), FD uses visual information and can properly measure conceptual relations. We apply FD to multimedia-related tasks and find that methods based on FD significantly outperform those based on NGD and TCD. Using the FD measurement, we also construct a large-scale visual conceptual network (VCNet) to store knowledge of conceptual relationships. Experiments show that FD is more consistent with human cognition and that it outperforms text-based distances in real-world applications.
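As an illustration of the divergence named in the abstract, the following is a minimal sketch of the Jensen-Shannon divergence between two discrete probability distributions. It is not the paper's LTVLM-based computation, whose details (latent topics, visual-word modeling) are not given here; the function names and the toy "visual word" distributions are assumptions made purely for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats.

    p and q are equal-length sequences of probabilities. Terms with
    p[i] == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between discrete distributions p and q.

    JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2.
    The result is symmetric and lies between 0 and ln 2 (in nats).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # The mixture m is positive wherever p or q is positive, so the
    # KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical distributions over four visual words for two concepts.
concept_a = [0.50, 0.30, 0.15, 0.05]
concept_b = [0.10, 0.20, 0.30, 0.40]
distance = js_divergence(concept_a, concept_b)
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric and always finite, which makes it a natural choice when comparing two models that may assign zero probability to different outcomes.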
Web sites, image retrieval, multimedia computing, visual languages, conceptual relationship knowledge storage, Flickr distance, visual concepts, visual correlation, relationship measure, image collection, Flickr Web site, latent topic visual language model, Jensen-Shannon divergence, J-S divergence, visual information, conceptual corpus, conceptual relation measurement, multimedia-related tasks, large-scale visual conceptual network, large-scale VCNet, Visualization, Correlation, Semantics, Humans, Feature extraction, Google, Computational modeling, image analysis, Artificial intelligence, distance learning, machine vision
Nenghai Yu, Xian-Sheng Hua, Lei Wu, Wei-Ying Ma and Shipeng Li, "Flickr Distance: A Relationship Measure for Visual Concepts," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 5, pp. 863-875, 2012.