On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval
March 2014 (vol. 36 no. 3)
pp. 521-535
Jose Costa Pereira, Dept. of Electr. & Comput. Eng., Univ. of California, San Diego, La Jolla, CA, USA
Emanuele Coviello, Dept. of Electr. & Comput. Eng., Univ. of California, San Diego, La Jolla, CA, USA
Gabriel Doyle, Dept. of Linguistics, Univ. of California, San Diego, La Jolla, CA, USA
Nikhil Rasiwasia, Yahoo! Labs, Bangalore, India
Gert R. G. Lanckriet, Dept. of Electr. & Comput. Eng., Univ. of California, San Diego, La Jolla, CA, USA
Roger Levy, Dept. of Linguistics, Univ. of California, San Diego, La Jolla, CA, USA
Nuno Vasconcelos, Dept. of Electr. & Comput. Eng., Univ. of California, San Diego, La Jolla, CA, USA
The problem of cross-modal retrieval from multimedia repositories is considered. This problem addresses the design of retrieval systems that support queries across content modalities, for example, using an image to search for texts. A mathematical formulation is proposed, equating the design of cross-modal retrieval systems to that of isomorphic feature spaces for different content modalities. Two hypotheses are then investigated regarding the fundamental attributes of these spaces. The first is that low-level cross-modal correlations should be accounted for. The second is that the space should enable semantic abstraction. Three new solutions to the cross-modal retrieval problem are then derived from these hypotheses: correlation matching (CM), an unsupervised method which models cross-modal correlations, semantic matching (SM), a supervised technique that relies on semantic representation, and semantic correlation matching (SCM), which combines both. An extensive evaluation of retrieval performance is conducted to test the validity of the hypotheses. All approaches are shown successful for text retrieval in response to image queries and vice versa. It is concluded that both hypotheses hold, in a complementary form, although evidence in favor of the abstraction hypothesis is stronger than that for correlation.
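As a rough illustration of the correlation matching idea, the sketch below maps image and text features into a shared space and ranks texts by their distance to an image query. The abstract does not name the correlation model; canonical correlation analysis (CCA) is assumed here. The function names (fit_cca, rank_texts, demo), the regularizer reg, and the synthetic features are hypothetical, and the toy queries are drawn from the same pairs used for fitting, so this is not the authors' implementation or evaluation protocol.

```python
import numpy as np

def fit_cca(X, Y, dim, reg=1e-3):
    """Fit a regularized linear CCA on paired rows of X (images) and Y (texts)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariances (reg is an assumed stabilizer, not from the paper).
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whiten each modality with its Cholesky factor: T = Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    # Singular vectors of T give the canonical directions in whitened coordinates.
    U, _, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :dim])
    Wy = np.linalg.solve(Ly.T, Vt[:dim].T)
    return mx, my, Wx, Wy

def rank_texts(image_feat, text_feats, model):
    """Return text indices sorted from closest to farthest in the shared space."""
    mx, my, Wx, Wy = model
    q = (image_feat - mx) @ Wx          # project the image query
    P = (text_feats - my) @ Wy          # project every candidate text
    return np.argsort(np.linalg.norm(P - q, axis=1))

def demo():
    """Toy check: both modalities are noisy linear views of one latent vector."""
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(200, 3))
    X = Z @ rng.normal(size=(3, 8)) + 0.01 * rng.normal(size=(200, 8))
    Y = Z @ rng.normal(size=(3, 6)) + 0.01 * rng.normal(size=(200, 6))
    model = fit_cca(X, Y, dim=3)
    hits = sum(rank_texts(X[i], Y, model)[0] == i for i in range(len(X)))
    return hits / len(X)

if __name__ == "__main__":
    print("top-1 image-to-text accuracy on toy data:", demo())
```

Swapping the roles of the two projections gives text-to-image ranking. A semantic matching variant would instead represent each item by class posteriors from a supervised classifier before ranking.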
Index Terms:
Semantics, Correlation, Multimedia communication, Joints, Hidden Markov models, Vectors, Databases, logistic regression, Multimedia, content-based retrieval, multimodal, cross-modal, image and text, retrieval model, semantic spaces, kernel correlation
Citation:
Jose Costa Pereira, Emanuele Coviello, Gabriel Doyle, Nikhil Rasiwasia, Gert R. G. Lanckriet, Roger Levy, Nuno Vasconcelos, "On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 521-535, March 2014, doi:10.1109/TPAMI.2013.142