Automatic Indexing and Content-Based Retrieval of Captioned Images
September 1995 (vol. 28 no. 9)
pp. 49-56
This research explores the interaction of textual and photographic information in an integrated text/image database environment developed at the Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York, Buffalo. The idea is to extract information from a newspaper photo caption that can be used for retrieving the picture and for identifying the people shown. A multistage system, called Piction, uses spatial and characteristic constraints derived from the caption in labeling face candidates generated by a face locator. Several other vision systems employ the idea of top-down control in picture understanding by providing the general context; this system carries the notion one step further, exploiting not only general context but also picture-specific context. The author gives several examples showing how information from both text and images can be used in computing the similarity between a given query and an image in the database to satisfy focus-of-attention queries. Although Piction represents only a preliminary foray into truly integrated text/image content-based retrieval, it shows that additional discriminatory capabilities can be obtained by combining the two sources of information. Much work remains, however, both in improving the language processing capabilities and in face location and characterization.
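The constraint-based labeling step described above, in which spatial constraints derived from the caption are used to label face candidates produced by a face locator, can be sketched as a small search problem. The code below is a hypothetical illustration, not Piction's actual implementation; the candidate coordinates, caption names, and constraint predicates are all invented for the example.

```python
from itertools import permutations

# Hypothetical face candidates from a face locator:
# candidate id -> (x_center, y_center) of the face's bounding box.
candidates = {"f1": (40, 120), "f2": (160, 115), "f3": (300, 130)}

# Names mentioned in the caption, e.g. "Smith, left, with Jones and Lee".
names = ["Smith", "Jones", "Lee"]

def left_of(a, b):
    """Caption-derived spatial constraint: person a appears to the
    left of person b. Returns a predicate over a tentative
    name -> candidate assignment."""
    return lambda assign: candidates[assign[a]][0] < candidates[assign[b]][0]

# Constraints extracted from the (hypothetical) caption text.
constraints = [left_of("Smith", "Jones"), left_of("Jones", "Lee")]

def label_faces(names, candidates, constraints):
    """Return the first assignment of names to face candidates that
    satisfies every spatial constraint (brute-force search)."""
    for perm in permutations(candidates):
        assign = dict(zip(names, perm))
        if all(c(assign) for c in constraints):
            return assign
    return None  # caption constraints are unsatisfiable for these candidates

print(label_faces(names, candidates, constraints))
# → {'Smith': 'f1', 'Jones': 'f2', 'Lee': 'f3'}
```

A real system would add characteristic constraints (e.g. gender or age cues from the caption matched against face attributes) as further predicates in the same list, pruning the search rather than relying on spatial order alone.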
Citation:
Rohini K. Srihari, "Automatic Indexing and Content-Based Retrieval of Captioned Images," Computer, vol. 28, no. 9, pp. 49-56, Sept. 1995, doi:10.1109/2.410153