2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Boston, MA, USA
June 7, 2015 to June 12, 2015
ISSN: 1063-6919
ISBN: 978-1-4673-6963-3
pp: 3081-3089
Gunhee Kim, Seoul National University, Korea
Seungwhan Moon, Carnegie Mellon University, USA
Leonid Sigal, Disney Research Pittsburgh, USA
We propose an approach that uses large collections of photo streams and blog posts, two of the most prevalent sources of data on the Web, for joint story-based summarization and exploration. Blogs consist of sequences of images and associated text; they portray events and experiences with concise sentences and representative images. We leverage blogs to achieve story-based semantic summarization of collections of photo streams. In the opposite direction, blog posts can be enhanced with sets of photo streams by showing interpolations between consecutive images in the blogs. We formulate the joint problem of aligning blogs to photo streams and summarizing photo streams in a unified latent ranking SVM framework. We alternate between the two coupled latent SVM problems: first we fix the summarization and solve for the alignment from blog images to photo streams, then we fix the alignment and solve for the summarization. On a newly collected large-scale Disneyland dataset of 10K blogs (120K associated images) and 6K photo streams (540K images), we demonstrate that blog posts and photo streams are mutually beneficial for summarization, exploration, semantic knowledge transfer, and photo interpolation.
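The alternating scheme described in the abstract can be illustrated with a small toy sketch. This is not the latent ranking SVM formulation of the paper: as a stand-in, it uses a k-medoids-style procedure on plain feature vectors, and only the control flow (fix the summary and solve for the alignment, then fix the alignment and update the summary) mirrors the text. All function names, the squared-distance objective, and the stopping rule are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, ...]


def sq_dist(a: Vec, b: Vec) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def align(blog_images: Sequence[Vec], stream: Sequence[Vec],
          summary: Sequence[int]) -> List[int]:
    """Alignment step (summary fixed): map each blog image to the index
    of its nearest summary photo in the stream."""
    return [min(summary, key=lambda j: sq_dist(b, stream[j]))
            for b in blog_images]


def summarize(blog_images: Sequence[Vec], stream: Sequence[Vec],
              alignment: Sequence[int], summary: Sequence[int]) -> List[int]:
    """Summarization step (alignment fixed): replace each summary photo
    with the stream photo closest to the blog images aligned to it."""
    new_summary = []
    for j in summary:
        members = [b for b, a in zip(blog_images, alignment) if a == j]
        if not members:
            # No blog image supports this photo; keep it unchanged.
            new_summary.append(j)
            continue
        best = min(range(len(stream)),
                   key=lambda p: sum(sq_dist(stream[p], b) for b in members))
        new_summary.append(best)
    return new_summary


def alternate(blog_images: Sequence[Vec], stream: Sequence[Vec],
              summary: Sequence[int],
              max_iters: int = 20) -> Tuple[List[int], List[int]]:
    """Alternate the two steps until neither the summary nor the
    alignment changes (or max_iters is reached)."""
    summary = list(summary)
    alignment = align(blog_images, stream, summary)
    for _ in range(max_iters):
        new_summary = summarize(blog_images, stream, alignment, summary)
        new_alignment = align(blog_images, stream, new_summary)
        if new_summary == summary and new_alignment == alignment:
            break
        summary, alignment = new_summary, new_alignment
    return summary, alignment


if __name__ == "__main__":
    # Hypothetical 1-D features: four stream photos, two blog images.
    stream = [(0.0,), (1.0,), (10.0,), (11.0,)]
    blog_images = [(0.9,), (10.2,)]
    print(alternate(blog_images, stream, summary=[0, 3]))
```

In this toy setting each step can only keep or lower the total squared distance between blog images and their aligned summary photos, which is why a simple fixed-point check is enough as a stopping rule. The paper's actual objective and solver differ and are not reproduced here.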

