Recognizing the place depicted in an image typically relies on methods from large-scale image retrieval. First, a query image is described as a "bag-of-features" and compared to every image in the database. Then, the most similar images are passed to a geometric verification stage. However, this approach is inefficient when many database images are almost identical, because each near-duplicate is still compared and verified separately.
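The retrieval stage described above can be illustrated with a minimal sketch. It assumes local features have already been quantized into integer visual-word IDs; the function names, the cosine-similarity ranking, and the toy database are illustrative assumptions rather than details from the paper, and the geometric verification stage is omitted.

```python
from collections import Counter
from math import sqrt


def bag_of_features(word_ids):
    """Histogram of visual-word IDs for one image."""
    return Counter(word_ids)


def cosine_similarity(a, b):
    """Cosine similarity between two visual-word histograms."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_database(query_words, database, top_k=2):
    """Compare the query to every database image and return the top_k names."""
    query = bag_of_features(query_words)
    scored = [
        (cosine_similarity(query, bag_of_features(words)), name)
        for name, words in database.items()
    ]
    # Highest similarity first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical toy database: two near-duplicate views and one unrelated image.
database = {
    "bridge_a": [1, 2, 2, 3],
    "bridge_b": [1, 2, 2, 3],
    "tower": [7, 8, 9, 9],
}
shortlist = rank_database([1, 2, 3, 3], database, top_k=2)
```

In this toy case both near-duplicate bridge images are scored separately and both occupy the shortlist, which is the redundancy the article points to.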
In a paper presented at the recent International Conference on Computer Vision, Edward Johns and Guang-Zhong Yang of the UK’s Imperial College London address this issue by clustering similar database images into distinct scenes and tracking the local features that are consistently detected across each cluster to form a set of real-world landmarks. Query images can then be matched to landmarks rather than to individual features, and a probabilistic model of landmark properties learned from the cluster can verify or reject candidate feature matches. Results on a database of more than 200,000 images of popular tourist destinations show improvements in both recognition performance and efficiency compared to traditional image retrieval methods.
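As a rough illustration of compressing a cluster into landmarks, the sketch below keeps only the feature tracks that recur across a cluster's images. It is a simplified stand-in, not the authors' probabilistic model: the track labels, the `min_fraction` threshold, and the additive scoring are all hypothetical, and it assumes features have already been tracked across the images of one cluster.

```python
from collections import Counter


def select_landmarks(cluster, min_fraction=0.5):
    """Keep feature tracks seen in at least min_fraction of the cluster's images.

    Returns a dict mapping each kept track to its detection frequency.
    """
    counts = Counter(track for image in cluster for track in set(image))
    threshold = min_fraction * len(cluster)
    return {
        track: n / len(cluster)
        for track, n in counts.items()
        if n >= threshold
    }


def match_score(query_tracks, landmarks):
    """Sum the detection frequencies of the landmarks found in the query."""
    return sum(landmarks.get(track, 0.0) for track in set(query_tracks))


# Hypothetical cluster: four images of one scene, each listing its tracked features.
cluster = [
    ["arch", "lamp", "door"],
    ["arch", "door"],
    ["arch", "car"],
    ["arch", "door", "sign"],
]
landmarks = select_landmarks(cluster)
score = match_score(["arch", "door", "bus"], landmarks)
```

Here the transient features ("lamp", "car", "sign") are discarded, so a query is compared against one compact scene model instead of four separate images.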
“From Images to Scenes: Compressing an Image Cluster into a Single Scene Model for Place Recognition,” along with other papers from ICCV 2011, is available to both IEEE Computer Society members and paid subscribers via the Computer Society Digital Library.