Issue No. 04 - April (2016 vol. 28)
ISSN: 1041-4347
pp: 994-1006
Pai Peng, Key Laboratory of Intelligent Computing Based Big Data of Zhejiang Province, College of Computer Science and Technology, Hangzhou, China
Lidan Shou, Key Laboratory of Intelligent Computing Based Big Data of Zhejiang Province, College of Computer Science and Technology, Hangzhou, China
Ke Chen, Key Laboratory of Intelligent Computing Based Big Data of Zhejiang Province, College of Computer Science and Technology, Hangzhou, China
Gang Chen, Key Laboratory of Intelligent Computing Based Big Data of Zhejiang Province, College of Computer Science and Technology, Hangzhou, China
Sai Wu, Key Laboratory of Intelligent Computing Based Big Data of Zhejiang Province, College of Computer Science and Technology, Hangzhou, China
ABSTRACT
This paper presents a project called KnowIng camera prototype SyStem (KISS) for real-time places-of-interest (POI) recognition and annotation for smartphone photos, using online geotagged images of POIs as the knowledge base. We propose a “Spatial+Visual” (S+V) framework, which consists of a probabilistic field-of-view (pFOV) model in the spatial phase and a sparse coding similarity metric in the visual phase, to recognize phone-captured POIs. Moreover, we put forward an offline Collaborative Salient Area (COSTAR) mining algorithm that detects common visual features (called Costars) among the noisy photos geotagged on each POI, thereby cleaning the geotagged image database. The mining result can be used to annotate the region of interest on the query image during online query processing. This mining procedure also improves the efficiency and accuracy of the S+V framework. Furthermore, we extend the pFOV model into a Bayesian FOV ($\beta$FOV) model, which improves the spatial recognition accuracy by more than 30 percent and further reduces the visual computation. From a Bayesian point of view, the likelihood of a certain POI being captured by a phone is a prior probability in the pFOV model and is represented as a posterior probability in the $\beta$FOV model. Our experiments on a real-world dataset and the Oxford 5K dataset show promising recognition results. To provide a fine-grained annotation ground truth, we labeled a new dataset based on Oxford 5K and made it publicly available on the web. Our COSTAR mining technique outperforms the state-of-the-art approach on both datasets.
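As a reading aid for the prior-versus-posterior remark, the generic form of Bayes' rule is sketched here. The symbols are assumptions introduced only for illustration: $p_i$ denotes the event that POI $i$ is the one captured, and $E$ denotes whatever evidence the posterior is conditioned on, which this abstract does not specify. This is not the paper's own formulation of either model.

$$
P(p_i \mid E) \;=\; \frac{P(E \mid p_i)\, P(p_i)}{\sum_j P(E \mid p_j)\, P(p_j)}
$$

Under this reading, a prior-based model ranks candidate POIs by $P(p_i)$ alone, while a posterior-based model ranks them by $P(p_i \mid E)$, reweighting each prior by how well that POI explains the evidence.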
INDEX TERMS
Visualization, Cameras, Databases, Image recognition, Computational modeling, Probabilistic logic, Noise measurement
CITATION

P. Peng, L. Shou, K. Chen, G. Chen and S. Wu, "KISS: Knowing Camera Prototype System for Recognizing and Annotating Places-of-Interest," in IEEE Transactions on Knowledge & Data Engineering, vol. 28, no. 4, pp. 994-1006, 2016.
doi:10.1109/TKDE.2015.2489647