Issue No.02 - Feb. (2014 vol.36)
pp: 388-403
Jingdong Wang , Media Comput. Group, Microsoft Res. Asia, Beijing, China
Naiyan Wang , Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
You Jia , Carnegie Mellon Univ., Pittsburgh, PA, USA
Jian Li , Tsinghua Univ., Beijing, China
Gang Zeng , Peking Univ., Beijing, China
Hongbin Zha , Peking Univ., Beijing, China
Xian-Sheng Hua , Microsoft Corp., Redmond, WA, USA
ABSTRACT
We address the problem of approximate nearest neighbor (ANN) search for visual descriptor indexing. Most spatial partition trees, such as KD trees, VP trees, and so on, follow the hierarchical binary space partitioning framework. The key effort is to design different partition functions (hyperplane or hypersphere) to divide the points so that 1) the data points can be well grouped to support effective NN candidate location and 2) the partition functions can be quickly evaluated to support efficient NN candidate location. We design a trinary-projection direction-based partition function. The trinary-projection direction is defined as a combination of a few coordinate axes with the weights being 1 or -1. We pursue the projection direction using the widely adopted maximum variance criterion to guarantee good space partitioning and find fewer coordinate axes to guarantee efficient partition function evaluation. We present a coordinate-wise enumeration algorithm to find the principal trinary-projection direction. In addition, we provide an extension using multiple randomized trees for improved performance. We justify our approach on large-scale local patch indexing and similar image search.
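The idea of a trinary-projection direction can be illustrated with a minimal sketch. This is not the authors' coordinate-wise enumeration algorithm, whose details are not given in the abstract; it is a simple greedy search written for illustration. The function names, the `max_axes` parameter, the normalization of the variance by the squared norm of the direction, and the median split threshold are all assumptions.

```python
import numpy as np

def greedy_trinary_direction(X, max_axes=3):
    """Greedy search for w in {-1, 0, +1}^d with at most max_axes nonzeros.

    Assumption: the score is the variance of the projection onto w,
    normalized by ||w||^2, i.e. (w^T C w) / (w^T w).
    """
    C = np.atleast_2d(np.cov(X, rowvar=False))   # d x d covariance matrix
    d = C.shape[0]
    w = np.zeros(d)
    w[np.argmax(np.diag(C))] = 1.0               # start from the highest-variance axis
    best = w @ C @ w / (w @ w)
    for _ in range(max_axes - 1):
        candidate, candidate_score = None, best
        # Try adding each unused axis with weight +1 or -1.
        for j in np.flatnonzero(w == 0):
            for sign in (1.0, -1.0):
                trial = w.copy()
                trial[j] = sign
                score = trial @ C @ trial / (trial @ trial)
                if score > candidate_score:
                    candidate, candidate_score = trial, score
        if candidate is None:                    # no axis improves the score
            break
        w, best = candidate, candidate_score
    return w, best

def split_points(X, w):
    """Partition points by the sign-weighted sum of the selected coordinates.

    Assumption: the threshold is the median of the projected values.
    """
    proj = X @ w                                 # only additions and subtractions are needed in principle
    threshold = np.median(proj)
    return proj <= threshold, threshold
```

Because the weights are restricted to 1, -1, or 0 (axis unused), evaluating the partition function at query time needs only a few additions and subtractions, which is the efficiency argument the abstract makes. The greedy loop above can miss the best direction; an exhaustive enumeration over small axis subsets would be the safer, slower alternative.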
INDEX TERMS
Principal component analysis, Artificial neural networks, Vegetation, Search problems, Partitioning algorithms, Nearest neighbor searches, Computer vision, trinary-projection trees, Approximate nearest neighbor search, KD trees
CITATION
Jingdong Wang, Naiyan Wang, You Jia, Jian Li, Gang Zeng, Hongbin Zha, Xian-Sheng Hua, "Trinary-Projection Trees for Approximate Nearest Neighbor Search", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.36, no. 2, pp. 388-403, Feb. 2014, doi:10.1109/TPAMI.2013.125