Issue No.01 - January/February (2011 vol.31)
pp: 9-19
Sean Arietta, University of Virginia
Jason Lawrence, University of Virginia
ABSTRACT
Many example-based image-processing algorithms operate on image patches (small windows of pixels). Such algorithms are commonly used for texture synthesis, resolution enhancement, image denoising, colorization, and hole filling. One barrier to the wider adoption and better performance of these techniques is the lack of access to a large, varied collection of image patches. The authors describe a database of one trillion image patches assembled from one million natural images downloaded from the Internet. They also describe and analyze two systems for performing nearest-neighbor searches over this database, one built on the parallel-programming framework Hadoop and the other on MPI. To demonstrate the database's utility as a research tool, they used it to investigate the fundamental relationships between patch size, amount of training data, and expected accuracy of the closest matches. They report a closed-form analytic expression that relates these three quantities, letting them predict any one from the other two. The findings show that massive databases are necessary to achieve reliable performance for even moderate-size patches. These findings also offer important and previously unavailable guidelines for practitioners and researchers interested in working with and improving such data-driven systems.
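As a rough illustration of the operation the abstract refers to, the following Python sketch extracts every patch from a tiny grayscale image and finds the stored patch closest to a query by exhaustive search. It is not the authors' system: the function names, the squared-Euclidean distance, and the 4 x 4 toy image are assumptions made for this example, and a database of the size described would need a distributed, indexed search (the index terms mention kd-trees and locality-sensitive hashing) rather than a linear scan.

```python
from typing import List, Sequence, Tuple

Patch = Tuple[float, ...]


def extract_patches(image: Sequence[Sequence[float]], size: int) -> List[Patch]:
    """Return every size x size window of a grayscale image, flattened row by row."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            patches.append(tuple(
                image[top + dy][left + dx]
                for dy in range(size)
                for dx in range(size)
            ))
    return patches


def squared_distance(a: Patch, b: Patch) -> float:
    """Squared Euclidean distance (an assumed metric for this sketch)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_patch(query: Patch, database: Sequence[Patch]) -> Tuple[int, float]:
    """Exhaustive search: index and squared distance of the closest stored patch."""
    best_index, best_dist = -1, float("inf")
    for i, candidate in enumerate(database):
        d = squared_distance(query, candidate)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


# Toy data: a 4 x 4 image yields (4 - 2 + 1) ** 2 = 9 patches of size 2 x 2.
image = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
database = extract_patches(image, 2)
index, dist = nearest_patch((1, 2, 3, 4), database)
```

The scan costs time proportional to the number of stored patches for each query, which is why a linear search like this one only works at toy scale.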
INDEX TERMS
natural images, image processing, nearest neighbor, image patches, image databases, kd-trees, locality-sensitive hashing, LSH, distributed processing, image search, computer graphics, graphics and multimedia
CITATION
Sean Arietta, Jason Lawrence, "Building and Using a Database of One Trillion Natural-Image Patches", IEEE Computer Graphics and Applications, vol.31, no. 1, pp. 9-19, January/February 2011, doi:10.1109/MCG.2010.105