Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying
August 2002 (vol. 24 no. 8)
pp. 1026-1038

Retrieving images from large and varied collections using image content as a key is a challenging and important problem. We present a new image representation that provides a transformation from the raw pixel data to a small set of image regions that are coherent in color and texture. This "Blobworld" representation is created by clustering pixels in a joint color-texture-position feature space. The segmentation algorithm is fully automatic and has been run on a collection of 10,000 natural images. We describe a system that uses the Blobworld representation to retrieve images from this collection. An important aspect of the system is that the user is allowed to view the internal representation of the submitted image and the query results. Similar systems do not offer the user this view into the workings of the system; consequently, query results from these systems can be inexplicable, despite the availability of knobs for adjusting the similarity metrics. By finding image regions that roughly correspond to objects, we allow querying at the level of objects rather than global image properties. We present results indicating that querying for images using Blobworld produces higher precision than does querying using color and texture histograms of the entire image in cases where the image contains distinctive objects.
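The clustering step described above can be illustrated with a minimal sketch, assuming a Gaussian mixture model fitted by Expectation-Maximization to per-pixel feature vectors. Everything specific here is an assumption made for illustration and not taken from the paper: the placeholder features (three color channels plus a down-weighted pixel position, with no texture terms), the hypothetical helper names `pixel_features` and `em_gmm`, the farthest-point initialization, and the fixed number of components. It uses only NumPy.

```python
import numpy as np

def pixel_features(img, pos_weight=0.5):
    """Stack per-pixel color and (weighted) position into an (n, 5) array.
    Placeholder features: a real system would also include texture."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([xs / (w - 1), ys / (h - 1)], axis=-1) * pos_weight
    return np.concatenate([img, pos], axis=-1).reshape(h * w, -1)

def em_gmm(X, k, n_iter=20, seed=0, reg=1e-6):
    """Fit a k-component full-covariance Gaussian mixture to the rows of X."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed initialization: greedy farthest-point seeds, then a hard
    # nearest-seed assignment used as the first set of responsibilities.
    idx = [int(rng.integers(n))]
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(d2.argmax()))
    d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(-1)
    resp = np.eye(k)[d2.argmin(axis=1)]
    for _ in range(n_iter):
        # M-step: re-estimate weights, means, covariances from responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        covs = np.empty((k, d, d))
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
        # E-step: posterior probability of each component for each pixel,
        # computed in the log domain for numerical stability.
        log_p = np.empty((n, k))
        for j in range(k):
            chol = np.linalg.cholesky(covs[j])
            z = np.linalg.solve(chol, (X - means[j]).T)
            maha = (z ** 2).sum(axis=0)
            log_det = 2.0 * np.log(np.diag(chol)).sum()
            log_p[:, j] = np.log(weights[j]) - 0.5 * (d * np.log(2 * np.pi) + log_det + maha)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return weights, means, covs, resp.argmax(axis=1)

# Synthetic 16x16 image: dark left half, bright right half, slight noise.
h, w = 16, 16
img = np.full((h, w, 3), 0.2)
img[:, w // 2:] = 0.8
img += 0.02 * np.random.default_rng(0).standard_normal(img.shape)

weights, means, covs, flat_labels = em_gmm(pixel_features(img), k=2)
labels = flat_labels.reshape(h, w)  # per-pixel region ("blob") index
```

On this toy input the label map should separate the two halves into two regions. Choosing the number of mixture components for real images is a separate model-selection problem that this sketch does not address.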

Index Terms:
Segmentation and grouping, image retrieval, image querying, clustering, Expectation-Maximization.
Citation:
Chad Carson, Serge Belongie, Hayit Greenspan, Jitendra Malik, "Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1026-1038, Aug. 2002, doi:10.1109/TPAMI.2002.1023800