Issue No. 03 - March (2003 vol. 25)
Jennifer G. Dy, IEEE
Avi Kak, IEEE
<p><b>Abstract</b>: This paper describes a new hierarchical approach to content-based image retrieval (CBIR) called the “customized-queries” approach (CQA). Unlike the single-feature-vector approach, which tries to classify the query and retrieve similar images in one step, CQA uses multiple feature sets and a two-step approach to retrieval. The first step classifies the query according to the class labels of the images, using the features that best discriminate the classes. The second step then retrieves the most similar images within the predicted class, using the features customized to distinguish “subclasses” within that class. The need to find a customized feature subset for each class led us to investigate feature selection for unsupervised learning. As a result, we developed a new algorithm called FSSEM (feature subset selection using expectation-maximization clustering). We applied our approach to a database of high-resolution computed tomography lung images and show that CQA radically improves retrieval precision over the single-feature-vector approach. To determine whether our CBIR system is helpful to physicians, we conducted an evaluation trial with eight radiologists. The results show that our system using CQA retrieval doubled the doctors' diagnostic accuracy.</p>
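The two-step retrieval described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 1-nearest-neighbour classifier, the Euclidean distance, the toy database, and the hand-picked feature indices in `CLASS_FEATURES` and `SUBCLASS_FEATURES` (in the paper, the per-class subsets would come from FSSEM, which is not implemented here).

```python
import math

def distance(a, b, features):
    """Euclidean distance restricted to the given feature indices."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in features))

def classify(query, database, class_features):
    """Step 1: predict the class label (assumed here: a 1-nearest-neighbour rule)."""
    nearest = min(database, key=lambda item: distance(query, item["x"], class_features))
    return nearest["label"]

def retrieve(query, database, class_features, subclass_features, k=2):
    """Step 2: rank images of the predicted class using that class's own features."""
    label = classify(query, database, class_features)
    features = subclass_features[label]
    candidates = [item for item in database if item["label"] == label]
    candidates.sort(key=lambda item: distance(query, item["x"], features))
    return label, candidates[:k]

# Toy database, invented for illustration only.
DATABASE = [
    {"id": "a1", "label": "A", "x": [0.1, 0.9, 0.2]},
    {"id": "a2", "label": "A", "x": [0.2, 0.1, 0.3]},
    {"id": "a3", "label": "A", "x": [0.0, 0.5, 0.9]},
    {"id": "b1", "label": "B", "x": [0.9, 0.8, 0.7]},
    {"id": "b2", "label": "B", "x": [1.0, 0.2, 0.1]},
]
CLASS_FEATURES = [0]                       # hypothetical: separates class A from B
SUBCLASS_FEATURES = {"A": [1], "B": [2]}   # hypothetical: customized per class

label, hits = retrieve([0.15, 0.85, 0.5], DATABASE, CLASS_FEATURES, SUBCLASS_FEATURES)
print(label, [h["id"] for h in hits])
```

The point of the design is that the features used for ranking in step 2 may differ from those used for classification in step 1, and may differ from one class to the next.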
Image retrieval, feature selection, clustering, expectation-maximization, unsupervised learning.
J. G. Dy, C. E. Brodley, A. M. Aisen, A. Kak and L. S. Broderick, "Unsupervised Feature Selection Applied to Content-Based Retrieval of Lung Images," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 25, no. 3, pp. 373-378, 2003.