Issue No. 07 - July (2016 vol. 28)
Nikolaos Passalis, Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Anastasios Tefas, Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
In this paper, we present a supervised dictionary learning method for optimizing the feature-based Bag-of-Words (BoW) representation for Information Retrieval. Following the cluster hypothesis, which states that points in the same cluster are likely to fulfill the same information need, we propose an entropy-based optimization criterion that is better suited to retrieval than to classification. We demonstrate the ability of the proposed method, abbreviated as EO-BoW, to improve retrieval performance through extensive experiments on two multi-class image datasets. Because the BoW model can be applied to other domains as well, we also evaluate our approach on a collection of 45 time-series datasets, a text dataset, and a video dataset. The gains are three-fold: the EO-BoW improves the mean Average Precision while reducing both the encoding time and the database storage requirements. Finally, we provide evidence that the EO-BoW maintains its representation ability even when used to retrieve objects from classes that were not seen during training.
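As a rough illustration of the two ingredients named in the abstract, the following minimal sketch encodes a set of feature vectors as a BoW histogram and measures how class-pure a clustering of objects is by its entropy. It is not the EO-BoW method itself: the hard nearest-codeword assignment, the size-weighted mean entropy, and the names `bow_histogram`, `cluster_entropy`, and `mean_weighted_entropy` are assumptions made for this example, and the abstract does not specify the exact quantization or objective.

```python
import math
from collections import Counter


def bow_histogram(features, codebook):
    """Assign each feature vector to its nearest codeword (squared Euclidean
    distance) and return the L1-normalized histogram of assignments.
    Hard assignment is an assumption of this sketch."""
    counts = [0] * len(codebook)
    for f in features:
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the class labels inside one cluster.
    Zero means every member of the cluster shares the same class."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mean_weighted_entropy(cluster_ids, labels):
    """Average cluster entropy, weighted by cluster size (hypothetical
    aggregate chosen for illustration). Lower values mean purer clusters."""
    clusters = {}
    for cid, lab in zip(cluster_ids, labels):
        clusters.setdefault(cid, []).append(lab)
    n = len(labels)
    return sum(len(m) / n * cluster_entropy(m) for m in clusters.values())


# Toy usage: two codewords, four 2-D feature vectors from one object.
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.8), (0.2, 0.1)]
hist = bow_histogram(features, codebook)

# Pure clusters versus fully mixed clusters over four labeled objects.
pure = mean_weighted_entropy([0, 0, 1, 1], ["a", "a", "b", "b"])
mixed = mean_weighted_entropy([0, 0, 1, 1], ["a", "b", "a", "b"])
```

Under the cluster hypothesis, a dictionary that yields lower entropy groups objects answering the same information need more tightly, which is the intuition behind preferring such a criterion for retrieval.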
Dictionaries, Feature extraction, Histograms, Training, Time series analysis, Entropy, Optimization
N. Passalis and A. Tefas, "Entropy Optimized Feature-Based Bag-of-Words Representation for Information Retrieval," in IEEE Transactions on Knowledge & Data Engineering, vol. 28, no. 7, pp. 1664-1677, 2016.