Sixth IEEE International Conference on Data Mining (ICDM'06)
Dirichlet Aspect Weighting: A Generalized EM Algorithm for Integrating External Data Fields with Semantically Structured Queries by Using Gradient Projection Method
Hong Kong
December 18-December 22
ISBN: 0-7695-2701-9
Atulya Velivelli, University of Illinois at Urbana-Champaign, USA
Thomas S. Huang, University of Illinois at Urbana-Champaign, USA
In this paper we address the problem of document retrieval with semantically structured queries, that is, queries in which each term carries a tagged field label. We introduce the Dirichlet Aspect Weighting model, which integrates terms from external databases into the query language model within a Bayesian learning framework. In this model, the Dirichlet prior distribution is governed by parameters that depend on the number of fields in the external databases. The model requires additional examples to augment the semantically structured query; these examples are obtained using pseudo relevance feedback. We formulate a log-likelihood function for the Dirichlet Aspect Weighting model and maximize it using a novel Generalized EM algorithm. On the TREC 2005 Genomics Track dataset, the Dirichlet Aspect Weighting model shows an improvement over baseline methods that use pseudo relevance feedback while incorporating terms from external databases.
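The abstract names two ingredients, a Generalized EM algorithm and a gradient projection method, without giving the update equations. The sketch below is not the paper's algorithm. It is a generic illustration of how the two ideas can combine, under assumptions that do not come from the source: the aspect (field) likelihoods `lik` are fixed and known, only the mixture weights `w` are learned, and the Dirichlet parameters `alpha` are all at least 1. All function and variable names are hypothetical. The E-step computes responsibilities, and the partial M-step takes one projected gradient step on the simplex, halving the step size until the expected complete-data objective does not decrease.

```python
import math

def project_to_simplex(v, z=1.0):
    """Euclidean projection of v onto {u : u >= 0, sum(u) = z}."""
    u = sorted(v, reverse=True)
    css, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        css += u_k
        t = (css - z) / k
        if u_k - t > 0:
            theta = t  # keep the threshold from the largest valid k
    return [max(x - theta, 0.0) for x in v]

def e_step(w, lik):
    """Per-aspect sums of posterior responsibilities over all examples."""
    totals = [0.0] * len(w)
    for row in lik:
        joint = [w_k * p for w_k, p in zip(w, row)]
        norm = sum(joint)
        for k, j in enumerate(joint):
            totals[k] += j / norm
    return totals

def q_value(w, coef):
    """Expected complete-data log posterior, up to terms constant in w."""
    return sum(c * math.log(x) for c, x in zip(coef, w))

def gem_step(w, lik, alpha, step=1.0, eps=1e-6):
    """One generalized EM step: E-step, then a projected gradient M-step."""
    resp = e_step(w, lik)
    coef = [r + a - 1.0 for r, a in zip(resp, alpha)]
    grad = [c / x for c, x in zip(coef, w)]
    q_old = q_value(w, coef)
    radius = 1.0 - len(w) * eps  # keep every weight >= eps so log is defined
    while step > 1e-12:
        shifted = [x + step * g - eps for x, g in zip(w, grad)]
        cand = [eps + u for u in project_to_simplex(shifted, radius)]
        if q_value(cand, coef) >= q_old:  # GEM only needs a non-decrease
            return cand
        step *= 0.5
    return w

def log_posterior(w, lik, alpha):
    """Observed-data log likelihood plus Dirichlet log prior (unnormalized)."""
    ll = sum(math.log(sum(w_k * p for w_k, p in zip(w, row))) for row in lik)
    prior = sum((a - 1.0) * math.log(x) for a, x in zip(alpha, w))
    return ll + prior

# Toy data (made up for illustration): 4 examples, 3 aspects.
lik = [[0.9, 0.1, 0.2], [0.8, 0.3, 0.1], [0.2, 0.7, 0.1], [0.6, 0.2, 0.2]]
alpha = [2.0, 2.0, 2.0]
w = [1.0 / 3, 1.0 / 3, 1.0 / 3]
history = [log_posterior(w, lik, alpha)]
for _ in range(50):
    w = gem_step(w, lik, alpha)
    history.append(log_posterior(w, lik, alpha))
```

Because each step only needs to avoid decreasing the expected objective, the values stored in `history` should be non-decreasing, which is the property that distinguishes a generalized EM step from a full M-step.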
Citation:
Atulya Velivelli, Thomas S. Huang, "Dirichlet Aspect Weighting: A Generalized EM Algorithm for Integrating External Data Fields with Semantically Structured Queries by Using Gradient Projection Method," Sixth IEEE International Conference on Data Mining (ICDM'06), pp. 633-644, 2006