International Conference on Semantic Computing (ICSC 2007)
LDA-Based Retrieval Framework for Semantic News Video Retrieval
Irvine, California
September 17-September 19
ISBN: 0-7695-2997-6
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICSC.2007.26
Topic-based language models have attracted much attention in recent years with the rise of semantic retrieval. For ASR text, which contains recognition errors, a topic representation is more reasonable than an exact term representation. Among these models, Latent Dirichlet Allocation (LDA) is noted for its ability to discover latent topic structure and is broadly applied in many text-related tasks. Up to now, however, its application in information retrieval (IR) has been limited to supplementing standard document models, and it has been pointed out that directly employing the basic LDA model hurts retrieval performance. In this paper, we propose a lexicon-guided two-level LDA retrieval framework. It uses HowNet to guide the parameter estimation of the first-level LDA model, and then constructs the second-level LDA models based on the first level's inference results. We evaluate the framework on the TRECVID 2005 ASR collection and compare it with the vector space model (VSM) and latent semantic indexing (LSI). Our experiments show that the proposed method is very competitive.
Index Terms:
ASR text, LDA, Topic-based model, Semantic video retrieval
Citation:
Juan Cao, Jintao Li, Yongdong Zhang, Sheng Tang, "LDA-Based Retrieval Framework for Semantic News Video Retrieval," icsc, pp.155-160, International Conference on Semantic Computing (ICSC 2007), 2007