Answering General Time-Sensitive Queries
February 2012 (vol. 24 no. 2)
pp. 220-235
Wisam Dakka, Google, California
Luis Gravano, Columbia University, New York
Panagiotis G. Ipeirotis, New York University, New York
Time is an important dimension of relevance for a large number of searches, such as those over blogs and news archives. So far, research on searching over such collections has largely focused on locating topically similar documents for a query. Unfortunately, topic similarity alone is not always sufficient for document ranking. In this paper, we observe that, for an important class of queries that we call time-sensitive queries, the publication time of the documents in a news archive is important and should be considered in conjunction with the topic similarity to derive the final document ranking. Earlier work has focused on improving retrieval for “recency” queries that target recent documents. We propose a more general framework for handling time-sensitive queries, in which we automatically identify the important time intervals that are likely to be of interest for a query. Then, we build scoring techniques that seamlessly integrate the temporal aspect into the overall ranking mechanism. We present an extensive experimental evaluation using a variety of news article data sets, including TREC data as well as real web data analyzed using Amazon Mechanical Turk. We examine several techniques for detecting the important time intervals for a query over a news archive and for incorporating this information in the retrieval process. We show that our techniques are robust and significantly improve result quality for time-sensitive queries compared to state-of-the-art retrieval techniques.
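
The general idea in the abstract, combining topic similarity with the publication time of documents, can be illustrated with a minimal sketch. This is not the scoring technique of the paper: the function name temporal_rerank, the monthly binning, the top_k cutoff, and the linear mixing weight are all assumptions made for illustration. The sketch treats the months in which the top topical matches were published as the "important" intervals and boosts documents that fall in them.

```python
from collections import Counter
from datetime import date


def temporal_rerank(results, top_k=3, weight=0.5):
    """Rerank (doc_id, topical_score, pub_date) tuples.

    Hypothetical illustration only: months that are frequent among the
    top_k topical matches are treated as important time intervals.
    """
    # Order by topical score alone, as a time-agnostic baseline would.
    ranked = sorted(results, key=lambda r: r[1], reverse=True)

    # Histogram of publication months among the top_k topical matches.
    counts = Counter((d.year, d.month) for _, _, d in ranked[:top_k])
    total = sum(counts.values())
    if total == 0:
        # Nothing to estimate the temporal profile from.
        return ranked

    def combined(r):
        _, score, d = r
        # Fraction of top matches published in this document's month.
        temporal = counts.get((d.year, d.month), 0) / total
        # Linear mix of topical and temporal evidence (an assumption).
        return (1 - weight) * score + weight * temporal

    return sorted(results, key=combined, reverse=True)


docs = [
    ("a", 0.90, date(2004, 3, 1)),
    ("b", 0.80, date(2004, 3, 15)),
    ("c", 0.70, date(1999, 6, 1)),
    ("d", 0.65, date(2004, 3, 20)),
]
print([doc_id for doc_id, _, _ in temporal_rerank(docs)])
```

In this toy input, document "d" has a lower topical score than "c" but was published in the month that dominates the top matches, so the combined score ranks it higher.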

Index Terms:
Information search and retrieval, processing time-sensitive queries, time-sensitive search.
Citation:
Wisam Dakka, Luis Gravano, Panagiotis G. Ipeirotis, "Answering General Time-Sensitive Queries," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 2, pp. 220-235, Feb. 2012, doi:10.1109/TKDE.2010.187