Issue No.04 - April (2011 vol.23)
pp: 540-553
Abhijith Kashyap, SUNY Buffalo, Buffalo
Michalis Petropoulos, SUNY Buffalo, Buffalo
Sotiria Tavoulari, Yale University, New Haven
ABSTRACT
Search queries on biomedical databases, such as PubMed, often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Results categorization for biomedical databases is the focus of this work. A natural way to organize biomedical citations is according to their MeSH annotations. MeSH is a comprehensive concept hierarchy used by PubMed. In this paper, we present the BioNav system, a novel search interface that enables the user to navigate a large number of query results by organizing them using the MeSH concept hierarchy. First, the query results are organized into a navigation tree. At each node expansion step, BioNav reveals only a small subset of the concept nodes, selected such that the expected user navigation cost is minimized. In contrast, previous works expand the hierarchy in a predefined static manner, without navigation cost modeling. We show that the problem of selecting the best concepts to reveal at each node expansion is NP-complete and propose an efficient heuristic as well as a feasible optimal algorithm for relatively small trees. We show experimentally that BioNav outperforms state-of-the-art categorization systems by up to an order of magnitude, with respect to the user navigation cost. BioNav for the MEDLINE database is available at http://db.cse.buffalo.edu/bionav.
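The first step described above, organizing query results into a navigation tree over a concept hierarchy, can be illustrated with a minimal Python sketch. It is not BioNav's implementation: the `ConceptNode` class, the function names, the toy hierarchy, and the citation ids `c1` to `c3` are all hypothetical. It also does not attempt the cost-minimizing choice of which concepts to reveal at each expansion; it only attaches results to concepts, prunes empty branches, and counts distinct citations per subtree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ConceptNode:
    """One concept in a toy hierarchy (hypothetical, not BioNav's structure)."""
    label: str
    children: List["ConceptNode"] = field(default_factory=list)
    citations: Set[str] = field(default_factory=set)  # results annotated here

    def subtree_citations(self) -> Set[str]:
        """Distinct citations attached to this node or any descendant."""
        found = set(self.citations)
        for child in self.children:
            found |= child.subtree_citations()
        return found


def build_navigation_tree(
    node: ConceptNode, annotations: Dict[str, Set[str]]
) -> Optional[ConceptNode]:
    """Attach query results to concepts and prune branches with no results.

    `annotations` maps a citation id to the concept labels it carries.
    Returns a new node, or None if the whole subtree holds no results.
    """
    kept = []
    for child in node.children:
        pruned = build_navigation_tree(child, annotations)
        if pruned is not None:
            kept.append(pruned)
    direct = {cid for cid, labels in annotations.items() if node.label in labels}
    if not direct and not kept:
        return None
    return ConceptNode(node.label, kept, direct)


def outline(node: ConceptNode, depth: int = 0) -> List[str]:
    """Render the tree as indented 'label (distinct citation count)' lines."""
    lines = ["  " * depth + f"{node.label} ({len(node.subtree_citations())})"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Toy data: illustrative labels and made-up citation ids.
hierarchy = ConceptNode("MeSH", [
    ConceptNode("Diseases", [ConceptNode("Neoplasms"), ConceptNode("Infections")]),
    ConceptNode("Chemicals"),
])
toy_annotations = {
    "c1": {"Neoplasms"},
    "c2": {"Neoplasms", "Infections"},
    "c3": {"Diseases"},
}

tree = build_navigation_tree(hierarchy, toy_annotations)
print("\n".join(outline(tree)))
```

In this toy run the `Chemicals` branch is pruned because no result carries that label, and `c2` is counted once under `Diseases` even though it appears under two child concepts. That kind of duplication across concepts is one reason a static expansion of the full hierarchy can be costly to navigate.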
INDEX TERMS
Interactive data exploration and discovery, search process, graphical user interfaces, interaction styles.
CITATION
Abhijith Kashyap, Michalis Petropoulos, Sotiria Tavoulari, "Effective Navigation of Query Results Based on Concept Hierarchies", IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 4, pp. 540-553, April 2011, doi:10.1109/TKDE.2010.135