1999 International Symposium on Database Applications in Non-Traditional Environments (DANTE'99)
Text Retrieval by Using k-word Proximity Search
Kyoto, Japan
November 28-November 30
ISBN: 0-7695-0496-5
BibTeX:
@article{10.1109/DANTE.1999.844958,
  author = {Kunihiko Sadakane and Hiroshi Imai},
  title = {Text Retrieval by Using k-word Proximity Search},
  journal = {Database Applications in Non-Traditional Environments, International Symposium on},
  volume = {0},
  year = {1999},
  isbn = {0-7695-0496-5},
  pages = {183},
  doi = {http://doi.ieeecomputersociety.org/10.1109/DANTE.1999.844958},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
When searching a huge collection of documents, we often specify several keywords and use conjunctive queries to narrow the search result. Although the retrieved documents contain all the keywords, the positions of the keywords are usually not considered. As a result, the search result contains some meaningless documents. It is therefore effective to rank documents according to the proximity of the keywords within them. This ranking can be regarded as a kind of text data mining. In this paper, we propose two algorithms for finding documents in which all given keywords appear close to one another. One is based on a plane-sweep algorithm and the other on a divide-and-conquer approach. Both algorithms run in O(n log n) time, where n is the number of occurrences of the given keywords. We run the plane-sweep algorithm on a large collection of HTML files and verify its effectiveness.
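To make the idea of keyword proximity concrete, here is a minimal sketch of one way a left-to-right sweep over keyword positions could find the shortest span of text containing every keyword. It is not taken from the paper: the function name `smallest_window`, the input layout (a dictionary mapping each keyword to its word offsets in one document), and the use of the span width as a proximity score are all assumptions made for illustration. Sorting the n occurrences costs O(n log n); the sweep itself is linear.

```python
from typing import Dict, List, Optional, Tuple


def smallest_window(positions: Dict[str, List[int]]) -> Optional[Tuple[int, int]]:
    """Return the shortest (start, end) span of positions covering every keyword.

    `positions` is a hypothetical input format: keyword -> offsets in one document.
    """
    k = len(positions)
    if k == 0 or any(not plist for plist in positions.values()):
        return None  # some keyword never occurs, so there is no conjunctive match

    # Merge all occurrences into one list sorted by position: O(n log n).
    events = sorted((pos, word) for word, plist in positions.items() for pos in plist)

    counts = {word: 0 for word in positions}  # occurrences inside the current window
    covered = 0                               # distinct keywords inside the window
    best: Optional[Tuple[int, int]] = None
    left = 0
    for right_pos, right_word in events:
        counts[right_word] += 1
        if counts[right_word] == 1:
            covered += 1
        # Shrink from the left while the window still holds all k keywords.
        while covered == k:
            left_pos, left_word = events[left]
            if best is None or right_pos - left_pos < best[1] - best[0]:
                best = (left_pos, right_pos)
            counts[left_word] -= 1
            if counts[left_word] == 0:
                covered -= 1
            left += 1
    return best


doc = {"text": [1, 20, 31], "retrieval": [5, 22], "proximity": [30, 40]}
print(smallest_window(doc))  # (22, 31)
```

Under these assumptions, documents could be ranked by the width of the returned span (end minus start), with narrower spans ranked higher and documents returning `None` excluded.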
Index Terms:
proximity search, text retrieval, plane-sweep, divide-and-conquer
Citation:
Kunihiko Sadakane, Hiroshi Imai, "Text Retrieval by Using k-word Proximity Search," Database Applications in Non-Traditional Environments, International Symposium on, pp. 183, 1999 International Symposium on Database Applications in Non-Traditional Environments (DANTE'99), 1999.
