Issue No. 03 - May/June (2002 vol. 14)
<p><b>Abstract</b>—This paper presents an intelligent Internet information system, Automatic Classifier for the Internet Resource Discovery (ACIRD), which uses machine learning techniques to organize and retrieve Internet documents. ACIRD consists of a knowledge acquisition process, a document classifier, and a two-phase search engine. The knowledge acquisition process automatically learns classification knowledge from classified Internet documents. The document classifier applies the learned classification knowledge to classify newly collected Internet documents into one or more classes. Experimental results indicate that ACIRD performs as well as or better than human experts in both knowledge acquisition and document classification. Using the learned classification knowledge and the given class lattice, the ACIRD two-phase search engine responds to user queries with hierarchically structured, navigable results (instead of a conventional flat ranked document list), which greatly aids users in locating information among numerous, diverse Internet documents.</p>
Document classification, data mining, information retrieval, search engine
S. Lin, M. Chen, J. Ho and Y. Huang, "ACIRD: Intelligent Internet Document Organization and Retrieval," in IEEE Transactions on Knowledge & Data Engineering, vol. 14, no. 3, pp. 599-614, 2002.