First IEEE International Conference on Data Mining (ICDM'01)
Subject Classification in the Oxford English Dictionary
San Jose, California
November 29-December 02
ISBN: 0-7695-1119-8
The Oxford English Dictionary is a valuable source of lexical information and a rich testing ground for mining highly structured text. Each entry is organized into a hierarchy of senses, which include definitions, labels and cited quotations. Subject labels distinguish the subject classification of a sense; for example, they signal how a word may be used in Anthropology, Music or Computing. Unfortunately, subject labeling in the dictionary is incomplete. To overcome this incompleteness, we attempt to classify the senses (i.e., definitions) in the dictionary by their subjects, using the citations as an information guide. We report on four different approaches: K Nearest Neighbors, a standard classification technique; Term Weighting, an information retrieval method dealing with text; Naïve Bayes, a probabilistic method; and Expectation Maximization, an iterative probabilistic method. Experimental performance of these methods is compared based on standard classification metrics.
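As a rough illustration of the task described in the abstract, the following is a minimal sketch of one of the four named approaches, Naïve Bayes, applied to citation text. It is not the authors' implementation: the multinomial model with add-one smoothing is an assumed variant, the function names (`tokenize`, `train`, `classify`) are hypothetical, and the training sentences are invented toy examples, not quotations from the dictionary.

```python
import math
from collections import Counter, defaultdict

PUNCT = '.,;:!?"\'()'

def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(PUNCT).lower() for w in text.split() if w.strip(PUNCT)]

def train(labeled):
    """Build a multinomial Naive Bayes model from (subject, citation_text) pairs."""
    doc_counts = Counter()              # number of labeled senses per subject
    word_counts = defaultdict(Counter)  # word frequencies per subject
    vocab = set()
    for subject, text in labeled:
        doc_counts[subject] += 1
        for word in tokenize(text):
            word_counts[subject][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab

def classify(model, text):
    """Return the subject with the highest log posterior for the given text."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_subject, best_score = None, -math.inf
    for subject in sorted(doc_counts):
        # Log prior from the share of labeled senses with this subject.
        score = math.log(doc_counts[subject] / total_docs)
        total_words = sum(word_counts[subject].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[subject][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_subject, best_score = subject, score
    return best_subject

# Invented toy data standing in for subject-labeled senses and their citations.
TRAIN = [
    ("Music", "the violin played a slow melody in the concert"),
    ("Music", "a chord on the piano opens the sonata"),
    ("Computing", "the program stores each byte in memory"),
    ("Computing", "the compiler reads the program and emits code"),
]
MODEL = train(TRAIN)
```

With this toy model, `classify(MODEL, "the sonata ends with a quiet melody")` returns `"Music"`, and `classify(MODEL, "the compiler stores the program")` returns `"Computing"`. In the setting the abstract describes, the labeled senses would supply the training pairs and the unlabeled senses would be classified from their citations.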
Citation:
Zarrin Langari, Frank Wm. Tompa, "Subject Classification in the Oxford English Dictionary," icdm, pp.329, First IEEE International Conference on Data Mining (ICDM'01), 2001