Issue No.02 - March/April (2009 vol.24)
pp: 64-75
Tuukka Ruotsalo, Helsinki University of Technology
Lora Aroyo, Vrije Universiteit Amsterdam
Guus Schreiber, Vrije Universiteit Amsterdam
ABSTRACT
The authors present a method for automatic annotation of objects in digital cultural heritage collections. Given a set of objects each accompanied by a text description, a set of structured vocabularies, a metadata schema, and a training set of annotations of the text descriptions, the method produces annotations for the objects. These annotations consist of structured vocabulary concepts or named entities (for example, Paris as a city) and metadata schema roles that each concept plays in an annotation (for example, Paris as a subject matter). The method focuses on identifying the metadata schema roles. The authors have evaluated the method using the ARIA collection from Rijksmuseum Amsterdam. The evaluation used four structured vocabularies, an artwork annotation schema, and a collection of natural language descriptions of artworks. The method achieved 61.2 percent accuracy in role identification, outperforming the baseline method without background knowledge (p < 0.01), which achieved 57.8 percent accuracy. Human annotators achieved 65.1 percent accuracy.
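As an illustration only, the kind of annotation the abstract describes (a concept or named entity paired with the metadata schema role it plays) and the role-identification accuracy measure can be sketched as follows. This is a minimal sketch, not the authors' implementation: the names `Annotation` and `role_accuracy`, the field names, and the role label "creation place" are hypothetical, and the sketch assumes that predicted and gold annotations are aligned one to one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotation: a concept or named entity plus the schema role it plays."""
    concept: str       # e.g. "Paris"
    concept_type: str  # e.g. "city" (hypothetical type label)
    role: str          # e.g. "subject matter" (role names are assumptions)


def role_accuracy(predicted, gold):
    """Return the fraction of annotations whose predicted role matches the gold role.

    Assumes predicted[i] and gold[i] describe the same concept mention.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p.role == g.role for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical data: the same concept can play different roles in different annotations.
gold = [
    Annotation("Paris", "city", "subject matter"),
    Annotation("Paris", "city", "creation place"),
]
predicted = [
    Annotation("Paris", "city", "subject matter"),
    Annotation("Paris", "city", "subject matter"),
]
print(role_accuracy(predicted, gold))  # one of the two roles matches
```

Under this reading, the reported accuracies (61.2 percent for the method, 57.8 percent for the baseline, 65.1 percent for human annotators) would each be a ratio of this kind computed over the evaluation set.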
INDEX TERMS
natural language processing, intelligent Web services, Semantic Web, machine learning, cultural heritage
CITATION
Tuukka Ruotsalo, Lora Aroyo, Guus Schreiber, "Knowledge-Based Linguistic Annotation of Digital Cultural Heritage Collections", IEEE Intelligent Systems, vol.24, no. 2, pp. 64-75, March/April 2009, doi:10.1109/MIS.2009.32