Issue No.02 - March/April (2009 vol.24)
Tuukka Ruotsalo, Helsinki University of Technology
Lora Aroyo, Vrije Universiteit Amsterdam
Guus Schreiber, Vrije Universiteit Amsterdam
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2009.32
The authors present a method for automatic annotation of objects in digital cultural heritage collections. Given a set of objects, each accompanied by a text description, a set of structured vocabularies, a metadata schema, and a training set of annotations of the text descriptions, the method produces annotations for the objects. These annotations consist of structured vocabulary concepts or named entities (for example, Paris as a city) and the metadata schema role that each concept plays in an annotation (for example, Paris as a subject matter). The method focuses on identifying the metadata schema roles. The authors evaluated the method on the ARIA collection from Rijksmuseum Amsterdam, using four structured vocabularies, an artwork annotation schema, and a collection of natural language descriptions of artworks. The method achieved 61.2 percent accuracy in role identification, outperforming the baseline method without background knowledge, which achieved 57.8 percent accuracy (p < 0.01). Human annotators achieved 65.1 percent accuracy.
natural language processing, intelligent Web services, Semantic Web, machine learning, cultural heritage
Tuukka Ruotsalo, Lora Aroyo, Guus Schreiber, "Knowledge-Based Linguistic Annotation of Digital Cultural Heritage Collections," IEEE Intelligent Systems, vol. 24, no. 2, pp. 64-75, March/April 2009, doi:10.1109/MIS.2009.32