, Wayne State University
, University of Michigan-Dearborn
, Kettering University
Pages: 16-17
Abstract—The Semantic Web enables programs and agents to automatically understand what data is about, and therefore bridge the so-called semantic gap between the ways in which users request Web resources and the real needs of those users, ultimately improving the quality of Web information retrieval. This issue presents four expanded articles from The First International Workshop on the Many Faces of Multimedia Semantics.
Keywords—Semantic Web, Web 2.0, Guest Editors' Introduction, multimedia and graphics
Information is increasingly ubiquitous and pervasive, with the Web now serving as its primary repository. However, the rapid growth in the amount of information on the Web creates new challenges for information retrieval. Recently, there has been growing interest in the investigation and development of the next-generation Web: the Semantic Web. The Semantic Web enables programs and agents to automatically understand what data is about, and therefore bridge the so-called semantic gap between the ways in which users request Web resources and the real needs of those users, ultimately improving the quality of Web information retrieval.
Multimedia information has always been part of the Semantic Web paradigm, but requires substantial effort to integrate both domain-dependent and media-dependent knowledge. The World Wide Web Consortium's incubator group on multimedia semantics (see http://www.w3.org/2005/Incubator/mmsem/) published deliverables on this subject (see http://www.w3.org/2005/Incubator/mmsem/#Deliverables), including several use cases (see http://www.w3.org/2005/Incubator/mmsem/XGR-image-annotation/#use_cases/).
We believe that, in addition to trying to express a media object's hidden meaning explicitly, one should formulate ways of managing media objects to help people make more intelligent use of them. We also believe that the relationship between users and media objects should be studied closely and that media objects should be interpreted relative to the particular goal or point of view of a particular user at a particular time. Content-based descriptors are necessary for doing so.
While major search engines are rolling out audiovisual search capabilities, such content-based descriptions alone are not sufficient. Context is also important in these scenarios, and it must be managed to make such searches truly useful. In light of these issues, research teams around the world have begun working on multimedia semantics, studying the measurable interactions between users and media objects, with the ultimate goal of satisfying the user community by providing users with the media objects they require, on the basis of their previous media interactions.
The arrival of Web 2.0 has added new paradigms to the media mix. Concepts such as folksonomy, a form of emergent semantics, introduce a collaborative, dynamic approach to the generation of ontologies and media-object semantics. That such an approach can result in stable semantics, although surprising, was demonstrated at the First International Workshop on the Many Faces of Multimedia Semantics, held in conjunction with ACM Multimedia 2007 on September 28, 2007, in Augsburg, Germany. The workshop program consisted of a keynote talk and nine contributed papers in sessions entitled "The Semantics of Semantics," "Annotation," "Semantics of Video," and "Emerging Applications."
Expanded versions of four papers presented at the workshop were chosen for publication in IEEE MultiMedia. The first article, "An Ecosystem for Semantics," by Ansgar Scherp and Ramesh Jain, sets the stage by examining the different types of multimedia semantics and placing them all into a common framework. The second article, "Hybrid Tagging and Browsing Approaches for Efficient Manual Image Annotation," by Rong Yan, Apostol Natsev, and Murray Campbell, formalizes two approaches to manual annotation: tagging and browsing. The article's analysis offers interesting insights into these approaches and leads to the formulation of several hybrid annotation algorithms. In the third article, "Dynamic Pictorially Enriched Ontologies for Digital Video Libraries," by Marco Bertini, Alberto Del Bimbo, Giuseppe Serra, Carlo Torniai, Rita Cucchiara, Costantino Grana, and Roberto Vezzani, a standard linguistically oriented ontology is enriched with visually oriented constructs. In this approach, linguistic domain concepts correspond to various visual instances, which aid the overall annotation process. The final article, "Interlinking Music-Related Data on the Web," by Yves Raimond, Christopher Sutton, and Mark Sandler, describes the construction of a set of tools for annotating musical data so that various data sources can be integrated.