Issue No. 04 - October-December (2009 vol. 16)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MMUL.2009.101
Richard Chbeir, Bourgogne University
Harald Kosch, University of Passau
Frederic Andres, National Institute of Informatics
Hiroshi Ishikawa, Shizuoka University
Our special issue on multimedia metadata and semantic management presents new research that focuses on contextualized, ubiquitous, accessible, and interoperable services on a higher semantic level. The multimedia metadata description standards—for example, MPEG-7/21—proposed in recent years have added an important aspect to the creation of multimedia content and related semantic management.
Most of the standards are tailored to specific application domains. Examples include European Broadcasting Union P/Meta 2.0 for broadcasting; TV-Anytime and the Society of Motion Picture and Television Engineers Metadata Dictionary for TV; and MPEG-21 for the delivery chain of multimedia and technical aspects (such as the Exchangeable Image File Format, or EXIF) [2]. These standards exhibit different semantic levels of detail in their descriptions (from simple keywords to regulated taxonomies and ontologies). Only some of the standards, for instance MPEG-7, are general purpose.
Recently, the application of Semantic Web technologies to multimedia content has brought forward the use of the Resource Description Framework and Topic Maps for the description of multimedia metadata [6]. Their use might contribute to better interoperability in retrieving multimedia content and is therefore supported by recent standardization efforts: in JPEG, the JPSearch initiative for image retrieval (ISO/IEC SC29 WG1 and 24800), and the World Wide Web Consortium Media Annotations Working Group (see http://www.w3.org/2008/WebVideo/Annotations/).
To retrieve content interoperably over distributed databases, appropriate middleware technology must be employed. First, we must bridge heterogeneity in metadata description and query languages; second, we must cleverly aggregate and concisely present results to the user [3,5]; and last, we must provide the security and related access-control techniques according to the multimedia content [1]. Synchronizing the metadata information with the media, and vice versa, is another challenge for effective multimedia management and must be addressed at all levels of the metadata life cycle [4]. This challenge includes dealing with time synchronization as well as synchronized access to streamed and stored (possibly distributed) media and metadata.
After a tight review process, we accepted six original research articles for this special issue, chosen out of 39 candidate articles initially submitted. The selected works reflect the high standards for excellence used by the many esteemed review board members who contributed to this special issue.
This special issue assesses the current status and technologies and describes major challenges and proper solutions for effective multimedia production and management related to evolving Semantic Web strategies. The included articles, which cover different facets of the semantic management of multimedia and multimedia metadata from retrieval and processing to consumption and presentation, represent a step forward in research targeted at improving aspects of the semantic metadata life cycle.
The first two articles deal with querying distributed semantic multimedia databases. In "Managing and Querying Distributed, Multimedia Metadata," Laborie, Manzat, and Sèdes propose an original model of a centralized metadata resume, that is, a concise version of the whole metadata, which locates some desired multimedia content on remote servers and databases. In addition, the authors propose an automatic construction process for the metadata resume. They demonstrate the framework with current Semantic Web technologies for representing and querying semantic metadata. Their experimental results show the benefits of retrieving multimedia content using the metadata resume.
Döller et al., in an article entitled "Semantic MPEG Query Format Validation and Processing," describe the semantic validation of MPEG Query Format queries and the implementation of an MPQF query engine on top of an Oracle database management system. MPQF enables interoperable querying among heterogeneous databases using different metadata standards for the description of multimedia content. This article introduces methods for evaluating MPQF semantic-validation rules not expressed by syntactic means within the XML schema. The authors highlight a prototype implementation of an MPQF-capable processing engine using QueryByFreeText, QueryByXQuery, QueryByDescription, and QueryByMedia query types on a set of MPEG-7 based image annotations.
In line with the theme addressed in the previous articles, image retrieval systems must manage the diversity of the retrieval results as well as their relevance. In their article "Diversifying Image Retrieval with Affinity-Propagation Clustering on Visual Manifolds," Zhao and Glotin present a post-processing system that improves the diversity of retrieval results. They base their method on affinity-propagation clustering on manifolds, with parameters optimized by minimizing the Davies-Bouldin criterion, without reducing the relevance of the results. The authors present results based on the Image Retrieval in Cross Language Evaluation Forum (ImageCLEF) 2008 photo task.
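For readers unfamiliar with the Davies-Bouldin criterion, the following minimal Python sketch computes the index for a given clustering: for each cluster it takes the worst ratio of combined within-cluster scatter to between-centroid distance, then averages over clusters, so lower values indicate more compact and better-separated clusters. The function name `davies_bouldin` and the input layout (tuples of coordinates plus a parallel list of labels) are assumptions made for illustration; this is not Zhao and Glotin's implementation, and it does not include the affinity-propagation step.

```python
import math
from collections import defaultdict


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a clustering (lower is better).

    points: list of equal-length coordinate tuples (hypothetical layout).
    labels: list of cluster labels, one per point.
    Assumes no two clusters share exactly the same centroid.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Centroid of each cluster (component-wise mean).
    centroids = {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in clusters.items()
    }
    # Scatter: mean Euclidean distance of members to their centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst-case similarity between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)
```

Under these assumptions, a parameter search would cluster the data once per candidate setting and keep the setting whose clustering yields the smallest index.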
The complete multimedia value chain is the subject of Rodriguez-Doncel and Delgado's article, "A Media Value Chain Ontology for MPEG-21." The authors specify a semantic representation of intellectual property using MPEG-21 Part 19. This model defines the minimal set of types of intellectual property, the roles of users interacting with them, and the relevant actions regarding intellectual property law. The article explains the standardization efforts using many examples and offers insight into the multimedia value chain.
In "Using Social Networking and Collections to Enable Video Semantics Acquisition," Davis, Ritz, and Burnett consider the first elements (media production, acquisition, and metadata gathering) of the multimedia value chain. The authors bring together methods from video-content annotation and social networking to solve problems associated with gathering metadata that describes user interaction, usage, and opinions of video content. The individual user-interaction metadata is then aggregated to form semantic metadata for a given video. The authors have successfully implemented the techniques in a custom Flex application based around the popular Facebook API.
The article by Lin, Chen, and Ma, entitled "A Web-Based Music Lecture Database Framework," focuses on semantic audio authoring and presentation in the context of retrieving and presenting music lectures on the Web. For a synchronized presentation between a score and recorded performance audio, the authors propose a dynamic programming-based algorithm for MIDI-to-Wave alignment to explore the temporal relations between MIDI and the corresponding performance recording. The aligned MIDI and wave can be attached to many kinds of teaching materials to form a synchronized presentation. The benefit of this method is that learners can read music scores and get instructional information when listening to certain sections of music pieces. The authors report that a detailed questionnaire in their evaluation system captured positive responses from both engineers and musicians.
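To give a sense of what a dynamic-programming alignment between a score and a recording can look like, the sketch below uses classic dynamic time warping, a standard technique chosen here as a stand-in; the source does not specify the authors' algorithm, and this is not it. As a simplifying assumption, both sequences are lists of numbers (for example, MIDI pitch values for the score and per-frame pitch estimates for the audio), whereas a real system would compare richer audio features. The name `dtw_align` is hypothetical.

```python
def dtw_align(score, audio):
    """Align two numeric feature sequences with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching score[i] to audio[j].
    """
    n, m = len(score), len(audio)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j]: cheapest alignment of score[:i] with audio[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(score[i - 1] - audio[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance the score only
                cost[i][j - 1],      # advance the audio only
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path
```

Each pair in the returned path links a score event to an audio frame, which is the kind of temporal relation a synchronized presentation needs in order to highlight the score position during playback.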
We hope this special issue motivates researchers to take the next step beyond building models to implementing, evaluating, comparing, and extending proposed approaches. Many people helped us make this issue become a reality. We would first like to gratefully acknowledge and sincerely thank all the reviewers for their timely, insightful, and valuable comments and criticism of the manuscripts, which greatly improved the quality of the final versions. Of course, thanks are due to the authors, who provided excellent articles and timely revisions. Finally, we are grateful to the editors of IEEE MultiMedia for their trust in us, and for their effort, patience, and editorial work during the production of this special issue.
We hope you enjoy reading this stimulating collection of articles.
Richard Chbeir is an associate professor in the Laboratoire d'Electronique, Informatique et Image of the Bourgogne University, France. His research interests include multimedia and Web information retrieval; distributed, multimedia database management; and multimedia access control models. Chbeir has a PhD in computer science from INSA de Lyon, France. He is a member of the IEEE and ACM. Contact him at firstname.lastname@example.org.
Harald Kosch is a full professor at the Department of Informatics and Mathematics, University of Passau, Germany. His research interests include multimedia metadata, multimedia databases, middleware, and Internet applications. Kosch has a PhD in computer science from Ecole Normale Supérieure de Lyon, France. Contact him at Harald.Kosch@uni-passau.de.
Frederic Andres is an associate professor in the Digital Content and Media Sciences Research Division of the National Institute of Informatics, Tokyo. His research interests include distributed, semantic management systems and information ecosystems for multimedia applications and Web services. Andres has a PhD in computer science from the University of Paris VI, France. Contact him at email@example.com.
Hiroshi Ishikawa is a full professor in the Faculty of Informatics of Shizuoka University, Japan. His research interests include database and Web mining. Ishikawa has a PhD in computer science from the University of Tokyo, Japan. Contact him at firstname.lastname@example.org.