, Darmstadt University of Technology and Fraunhofer IGD
, University of Konstanz
, Graz University of Technology
Pages: pp. 20-21
Digital libraries (DLs) in general and technical or cultural preservation applications in particular offer a rich set of multimedia objects like audio, music, images, videos, and also 3D models. However, instead of handling these objects consistently as regular documents—in the same way we treat textual documents—most applications handle them differently.
Considering that textual documents are only one media type among many, it's clear that this type of document is handled quite specially. A full-text search engine lets users retrieve a specific document based on its content—that is, one or more words that appear in it. Content-based retrieval of other media types is an active research area, and in the case of 3D documents, only pilot applications exist. The deficits in handling nontextual documents are especially annoying in the present situation, where the proportion of classical text (journals, books, and so on) is decreasing. It's ever easier to create a digital image, a video, or a 3D object, but our libraries, be they classical public libraries or company-internal repositories, are not equipped with the tools to provide all the services for nonstandard documents that are available for books, journals, or technical specification sheets.
This shortfall arises because standard tasks, such as content categorization, indexing, content representation, and summarization, haven't yet been developed to the point at which DL technology could readily apply them to these document types. Instead, these tasks must be done manually, making the activity almost prohibitively expensive. For example, the Music Genome Project, which began in 2000, aims to describe songs by hundreds of attributes. Extracting the feature vector takes a specialized musical analyst around half an hour per song. Consequently, one of the pressing research challenges is to develop an adequate vocabulary to characterize the content and structure of nontextual documents, in particular for graphical 3D objects, as the key to indexing, categorization, abstracting, dissemination, and access. Even more pressing is the demand for automated categorization. This is true even for the casual user who takes pictures with a digital camera, dumps them into folders, and later can't find particular photos.
As more and more artifacts in the technical and engineering world are born digital, content categorization, abstraction, and adequate representation, all of which must coexist with long-term archival demands, become vital to all disciplines.
In the case of 3D documents, numerous outstanding research problems are on the agenda. Several types of 3D data must be incorporated equally, for example, data defining surfaces, volumes, or structured 3D objects that may comprise many parts that interact with each other. There's no widely accepted standard for a general file format for all (or perhaps a wide class of) 3D data types, let alone a standard for the corresponding and necessary data compression. This is in contrast to images and video, where industry already has a vital interest in such standards. In addition to the 3D data itself, it's important for applications to attach metadata to the data as a whole as well as to individual parts of an instance of a 3D model. Such markup methods should also be stable in the sense that they should survive shape representation conversions, such as from volumetric to surface type, and editing operations, such as cropping or cutting. Open questions remain: which types of queries should generally be allowed for 3D shape representations and how to ensure that the capability to answer such queries is passed down to objects that have been modified; how to extract and maintain a 3D object's meaning, that is, how to close the "semantic gap"; and what shape features are and in what way they are or should be invariant with respect to a suitable class of shape transformations. These are some of the pressing questions of current and upcoming research. In addition, the following items and keywords are relevant for current innovative results and concepts in the 3D documents domain:
While not all of the above issues can be covered in a single special issue, we have collected a short survey article and three original contributions that deal with many of them. In the survey "Content-Based 3D Object Retrieval," Bustos, Keim, Saupe, and Schreck give an overview of approaches for searching for similar content in 3D model databases. In particular, the feature-based approach, in which a numeric vector characterizes 3D shape and closeness in feature vector space models shape similarity, has become the most widely used paradigm for 3D search engines. The article describes the desired computational and invariance properties that shape descriptors generally should carry. The authors give examples of several successful search methods and discuss benchmarking.
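To make the feature-based paradigm concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of any article in this issue: the three-component feature vectors, the model names, and the helper functions `normalize`, `distance`, and `search` are all hypothetical, and the step that actually extracts a descriptor from 3D geometry is omitted. Scaling each vector to unit length is only one simple way to make the comparison insensitive to the descriptor's overall magnitude; real shape descriptors obtain their invariance properties in other ways.

```python
import math


def normalize(vec):
    """Scale a feature vector to unit Euclidean length.

    A zero vector is returned unchanged to avoid division by zero.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, database, k=3):
    """Return the names of the k models whose descriptors are closest to the query.

    `database` maps a model name to its (hypothetical) feature vector.
    Closeness in feature space stands in for shape similarity.
    """
    q = normalize(query)
    ranked = sorted(
        database.items(),
        key=lambda item: distance(q, normalize(item[1])),
    )
    return [name for name, _ in ranked[:k]]


# Toy database with made-up descriptors (assumption: 3 components per model).
MODELS = {
    "chair_a": [0.9, 0.1, 0.3],
    "chair_b": [0.8, 0.2, 0.35],
    "plane": [0.1, 0.9, 0.2],
    "car": [0.2, 0.3, 0.9],
}

if __name__ == "__main__":
    # Query with the descriptor of an unseen, chair-like object.
    print(search([0.85, 0.15, 0.3], MODELS, k=2))
```

A linear scan like this is adequate for a toy collection; a large repository would typically place an index structure over the feature space instead.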
A goal closely related to content-based 3D search is classification of 3D objects. For classification, the query object must be correctly assigned to a particular class of similar objects from a collection of predefined classes. One possible approach is to generate a unique shape prototype for each class and to compare the query object with the prototypes of all classes. In "Structural Shape Prototypes for the Automatic Classification of 3D Objects," Marini, Spagnuolo, and Falcidieno propose such shape prototypes. Their method uses structural descriptors, consisting of an attributed graph with nodes corresponding to suitable object parts. From the set of all such descriptors of a class of objects, a common substructure can be extracted and edited that captures the characteristic properties of the class objects. A query object then can be compared with a shape prototype by considering its graph descriptor. The approach poses a challenging optimization problem in the design of the prototypes, but it holds promise over the canonical method, in which one or several class representatives, rather than a carefully designed prototype, are selected for comparison with a query object.
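The general idea of classifying against one prototype per class can be sketched as follows. Note the substitution: this Python sketch uses plain numeric feature vectors and takes the class centroid as the prototype, which is far simpler than the attributed-graph prototypes that Marini, Spagnuolo, and Falcidieno propose. The function names `centroid`, `build_prototypes`, and `classify` and the two-component descriptors are hypothetical and serve only to illustrate the compare-with-every-prototype step.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(training):
    """Build one prototype per class.

    `training` maps a class label to the descriptors of its member objects.
    Assumption: the centroid stands in for a carefully designed prototype.
    """
    return {label: centroid(vectors) for label, vectors in training.items()}


def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest in feature space."""
    return min(
        prototypes,
        key=lambda label: math.dist(query, prototypes[label]),
    )


if __name__ == "__main__":
    # Made-up two-component descriptors for two classes.
    training = {
        "chair": [[0.9, 0.1], [0.7, 0.3]],
        "plane": [[0.1, 0.9], [0.3, 0.7]],
    }
    prototypes = build_prototypes(training)
    print(classify([0.75, 0.2], prototypes))
```

The query is compared once per class rather than once per stored object, which is the practical appeal of prototypes over comparing against individual class representatives.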
Content-based search and automatic classification for 3D models will provide very useful technologies for the industries that must deal with 3D objects. However, such automatic methods may fail or may return results that must be refined manually. In such cases, visual browsing methods might hold the key to efficient processing. In "Navigation and Discovery in 3D CAD Repositories," Pu, Kalyanaraman, Jayanti, Ramani, and Pizlo address this demand. The authors introduce a new interaction technique to intuitively and efficiently explore 3D models in a CAD repository. In their system, a freehand sketch is used initially as a query. The retrieved object that is most relevant for the query acts as an anchor from which an interactive 3D exploration begins. Objects that look similar from a particular viewpoint can easily be identified. The system promises advantages over the traditional scroll list paradigm for navigation.
In the final article of this special issue, "Managing Complex Augmented Reality Models," Schmalstieg, Schall, Wagner, Barakonyi, Reitmayr, Newman, and Ledermann propose an XML-based data model for managing complex AR models to support mobile applications. They discuss modules for acquisition, storage, delivery, and presentation and present examples of applications for outdoor and indoor guidance (such as an AR tour guide for the city of Vienna).
Of course, we realize that the area of 3D documents has many more facets than the ones this issue addresses, but we are grateful to IEEE Computer Graphics and Applications for giving us the opportunity to start a wider discussion on the topic. We hope this issue serves as the trigger for the graphics community as a whole to look at and address the many challenges resulting from fast-growing digital libraries full of multimedia material.