January–March 2013 (Vol. 20, No. 1) pp. 14-16
1070-986X/13/$31.00 © 2013 IEEE

Published by the IEEE Computer Society
3D Imaging Techniques and Multimedia Applications
Ruzena Bajcsy, University of California, Berkeley

Ruigang Yang, University of Kentucky

Pietro Zanuttigh, University of Padova

Cha Zhang, Microsoft Research
With the advances in sensing, transmission, and visualization technology, 3D information has become increasingly incorporated into real-world applications, from architecture, entertainment, and manufacturing to security. Integrating depth perception can help present a richer media interface. For example, spatialized audio and 3D parallax can increase the effectiveness of immersive telecommunications; in medicine, 3D instrument tracking can enable more precise and safer operations; and new low-cost 3D cameras and displays are starting a new chapter in interactive gaming and human-computer interaction.
One of the fundamental requirements of these applications is the estimation of scene depth information, preferably in real time. Fields such as computer vision, computer graphics, and robotics have studied the extraction of 3D information for more than three decades, but it remains a challenging problem, especially in unconstrained environments, which might involve variable lighting, scene surface deformation, object occlusion, and so on. Multimedia researchers must take the imperfection of depth information and other multisensory information into consideration when designing their systems, making this area a unique research opportunity.
This special issue offers an overview of recent advances in 3D acquisition systems and the many multimedia applications that can benefit from 3D integration and understanding. We initially received 17 submissions. After a rigorous review process, we selected five articles that represent a range of 3D topics, from scene acquisition and understanding to visualization.
Special Issue Articles
The first two articles in this special issue describe end-to-end immersive 3D video systems. In "Viewport: A Distributed, Immersive Teleconferencing System with Infrared Dot Pattern," Cha Zhang, Qin Cai, Philip Chou, Zhengyou Zhang, and Ricardo Martin-Brualla propose the Viewport system specially designed for video teleconferencing. Viewport captures the participant at each site using a camera rig that includes multiple color and infrared (IR) cameras and IR projectors. Their approach uses sparse point clouds instead of dense multiview stereo for geometry reconstruction, which leads to a significant speedup in 3D reconstruction and rendering. The authors also introduce a novel virtual-seating scheme that helps maintain the mutual gaze between participants.
The second is a holoscopic video system, described in "Immersive 3D Holoscopic Video System" by Amar Aggoun, Emmanuel Tsekleves, Dimitrios Zarpalas, Anastasios Dimou, Petros Daras, Paulo Nunes, and Luís Ducla Soares, that focuses on the next generation of TV with full 3D content. This system adopts 3D holoscopic imaging using a microlens array and displays the content on autostereoscopic displays. The article also touches on a number of issues important to 3D holoscopic videos that could be inspiring for readers building similar systems in the future, including coding and transmission, depth estimation, segmentation, and search and retrieval.
The next article in this special issue, "Classification and Analysis of 3D Teleimmersive Activities" by Ahsan Arefin, Zixia Huang, Raoul Rivas, Shu Shi, Pengye Xia, Klara Nahrstedt, Wanmin Wu, Gregorij Kurillo, and Ruzena Bajcsy, focuses on a high-level understanding of the 3D content captured in our everyday routines. Based on years of experience working on the Teeve (Tele-immersive Environments for Everybody) project, the authors classify teleimmersion (TI) activities with respect to their physical characteristics, qualitatively analyze the cyber side of TI activities, and argue that researchers must consider different performance profiles for different cyberphysical TI activities on the same TI system platform to achieve a high quality of experience (QoE). Achieving these customizable performance profiles during runtime calls for highly configurable, programmable, and adaptive system platforms.
"Character Behavior Planning and Visual Simulation in Virtual 3D Space" by Mingliang Xu, Zhigeng Pan, Mingmin Zhang, Pei Lv, Pengyu Zhu, Yangdong Ye, and Wei Song looks at 3D data from a computer graphics perspective. This work involves generating interactive humanlike characters (IHCs), autonomous humanoid software agents with self-animating visual bodies, with humanlike behaviors that can adapt more easily to environmental changes. This research could be applied to mixed-reality applications in which real and virtual characters coexist. This article presents a graph-model-based technique for constructing and coordinating both behavioral and cognitive IHC models automatically.
While the first four articles focus primarily on research and technology development, the authors of the last article, "A Software-Based Solution for Distributing and Displaying 3D UHD Films," have already demonstrated a practical application: stereoscopic ultra high definition (UHD) movie playback. UHD movies have at least four times the resolution of current HD movies. Their diffusion has been limited, however, by the high cost of specialized hardware. Authored by Lucenildo Aquino Júnior, Ruan Gomes, Manoel Silva Neto, Alexandre Duarte, Rostand Costa, and Guido Souza Filho, this article describes a distributed and flexible software solution for UHD content distribution. A set of software components that can run on standard PCs provides all the building blocks required for encoding, streaming, and visualizing UHD video content. At a much lower cost, this software solution could benefit broad audiences, including those in developing countries with limited access to technological innovations.
Looking to the Future
The introduction of low-cost depth cameras such as Microsoft's Kinect has made depth acquisition available to the mass market. We expect such depth sensors will prevail in the near future and open the way to novel approaches and large improvements in many fields, from interactive visualization to gesture recognition, video surveillance, and robotics. Indeed, difficult tasks such as segmentation, object recognition, people counting, and environment mapping can exploit the information contained in depth data to resolve cases where the data conveyed by images alone is insufficient.
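As a concrete illustration of why depth data is so useful for these tasks, a depth map plus the camera's intrinsic parameters yields a metric 3D point cloud via simple pinhole back-projection. The sketch below shows this with hypothetical intrinsics (`fx`, `fy`, `cx`, `cy` are illustrative values, not those of any particular sensor such as the Kinect):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an H x W metric depth map (0 = invalid pixel)
    into an N x 3 array of 3D points, using a pinhole camera model."""
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1)  # H x W x 3
    return points[z > 0]                   # keep only valid depths

# Toy example: a 2x2 depth map with one invalid pixel.
depth = np.array([[1.0, 2.0],
                  [0.0, 4.0]])
pts = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(pts.shape)  # (3, 3): three valid pixels, each with x, y, z
```

Once the scene is in metric 3D, operations that are ambiguous in a 2D image, such as separating a person from a background of similar color, reduce to simple geometric tests on the point cloud.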
3D data visualization is also advancing quickly. Stereoscopic visualization with 3D glasses is already widespread, but research on autostereoscopic and holographic technologies is an active field (as the articles in this special issue demonstrate). We believe 3D visualization without glasses or similar devices will be cheaper and have higher quality in the near future. Furthermore, although stereoscopic movies are still limited to a predefined viewpoint, novel visualization schemes will remove this limitation, and free-viewpoint visualization will open the way to a more immersive way of experiencing events or participating in teleconferences.
Finally, novel multimedia compression and transmission techniques will provide the missing link between the acquisition and visualization worlds, making remote immersive visualization of complex 3D scenes possible.
3D imaging techniques are now making 3D sensing and rendering as easy as 2D sensing and rendering. We believe this will bring about a revolution in our daily lives, and we are excited to work more in this area to make the transition happen sooner.
Ruzena Bajcsy is a professor of electrical engineering and computer sciences at the University of California, Berkeley, and director emeritus of the Center for Information Technology Research in the Interest of Science (CITRIS). Her research interests include artificial intelligence, biosystems, intelligent systems and robotics, human-computer interaction, computer vision, and security. Bajcsy has a PhD in electrical engineering from Slovak Technical University and a PhD in computer science from Stanford University. Contact her at bajcsy@eecs.berkeley.edu.
Ruigang Yang is an associate professor in the Computer Science Department at the University of Kentucky. His research interests include computer vision, computer graphics, and multimedia. Yang has a PhD in computer science from the University of North Carolina, Chapel Hill. Contact him at ryang@cs.uky.edu.
Pietro Zanuttigh is an assistant professor in the multimedia technology and telecommunications group at the University of Padova. His research interests include the transmission and remote visualization of scalably compressed 3D representations and the acquisition and processing of depth data. Zanuttigh has a PhD from the University of Padova. Contact him at zanuttigh@dei.unipd.it.
Cha Zhang is a researcher in the Multimedia, Interaction and Communication Group at Microsoft Research. His research interests include multimedia, computer vision, and machine learning. Zhang has a PhD in electrical and computer engineering from Carnegie Mellon University. Contact him at chazhang@microsoft.com.