Recent MPEG Standards for Future Media Ecosystems
Guest Editors' Introduction • Christian Timmerer and Anthony Vetro • October 2013
Translated by Osvaldo Perez and Tiejun Huang
Additional Multimedia Resources
- DASH Industry Forum
- ITEC DASH open source framework
- MPEG-DASH reference access client
- GPAC open source multimedia framework
- Huanjing Yue and his colleagues introduce a new coding paradigm that transcends traditional architectures in "Cloud-Based Image Coding for Mobile Devices—Toward Thousands to One Compression" (IEEE Transactions on Multimedia). Their approach moves away from pixel-by-pixel image compression and instead seeks to describe images and reconstruct them from a large-scale image repository via descriptions.
- Weiwen Zhang and colleagues propose a "QoE-Driven Cache Management for HTTP Adaptive Bit Rate Streaming over Wireless Networks" (IEEE Transactions on Multimedia).
- Dong Zhang and colleagues address transcoding issues for legacy formats and propose a "Fast Transcoding from H.264/AVC to High Efficiency Video Coding" (2012 IEEE International Conference on Multimedia and Expo (ICME)).
Multimedia has become pervasive in the past decade thanks to easy creation, delivery, and consumption through a vast number of devices and platforms. However, the multimedia ecosystems enabling the services we use in our daily lives can be quite complex and typically involve multiple parties, ranging from various providers (network/content/service) to device manufacturers and software developers acting on different layers and levels of an end-to-end system architecture. Interoperability among those parties is thus an important issue that must be considered when working in this domain. In general, standards enable interoperability — and there are so many to choose from — but it's important that they be open in the sense that they're publicly available as well as drafted, designed, and maintained following an open process (that is, no single-company ownership in the final standard).
One such standards development organization in the multimedia domain is the Moving Picture Experts Group (MPEG) — formally referred to as ISO/IEC JTC 1/SC 29/WG 11 — which specifies standards for compression, decompression, processing, and coded representation of moving pictures, audio, and their combination. MPEG is organized into several subgroups, including requirements, systems, video, audio, and 3D graphics compression, as well as joint collaborative teams on video coding (2D/3D) with the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T).
Computing Now's October 2013 monthly theme highlights some recent developments within MPEG — namely, High-Efficiency Video Coding (HEVC), Unified Speech and Audio Coding (USAC), and Dynamic Adaptive Streaming over HTTP (DASH), for which we've pulled together relevant scientific publications as well as video tutorials by the actual chairs and editors of the MPEG subgroups and standards, respectively.
The HEVC standard will enable ultra high-definition television (UHDTV) services for which products are already emerging on the market at the time of writing. We open the theme with a video presentation by Jens-Rainer Ohm, cochair of the Joint Collaborative Team on Video Coding (JCT-VC), who provides a tutorial on HEVC. Readers looking to delve deeper should examine the article, "Overview of the High Efficiency Video Coding (HEVC) Standard," written by Gary Sullivan and his colleagues and published in the IEEE Transactions on Circuits and Systems for Video Technology, which is available to IEEE Xplore digital library subscribers.
Researchers are also actively exploring various optimizations of HEVC. For example, Abdul Rehman and Zhou Wang propose an "SSIM-Inspired Perceptual Video Coding for HEVC" (2012 IEEE International Conference on Multimedia and Expo (ICME)), which aims to maintain structural similarity during the encoding process. Doing so simplifies the subsequent perceptual rate-distortion optimization procedure and shows significant gains in perceptual coding performance when compared with conventional state-of-the-art HEVC encoding. In "Intra Frame Constant Rate Control Scheme for High Efficiency Video Coding," Xuecheng Ning, Ling Tian, and Yimin Zhou present an approach to improve coding efficiency for HEVC intra slices, which define segments of the picture.
The USAC standard incorporates existing sound-perception models as well as a sound-production model into a single technology that's capable of compressing a range of speech and music signals with quality that's better than or equal to the best reference codecs optimized for speech or music. In "MPEG Unified Speech and Audio Coding" from IEEE MultiMedia, Schuyler Quackenbush presents an overview of the USAC architecture and summarizes its performance relative to state-of-the-art speech and audio codecs. Additionally, he provides a video tutorial on the topic.
The DASH standard enables multimedia content streaming over the Internet and existing infrastructures using HTTP, and it allows for dynamic adaptation at the client depending on the actual context (for example, resolution, bandwidth, codecs, or languages). In this context, Iraj Sodagar's IEEE MultiMedia article provides an overview of "The MPEG-DASH Standard for Multimedia Streaming over the Internet." Additionally, we have video tutorials by Thomas Stockhammer and Christian Timmerer, who discuss the standard and available open source software tools, respectively.
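To make client-side adaptation concrete, here is a minimal sketch of a throughput-based heuristic, one of many possible strategies. The bitrate ladder, the safety factor, and the function name `select_bitrate` are assumptions chosen for illustration; none of them comes from MPEG-DASH itself or from any of the tools mentioned here.

```python
from typing import Sequence


def select_bitrate(
    available_bps: Sequence[int],
    measured_throughput_bps: float,
    safety_factor: float = 0.8,  # assumed headroom; not a value from the standard
) -> int:
    """Pick the highest bitrate that fits within a fraction of the measured throughput.

    Falls back to the lowest available bitrate when nothing fits.
    """
    if not available_bps:
        raise ValueError("at least one representation is required")

    # Keep some headroom so short throughput dips do not stall playback.
    budget = measured_throughput_bps * safety_factor
    candidates = [b for b in available_bps if b <= budget]
    return max(candidates) if candidates else min(available_bps)


# Hypothetical bitrate ladder in bits per second.
ladder = [500_000, 1_000_000, 2_500_000, 5_000_000]

print(select_bitrate(ladder, 3_500_000))  # budget is about 2.8 Mbit/s, so 2500000
print(select_bitrate(ladder, 400_000))    # nothing fits, so the lowest: 500000
```

A real client would typically also consider buffer level, display resolution, and codec support, which this sketch leaves out.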
The standard deliberately does not define how the client adapts streaming sessions to context conditions; that remains a subject of further research. An important aspect of DASH-based applications and services is the quality of experience (QoE), specifically within wireless networks. Ozgur Oyman and Utsaw Kumar examine QoE in their article, "QoE Evaluation for Video Streaming over eMBMS," which raises interesting prospects for combining eMBMS with DASH. Paolo Bellavista, Antonio Corradi, and Luca Foschini might also benefit from DASH concepts for their deployment of "Self-Organizing Seamless Multimedia Streaming in Dense MANETs." Ying-Dar Lin and his colleagues might also adopt DASH within their "In-Kernel Relay for Scalable One-to-Many Streaming," which reduces the computing power required and enhances subscriber capacity. Finally, Michael Grafl and his colleagues provide an "Evaluation of Hybrid Scalable Video Coding for HTTP-based Adaptive Media Streaming with High-Definition Content" enabling an efficient trade-off between scalable video coding and DASH.
MPEG is currently working on amendments and new editions for the standards introduced in this month's theme. In particular, scalable and 3D extensions for HEVC are currently in development. The audio subgroup is also entering the 3D domain and has successfully evaluated the responses to the July 2013 call for proposals. Finally, the second edition of DASH has been ratified, and the systems subgroup is working on new formats to enable MPEG media transport. For further information, we invite you to explore the links we've provided here to additional multimedia resources.
C. Timmerer and A. Vetro, "Recent MPEG Standards for Future Media Ecosystems," Computing Now, vol. 6, no. 10, Oct. 2013, IEEE Computer Society [online]; http://www.computer.org/portal/web/computingnow/archive/october2013.
Christian Timmerer is a researcher, entrepreneur, and teacher on immersive multimedia communication, streaming, adaptation, and quality of experience. He is a Computing Now associate editor and chair of the IEEE Computer Society Special Technical Community (STC) on Social Networking. Timmerer was general chair of WIAMIS 2008 and QoMEX 2013 and has participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, SocialSensor, and the COST Action IC1003 QUALINET. He also participated in ISO/MPEG work for several years – notably, in the area of MPEG-21, MPEG-M, MPEG-V, and MPEG-DASH. He has a PhD in computer science from Alpen-Adria-Universität Klagenfurt, Austria. In 2012 he cofounded bitmovin to provide professional services around MPEG-DASH. Follow him on http://www.twitter.com/timse7 and subscribe to his blog http://blog.timmerer.com.
Anthony Vetro is a group manager at Mitsubishi Electric Research Labs where he is responsible for research and standardization on video coding, as well as work on image processing, information security, speech processing, and radar imaging. He has a BS, MS, and PhD in electrical engineering from Polytechnic University, Brooklyn, NY. Vetro has published more than 150 papers in these areas, and serves on several technical committees and editorial boards. He has also been an active member of the ISO/IEC and ITU-T standardization committees on video coding for many years, and currently serves as head of the US delegation to MPEG. Vetro is also a Fellow of IEEE.