
Panorama of Multimedia Coding and Processing

Chang Wen, Florida Institute of Technology

Pages: pp. 111-112, C3

R. Steinmetz and K. Nahrstedt, Multimedia Fundamentals, Volume 1: Multimedia Coding and Content Processing, Prentice Hall IMSC Press, 2002, $59, 275 pp., ISBN 0-13-031399-8.

The authors of this book set out to perform a daunting task: defining exactly what is fundamental in multimedia, a field that is intertwined with and integral to so many disciplines. For decades, multimedia technology has sat at the merging point of several fast-developing industries, including telecommunications, computing, consumer electronics, entertainment, and publishing.

Although each person has his or her own take on what multimedia is, for most people multimedia means the combination of two or more continuous media that need to be played during some well-defined time sequences, possibly with some user interaction. Practically, we often deal with two familiar media types—audio and video—in a time-synchronized fashion.

Because of the time-sequential nature and variety, the multimedia field covers areas ranging from devices to systems and from services to applications. Even for the fundamentals of multimedia, a single book cannot adequately cover many essential topics. Keeping this in mind, Steinmetz and Nahrstedt decided to split the content about multimedia system fundamentals into three volumes. I'm reviewing the first volume that deals with media coding and content processing, the two most basic topics about multimedia. The second volume will describe media processing and communication and the third will present topics in multimedia documents, security, and various applications.


The primary objective of this first volume is to serve as a reference book for a comprehensive panorama of topics in the area of multimedia coding and content processing while also giving the reader a quick grasp of essentials in multimedia research and application. Because the goal of this volume is to provide concise definitions of numerous basic concepts, the coverage is necessarily brief.

As a result, for experts and skilled practitioners in multimedia technologies, the treatment of many key concepts may seem inadequate in terms of technical depth. Fortunately, the authors also compiled a comprehensive list of reference materials for those readers who want to get in-depth coverage of some key concepts.

For myself, as someone who has been working in multimedia-related research for 20 years, I still find this volume useful for its completeness in covering various concepts that you may not be able to find in other, more specialized multimedia-related books. I'd like to detail some of the ways the authors concisely and rather uniquely cover certain multimedia concepts.


The book begins with a chapter that briefly describes the contents and organization of this volume. The introduction addresses the interdisciplinary aspects of multimedia: that advancement has been driven by many related research communities and various industrial sectors—including telecommunications, computer, consumer electronics, TV and radio broadcasting, and electronic publishing.

The authors also explain that to support multimedia applications, many hardware and software components in various systems must be properly modified, expanded, or even replaced to facilitate unique multimedia applications. And in keeping with multimedia's interdisciplinary nature, the authors present the systems from an integrated and global perspective. Their attempts are evident in the overall organization of this book, as well as the organization of individual chapters. The section on further readings in the introductory chapter is particularly useful, as it lists specific books, journals, magazines, and conference proceedings on multimedia and related topics.

The next five chapters cover the fundamentals of media characteristics and the coding of all major media types: audio, graphics and images, video, and computer-based animation. Readers will find Chapter 2's discussion on media and data streams useful. For example, the authors discuss the term media from different perspectives and enhance our understanding of how the term varies depending on the context. Among popular understandings of the term media, the authors describe different attributes for media, using certain criteria to distinguish between perception, representation, presentation, storage, transmission, and information exchange media. By considering these attributes, we can better appreciate the presentation spaces, values, and dimensions of the current multimedia technologies and applications.

Chapter 3 presents audio technology, briefly introducing the fundamentals of sound and the concepts of frequency, amplitude, perception, and psychoacoustics. The discussion from both the physical acoustic and the psychoacoustic perspectives is stimulating, especially for computer and engineering professionals who aren't familiar with the psychological aspects of sound. Other interesting coverage of sound includes audio representation on computers with sampling and quantization concepts, 3D sound perception, music, and the musical instrument digital interface (MIDI) standard.

This chapter also covers speech signals, with just enough discussion to get the reader familiar. Topics covered include speech synthesis, recognition, and transmission; all three are key concepts in computerized speech technologies.

Chapter 4 deals with nontextual information that can be displayed and printed. The chapter begins with capturing graphics and images. Readers will find the next sections on image format useful, since the book presents a brief introduction to all major image formats—including PostScript, GIF, TIFF, XBM, XPM, PBMplus, and BMP. The authors also attempt to cover a range of techniques in computer-assisted image processing. Because it's a mature and diverse discipline, the authors can only fit in simple examples and rudimentary techniques in image processing. However, these introductory descriptions provide readers with sufficient background to continue if they so desire.

The focus of Chapter 5 is video technology. The basics in this chapter include the representation of video signals and video signal formats such as color encoding, composite signals, and computer video formats. The presentation of television systems, both conventional and high-definition, is useful for helping multimedia researchers and practitioners understand the subsequent sections on digitization of video signals and the inevitable evolution from the analog to the digital era in video technology.

Chapter 6 contains a treatment on computer-based animation. This is unique, because almost all other multimedia-related books don't address computer-assisted animation as a separate media type, even though the way we generate an animation and the way we present an animation are fundamentally different from other media types.

This chapter begins with the basic concepts and continues with specifications and a discussion of how to control, display, and transmit the animation. This chapter concludes with the Virtual Reality Modeling Language (VRML) that has been adopted as an ISO standard and will be an important tool for multimedia authoring and representation.

The next three chapters discuss technologies for data compression, optical storage, and content analysis. Chapter 7 starts with the need for compression and the concept of entropy. The chapter then introduces various essential multimedia data compression techniques. These include Huffman coding as well as some lesser-known techniques, such as pattern substitution and diatomic coding. A major portion of this chapter details the image-coding (JPEG) and videoconferencing (H.261 and H.263) standards, as well as the MPEG video coding (MPEG-2 and MPEG-4) standards.

Chapter 8 shifts to optical storage media, which is important, since multimedia data often requires higher storage density. In particular, optical storage media have enabled audio and video to be stored digitally in the computer server for access through the Internet and for exchange among users. The chapter focuses on various types of compact discs, including CD-ROM, CD-recordable, CD-magneto-optical, CD-rewritable, and DVDs. This chapter covers a concise history and the basic technology behind various optical storage media. The authors' discussion of DVD standards is a reference for both newcomers and experts in multimedia technology.

The final chapter is on content analysis. This has been a hot research topic in recent years with numerous advances. The chapter begins with discussions on features for multimedia content analysis and moves on to the analysis of image databases, covering the technologies suitable for both individual images and image sequences. The chapter also deals briefly with content analysis for digital audio, as well as some examples of automatic film genre recognition and text recognition in videos.

Additionally, the authors introduce some pioneering work in this emerging field, such as automatic cut detection in digital films, automatic detection of newscasts, video indexing, and the extraction of keyframes from video sequences.


In summary, the book attempts to cover all the major issues in multimedia coding and content processing. Although the coverage of individual topics is brief, the sheer number of multimedia topics covered does paint a nice panorama of multimedia coding and content processing.

Experienced multimedia researchers will find this book useful for refreshing their knowledge in various multimedia technologies, while less-skilled multimedia practitioners will find the introductory description easy to follow and understand. However, this book does leave something to be desired. The coverage on content processing for multimedia is quite light compared to the amount of significant research that's currently available. In particular, content understanding—which is one intelligent step beyond content processing—may dictate how multimedia as a field would evolve in the future, and yet it has not been adequately covered in this book (it was given only a few paragraphs).

Overall, this is still a well-written book that's essential for all levels of multimedia researchers and practitioners to possess. The authors are both experienced multimedia teachers and practitioners. I would recommend this book to both undergraduate and graduate students who want to get familiar with various topics in multimedia, as well as to experienced researchers and engineers as a handy reference book.
