Issue No. 04 - July/August (2004 vol. 24)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MCG.2004.17
After years of promises, streaming media has finally arrived. Skeptics argued for years that IP-based video was an overhyped technology that succeeded only in frustrating users with its high failure rate. The criticism was partly justified. Online video suffered from incompatible technologies, network constraints, and poor image quality. Consumers expecting "TV on the desktop" were disappointed to see grainy, choppy video in small windows. But proponents maintained that streaming was first and foremost about access, and that full-screen, DVD-quality video was a matter of time. Developments over the past two years have proved them right.
Streaming media is causing a sea change in education, business, and mass media. Millions of people watch live newscasts, sports, and concerts on their PCs. Universities deploy streaming technology to enhance their distance learning programs, and corporations use it to improve knowledge-sharing and cross-company cooperation. Media companies and broadcasters are opening their archives, giving consumers access to thousands of popular and long-forgotten movies, documentaries, and 50 years of television programming.
Web casters are proliferating, targeting audiences ranging from homemakers to business professionals, and advertisers insert interactive audiovisual ads in Web pages. Streaming is not limited to PCs. Wireless carriers target mobile phones with made-for-mobile content, and several carriers are preparing to launch live TV services for mobile phones. Streaming combines the immediacy of television with the interactivity of the Internet and is revolutionizing the media landscape. It provides on-demand access to rich media anywhere, on any device. Based on Internet protocols, streaming technology enables small companies and individuals to become Web casters with a global audience.
Three Decades of Streaming
The history of streaming goes back to the early 1970s, when research labs in the US first demonstrated streaming audio technology. Streaming video followed. In 1974, a consortium headed by the International Telecommunication Union conducted the first experiments with video conferencing. The project would lead to the development of H.261 and MPEG, two widely used video compression standards. Several US and European companies developed video telephone and videoconferencing systems.
In the mid-1980s, with the advent of the compact disc, consumer electronics firms launched hardware-based solutions for digital video technology sufficient to encode movies for playback at CD-ROM rates (about 150 Kbytes per second). In particular, Philips, working with Sony, developed a technology known as CD-i while RCA had a competing technology, DVI. The Moving Picture Experts Group, MPEG, then initiated an open standards effort to ensure that there would be a nonproprietary standard for coded representation of moving pictures and audio. The first standard, MPEG-1, was completed in 1992 and delivered near VHS-quality video and near CD-quality audio. MPEG-1 was the precursor to MPEG-2 (1994), which became the de facto standard for DVD and HDTV. Meanwhile, in 1991, Apple introduced QuickTime, a software-only solution able to produce real-time playback of digital video, albeit limited initially to a small (120 × 90 pixels) image size.
Streaming remained a niche technology in the telecom industry until the 1990s, when it migrated to the Internet. In 1995, Progressive Networks (later renamed RealNetworks) released RealAudio 1.0, a player for streaming live audio. Two years later, the company released RealVideo, the first all-in-one player for streaming Internet audio and video. The player soon became one of the most popular downloads on the Internet, and video content proliferated.
In 1999, Microsoft introduced Windows Media Player 6, the company's first all-in-one player. Bundled with Microsoft's operating systems, the Windows Media Player quickly gained market share at the expense of Real. The ensuing battle has turned into a replay of the browser war between Microsoft and Netscape (see the "Media Player Wars" sidebar). Real and Microsoft have used their players to launch multimedia portals, both of which generate additional income for the companies. Figure 1 shows Microsoft's MSN portal.
Real-time Transport Protocol
Streaming enables playback of audiovisual content in real time. It's distinct from downloadable media. With the latter, the entire file is retrieved before playback begins, although pseudo streaming, also known as progressive downloads, lets users view the media as it arrives (the receiving client will buffer data for slow or congested network connections).
While pseudo streaming relies on TCP/IP, the Internet protocol suite used for conventional Web pages, true streaming makes use of the Real-Time Transport Protocol (RTP). RTP time stamps content to ensure the synchronization of video and audio, and it makes efficient use of bandwidth. Working alongside RTP is the Real-Time Streaming Protocol (RTSP), a two-way control protocol similar to HTTP. RTSP allows for interactivity: Users can start, stop, and pause video, and participants of a Web conference can request streams from one or more servers.
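The time stamping that RTP performs can be illustrated with a minimal Python sketch of the 12-byte fixed packet header defined in RFC 3550. The function names, the payload type of 96, the source identifier, and the 30-frames-per-second rate are illustrative assumptions rather than details from any particular streaming product; the 90-kHz clock is the rate commonly used for video payloads.

```python
import struct

RTP_VERSION = 2          # version number defined by RFC 3550
VIDEO_CLOCK_HZ = 90_000  # clock rate commonly used for RTP video payloads


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no CSRC list, no extension)."""
    byte0 = RTP_VERSION << 6                         # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       sequence & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def unpack_rtp_header(packet):
    """Decode the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def frame_timestamp(frame_index, fps=30, clock_hz=VIDEO_CLOCK_HZ):
    """Timestamp of a frame, in clock ticks, for an assumed constant frame rate."""
    return frame_index * clock_hz // fps


# Hypothetical stream: payload type 96 and the SSRC value are made up.
for i in range(3):
    header = pack_rtp_header(96, sequence=i, timestamp=frame_timestamp(i),
                             ssrc=0x1234ABCD)
    print(unpack_rtp_header(header))
```

A receiver can use the sequence number to detect lost or reordered packets and the timestamp to schedule playback, which is how audio and video streams carried in separate packets can be kept in step.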
True streaming has several advantages. It enables Web casting of live events, supports multiple users (one to one and one to many), and uses bandwidth efficiently. Important to content owners, the video stream is discarded after play, preventing unauthorized duplication. Because true streaming leaves no residual copy on the client, it's ideal for mobile phones and other devices with limited or no memory.
Apple was late to offer streaming capabilities. When the company launched its popular QuickTime format in 1991, the technology could only handle progressive downloads. Apple had argued that video files would not play well over dial-up Internet connections. In 1999, when the number of broadband connections grew, Apple released QuickTime 4.0, its first media player with streaming capabilities.
Streaming Comes of Age
In 2003, streaming media in the US registered growth of 104 percent, according to a report from market analysts AccuStream iMedia Research ( Streaming Media 2003: Brand, User and Audience Share Analysis). Broadband users accounted for about 78 percent of this growth. Music videos captured 33 percent of viewing share, news 28 percent, and sports 17 percent. The top 10 streaming video sites averaged more than 400 million streams per month, an all-time high according to AccuStream.
The growing popularity of Web-based video has not escaped the advertising industry. Advertisers have discovered that streaming ads have a much higher click-through rate than conventional banner ads. US-based EyeWonder developed a technology that lets advertisers insert streaming ads in Web pages without the need to install a media player. The company's Java-based media player is sent together with the streaming ad. Figure 2 shows an ad for a movie on The New York Times Web site.
Last year, AOL used EyeWonder's technology for 15- to 30-second streaming ads on its instant messenger service. EyeWonder's VideoMail, which uses the same disposable Java media player, lets companies send video mail to clients and shareholders. EyeWonder claims that the open rate (that is, the number of recipients taking the time to view the message) of VideoMail is above 80 percent.
Companies are deploying streaming technology to enhance corporate communications and training. Figure 3 shows a screen shot from Avistar's videoconferencing system, which allows face-to-face communication and document sharing in real time. Mercedes-Benz USA augments its service process information and corporate training with streaming media. More than 300 of its dealerships have access to 2,000 on-demand video clips to help its 4,700 service technicians with maintenance and repair procedures. The company uses the technology to update technical data on engines, transmissions, and electronic systems. The technology has eliminated the need for service technicians to travel to the company's training center.
Producing Video for the Web
Producing and deploying Web-based video can be intimidating. It involves tasks such as audiovisual production, editing, encoding, storage, caching, and multimedia file management. But the required tools are improving and the cost is coming down. Most popular editing programs run on standard PCs, and digital video cameras—which retail for about $1,500—can transfer video directly to a PC with an IEEE 1394 connection (FireWire or i.Link).
Vegas 5, the latest version of Sony's video editing software ( http://mediasoftware.sonypictures.com/), can handle RealVideo, QuickTime, and Windows Media files, as well as the Phase Alternating Line (PAL) and National Television System Committee (NTSC) standard formats. Vegas lets video editors apply 2D and 3D transitions, filters, and text animations. It has powerful color correction tools to adjust video from different camera setups and lighting situations. Users can analyze edits on four scopes: Vectorscope, Waveform, Parade, and Histogram. They can also view changes instantly on an external monitor via i.Link connector or IEEE 1394 devices.
Adobe and Apple also offer versatile editing software. The latest version of Apple's Final Cut Pro, which supports broadcast-quality high-definition video, is a low-cost alternative to Premiere and After Effects from market leader Adobe. Tools from Envivio ( http://www.envivio.com) let developers edit and code content for delivery to different networks and multiple platforms, including mobile phones and PDAs. Figure 4 shows a screen shot from the Envivio Encoding Station.
A growing number of companies support the Synchronized Multimedia Integration Language. SMIL is to RTSP and media servers what HTML is to HTTP and Web servers. Endorsed by the World Wide Web Consortium, SMIL offers an XML-based approach for controlling and coordinating rich media presentations, including video, text, and animations. RealNetworks, the most dedicated supporter of SMIL, has released its own XML and SMIL authoring tool. Microsoft, Adobe, and Apple also support SMIL.
SMIL is a simple but elegant markup language. It supports overlaying media streams with clickable maps and playing media clips in secondary windows. SMIL lists a separate URL for each component of the media presentation, meaning it can assemble presentations from clips stored on any streaming media or Web server.
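To make the one-URL-per-component idea concrete, the following Python sketch generates a small SMIL presentation that plays a video and a caption stream in parallel, each fetched from a different server. The region names, dimensions, and example.com URLs are hypothetical placeholders; only the element names (smil, head, layout, region, body, par, video, textstream) come from the SMIL vocabulary.

```python
import xml.etree.ElementTree as ET


def build_smil(video_url, caption_url):
    """Return a SMIL document that plays a video with a caption stream below it."""
    smil = ET.Element("smil")

    # The head describes where each media component appears on screen.
    layout = ET.SubElement(ET.SubElement(smil, "head"), "layout")
    ET.SubElement(layout, "root-layout", width="320", height="280")
    ET.SubElement(layout, "region", id="video_region",
                  top="0", left="0", width="320", height="240")
    ET.SubElement(layout, "region", id="text_region",
                  top="240", left="0", width="320", height="40")

    # <par> plays its children in parallel; each child names its own URL,
    # so the clips may live on different servers.
    par = ET.SubElement(ET.SubElement(smil, "body"), "par")
    ET.SubElement(par, "video", src=video_url, region="video_region")
    ET.SubElement(par, "textstream", src=caption_url, region="text_region")

    return ET.tostring(smil, encoding="unicode")


# Hypothetical locations: a streaming server and an ordinary Web server.
document = build_smil("rtsp://media.example.com/news/clip.rm",
                      "http://www.example.com/news/captions.rt")
print(document)
```

Swapping the par element for seq would play the components one after another instead of simultaneously, which is how a playlist of clips from several servers could be assembled.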
Digitized video and audio are data intensive. Ten seconds of uncompressed broadcast-quality video (640 × 480 pixels) amounts to about 300 Mbytes, and at that rate a full-length movie occupies roughly 200 Gbytes. Compression technology can reduce media files by as much as 90 percent. The basic guideline for bandwidth strategy is a tradeoff between bandwidth and video size or quality: the larger the video window, the more bandwidth is needed to obtain satisfying results. The four sizes in the table are common dimensions for Internet video. (176 × 132 pixels is about the size of a match box.) Advanced tools can resize videos and target video to multiple bandwidths.
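The storage arithmetic behind such figures is easy to reproduce. The Python sketch below assumes 24-bit color (3 bytes per pixel), 30 frames per second, and a two-hour running time; none of these parameters is stated in the article, and the totals change considerably if they are varied.

```python
def uncompressed_bytes(width, height, seconds, fps=30, bytes_per_pixel=3):
    """Raw size of a video clip with no compression applied."""
    return width * height * bytes_per_pixel * fps * seconds


def bitrate_after_compression(width, height, reduction, fps=30, bytes_per_pixel=3):
    """Bits per second left after removing the given fraction of the raw data."""
    raw_bits_per_second = width * height * bytes_per_pixel * fps * 8
    return raw_bits_per_second * (1 - reduction)


MBYTE = 10**6
GBYTE = 10**9

ten_seconds = uncompressed_bytes(640, 480, seconds=10)
two_hours = uncompressed_bytes(640, 480, seconds=2 * 60 * 60)
print(f"10 s at 640 x 480: {ten_seconds / MBYTE:.0f} Mbytes")
print(f"2 h at 640 x 480:  {two_hours / GBYTE:.0f} Gbytes")

# Compare a full-size window with the match-box size after a 90 percent reduction.
for width, height in [(640, 480), (176, 132)]:
    mbps = bitrate_after_compression(width, height, reduction=0.90) / 10**6
    print(f"{width} x {height} at 90 percent reduction: {mbps:.1f} Mbps")
```

Even after a 90 percent reduction, the full-size window in this sketch still needs far more bandwidth than the match-box window, which is the tradeoff between window size and bandwidth in numerical form.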
Video Goes Mobile
Mobile video has created a veritable gold rush among wireless carriers, media companies, and content developers. High-speed wireless networks promise to transform the mobile phone into the fourth screen, after television, cinema, and the PC. A growing percentage of the more than 400 million mobile phones sold each year are video enabled, creating a market for mobile video that will ultimately reach a billion-plus consumers.
Asian carriers have taken the lead in mobile video. In 2002, Japanese electronics maker Sharp released the SH-51, the world's first handset capable of recording, editing, and sending video mail. Japanese carrier JPhone used the SH-51 to launch Sha-Movie, a video-mail service that lets consumers send and receive 5-second video clips. Japan's KDDI followed suit. Its EZMovie services let subscribers send up to 15 seconds of video to other KDDI subscribers or to a PC. Earlier this year, KDDI launched 25 multimedia programs for users of its advanced 3G mobile phones in Japan. Figure 5 shows two samples of the company's current offerings.
Korea's SK Telecom has offered movie trailers, music videos, news, and sports highlights since 2001. Despite the hefty fees—about 50 cents for movie trailers and $1.00 for multimedia news or sports clips—the service was an instant hit, and contributes significantly to the carrier's average revenue per user. SK Telecom's content menu has expanded to more than 6,000 titles, and the company plans to produce its own video content.
Video Calling on the Go
Japanese youngsters were the first to embrace mobile video. They use their camera phones to record video clips of their pets or birthday parties and send them to friends and relatives. While on holiday at a beach or ski resort, they send a video clip rather than a postcard. But mobile video is increasingly used in business applications to save time and money.
In Japan, technicians using camera-enabled mobile phones transmit images of equipment to engineers for diagnostics. Location scouts send video clips to art directors and movie producers. Commuters dial traffic cameras to check live road conditions, and security personnel tap into cameras at warehouses and construction sites. Japanese companies have equipped sentry robots with video cameras, enabling home owners to keep an eye on their property. The phones can remotely control the robot and other household appliances.
Mobile phones are evolving into multimedia tools with Sony Walkman-like functions, that is, radio and even TV receivers. KDDI 3G network phones can store 3 Mbytes of high-quality video, and the next generation mobile phone will be equipped with a hard disk. Toshiba has started mass producing 0.85-inch hard disk drives with storage capacities of 2 and 4 Gbytes.
NTT Docomo's latest handsets, the much acclaimed 900i series, feature 2.2-inch color screens, a Macromedia Flash-equipped browser, and videophone capability. Figure 6 shows screen shots from Docomo's video calling service. Users receiving a video call while not suitably dressed (or suffering an outbreak of acne) can select an animated avatar as a stand-in. They can select avatars expressing different emotions (for example, happy, sad, or surprised). East Asian carriers believe 3G networks will make video calling and videoconferencing increasingly popular among consumers and business travelers. Carriers in Japan, Korea, and Hong Kong have completed interoperability tests for international video calls.
Mobile Video Drives Innovation
Japan was late in reacting to the Internet boom, but the country has become a global leader in developing software and applications for the mobile Internet. Tokyo-based Access ( http://www.access.co.jp), referred to as the "Microsoft of non-PC software," develops advanced data platforms and Internet access technologies for the mobile communications market. The company's flagship NetFront browser is the most popular browser used in NTT Docomo's i-mode mobile service and holds full market share for Docomo's 3G Freedom of Multimedia Mobile Access (FOMA) service.
Access' new NetFront v3.1 advanced multimedia browser supports video, sound, still images, vector graphics, text, and PDF files as well as advanced rendering technologies like NetFront's SmartFit Rendering and Rapid Render. Figure 7 shows the NetFront architecture. The browser can synchronize subtitles for video content, reportedly the first mobile browser with this capability. Figure 8 shows a screen shot of a weather forecast with subtitles. NetFront SmartFit Rendering software displays Web content on a mobile phone screen without the need for horizontal scrolling. Streaming content is automatically tailored to the small screen.
In a sign of things to come, carriers are opening their gateways to third-party content providers. NTT Docomo has launched M-Stage V-Live, a one-to-many video streaming platform for delivering live and archived video. The company offers its own content, but lets third parties set up their own Web casting channels. Outside providers need a streaming server, an encoder, and a dedicated line to connect to Docomo's V-Live Center. Docomo charges a monthly fee of 20,000 yen (about $180).
Mobile video is likely to become a must-have service in the coming years. KDDI and NTT Docomo announced all-you-can-watch monthly rates. This year, the pocket TV will arrive. TU Media, a consortium of Japanese and Korean companies, has launched a satellite for transmitting digital multimedia broadcasting to mobile phones, handhelds, and terminals in automobiles. The satellite will offer 39 channels featuring movies, entertainment shows, and radio channels. Monthly fees are between $10 and $12.
Mobile Web Casting for the Masses
Mobile video fever has also reached Europe and the US. TelecomTV in the UK ( http://www.TelecomTV.com), one of the first mobile Web casters catering to professionals, offers news specifically targeted at the wireless industry. Its channels feature live reports from trade shows and interviews with industry leaders. MobiTV ( http://www.mobitv.com) in the US offers live headline news, sports, and entertainment. Consumers using Sprint's PCS Vision-enabled phones can download MobiTV's software to transform their handsets into a portable TV.
US-based BigDigit ( http://www.bigdigit.com), an aggregator of content for mobile phones, targets the mobile phone with short movies, animations, and media art. The company organizes "The World's Smallest Film Festival," an annual showcase of short digital films, animation, music videos, and advertisements. Figure 9 shows a screen shot from a short film entitled The Artists that exemplifies the requirements of made-for-mobile content: clean, bold, and high-contrast imagery that conveys a simple concept with a few focused details.
Demand for short movies is expanding rapidly. US-based CinemaElectric ( http://www.cinemaelectric.com) offers "high-energy, information-rich short movies" through its PocketCinema channel. In the last quarter of 2003, the company saw average monthly growth of 150 percent in Germany, Austria, and the United Kingdom. Its content channels offer movie industry gossip (Portable Hollywood), global fashion news (Electric Catwalk), and sports clips (Action!).
Traditional media giants also target the mobile phone. Disney is culling segments from Finding Nemo and other favorites for delivery to the mobile phone. Warner Bros. is repurposing Harry Potter and Bugs Bunny for the small screen. The company is working with Vodafone to develop made-for-mobile video. UK-based Mobix Interactive ( http://www.mobixinteractive.com) is developing Connections, a soap opera situated in the fictional St. Barbaras Country Hospital. According to a Mobix press release, Connections features "tales of betrayal, infidelity and murder," with each 90-second episode offering "melodrama with a cliffhanger ending."
The mobile industry is likely to produce the video equivalent of popular daily newspaper cartoons like Peanuts and Doonesbury. Intense competition makes it likely that such content will be offered at low prices. The wireless Web offers a revenue-generating environment not available on the conventional Internet. Wireless carriers have a virtually tamper-free billing system, and can add micropayments and subscription fees to the customer's monthly phone bill. Carriers can retain a percentage of the collected fees, and pass the remainder to the content provider.
Unless the conventional Internet develops a universal and secure payment infrastructure (which will probably have to await the migration from IPv4 to IPv6), wireless carriers might be tempted to become e-commerce bill collectors. Docomo has conducted trials with Sony's Felica technology that transforms mobile phones into e-wallets. Felica lets consumers pay for mobile content, but also for movie tickets, train tickets, and groceries. Felica's monetary unit, Edy, indicates Sony has international ambitions; Edy stands for euro, dollar, yen.
Convergence of computing and communications means digital media will be delivered to a host of different devices—not only to PCs and mobile phones, but to game consoles, (digital) television, movie theaters, and even billboards. (The latter will no longer sell mere space, but time slots.) Moreover, convergence will take many forms, connecting different networks and devices. NTT Docomo has developed a prototype of a home electronics control system that lets users record a TV program and view the playback on a mobile phone.
The battle for leadership in converging media technology pits computer giants like Microsoft, Intel, and HP against the consumer electronics industry led by the likes of Sony, Philips, and Matsushita. The former companies aim to make a PC-like server become a media hub for the home. Microsoft is developing home entertainment center technology that connects the PC with TV, audiovisual equipment, and portable devices.
Consumer electronics makers believe consumers will favor home entertainment centers that are as easy to use as conventional audiovisual equipment. To counter Microsoft's encroachment on their turf, consumer electronics giants are forming cross-industry alliances. Philips, Nokia, Vodafone, and Universal will introduce digital multimedia services for TVs, laptops, PC tablets, mobile phones, and car devices. Philips and Korea's LG already produce flat-panel display TVs with integrated Internet access.
But in the long term, PC-like media servers, consumer electronics products, and other boxes are likely to go the way of the dinosaurs. While some consumers will cling to local storage, thin clients permanently connected to high-speed networks—both fixed-line and wireless—will probably win the day. Developments in East Asia point in this direction.
Earlier this year, the number of Japanese households with fiber-to-the-home (FTTH) connections passed the one-million mark. FTTH offers speeds of up to 100 megabits per second (Mbps), nearly 1,800 times faster than 56-Kbps dial-up connections. Faster networks are likely to follow. Japan's Showa Electric Wire and Cable has produced fiber-optics cables able to transport data at 10 Gbps.
Wireless technology is also breaking speed records. Japanese and Korean firms are working on 4G wireless technology that will deliver 20 Mbps for uplinks and 100 Mbps for downlinks. (Current 3G services offer downlinks of 384 Kbps.) The 4G technology will let users watch live video on fast-moving bullet trains, and, more importantly, it will prevent network congestion as demand for video and other data-intensive applications grows.
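What these link speeds mean in practice can be estimated with a few lines of Python. The sketch below takes the nominal rates for dial-up, 3G, and FTTH or 4G downlinks and computes how long a video clip would take to transfer at each; the 3-Mbyte clip size is an illustrative assumption, and real-world throughput would be lower than the nominal rates because of protocol overhead and congestion.

```python
LINKS_BPS = {
    "56-Kbps dial-up": 56_000,
    "3G downlink (384 Kbps)": 384_000,
    "FTTH or 4G downlink (100 Mbps)": 100_000_000,
}


def transfer_seconds(size_bytes, link_bps):
    """Idealized transfer time: payload bits divided by the nominal link rate."""
    return size_bytes * 8 / link_bps


def speed_ratio(fast_bps, slow_bps):
    """How many times faster one link is than another."""
    return fast_bps / slow_bps


CLIP_BYTES = 3 * 10**6  # hypothetical 3-Mbyte video clip

for name, bps in LINKS_BPS.items():
    print(f"{name}: {transfer_seconds(CLIP_BYTES, bps):.2f} s")

ratio = speed_ratio(LINKS_BPS["FTTH or 4G downlink (100 Mbps)"],
                    LINKS_BPS["56-Kbps dial-up"])
print(f"100 Mbps is about {ratio:.0f} times the dial-up rate")
```

At dial-up rates the clip takes minutes to arrive, at 3G rates about a minute, and at 100 Mbps a fraction of a second, which is why true streaming and thin clients become practical only on the faster networks.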
Seamless, cross-platform, and cross-network telecommunications will offer anywhere, anytime access to archived and live streaming media. The tradeoff between boxes and thin clients connected to high-speed networks should ultimately favor the latter. Add to this the billion-plus consumers in Asia who will join the digital age in the coming decade, the scramble for natural resources, not to mention growing waste disposal problems, and the thin client may become an economic and ecological imperative.