Real-Time Communications Gets a Facelift
by George Lawton
High-definition (HD) video and music services delivered over the Internet have taken off, but real-time HD point-to-point communications have lagged. A startup company launched last September has developed a novel approach for addressing the main bottlenecks in real-time voice and video communication, which might help bridge this gap. The free service has already attracted 15 million users across 35 different devices.
Internet speeds commonly hit 2–20 megabits per second (Mbps) over consumer high-speed Internet services, whereas point-to-point voice and video services that work across multiple providers haven't scaled much beyond the 64 Kbps standardized by ISDN. "The cellular network has a higher variability and is more challenging for conversational video services," said Eric Setton, chief technical officer at Tango, a video chat service.
The three main bottlenecks to the widespread adoption of HD communications are platform fragmentation, processing speed, and end-to-end latency.
Disparate Platforms
The market for real-time video conferencing has fragmented between low-latency proprietary video-conferencing systems in businesses and lower-definition higher-latency video-chat systems such as Apple's FaceTime, Google Talk, and Microsoft/Skype's Qik. Today most of these systems, including some from the same vendors, don't talk to each other, said industry analyst Rob Enderle. "But interoperability is what makes the phone work, and people forgot that. We've been working on this for at least three decades, and it looks like the parties are starting to come together, but it's very rudimentary."
Enderle believes that the large vendors such as Skype, backed by Microsoft and Google, could work together to support interoperability. But so far, they've restricted outside access to their platforms. "Vendors seem locked into the idea that if they can do their stuff better, they can lock in their customers to buy all of their products," he said. "There are common video conferencing standards for high-performance systems, but interoperability happens at a low level of quality."
The main issue is not the use of open standards themselves, said Setton, but making the information available on how independent programmers can use them. For example, video-chat services such as FaceTime, Google Talk, and Skype are built using open standards, but none of them reveals encryption keys and connection methods, which makes interoperability difficult.
A similar lack of interoperability hindered the early adoption of the Short Message Service (SMS), argued Setton. It wasn't until the carriers decided to work together that the text messaging industry took off. Video chat has not yet reached that point: today, users must download a separate application for each video-chat network they want to use.
Reducing the Delay
The ITU G.114 standard recommends no more than 150 milliseconds (ms) of one-way delay for a normal voice conversation. Tango has achieved this performance level for Wi-Fi–based calls, and is down to 250–500 ms for calls over 3G networks.
Setton said that it's challenging to reduce latency over the cellular link. In some cases, Tango engineers detect 50–80 ms delays between the phone and the network. This can add up to 150 ms before adding in delays for video processing or transit across the Internet backbone.
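The arithmetic behind a delay budget like this can be sketched in a few lines of Python. This is only an illustration, not Tango's measurement method: the 150 ms limit and the 80 ms radio-link figure come from the numbers above, while the assumption that both endpoints are on cellular links, and the processing and backbone figures, are hypothetical values chosen for the example.

```python
# Illustrative one-way delay budget checked against the 150 ms
# guideline from ITU G.114. All component values are assumptions
# except where noted.
G114_ONE_WAY_LIMIT_MS = 150


def one_way_delay_ms(components):
    """Sum the per-hop delays (in milliseconds) along one direction."""
    return sum(components.values())


def within_g114(components, limit_ms=G114_ONE_WAY_LIMIT_MS):
    """Return True if the total one-way delay meets the guideline."""
    return one_way_delay_ms(components) <= limit_ms


budget = {
    "caller_radio_link": 80,  # upper end of the 50-80 ms range cited above
    "callee_radio_link": 80,  # assumes the other party is also on cellular
    "video_processing": 40,   # hypothetical encode/decode cost
    "backbone_transit": 50,   # hypothetical Internet transit time
}

print(one_way_delay_ms(budget), "ms one-way; within G.114:", within_g114(budget))
```

With these assumed figures, the two radio links alone exceed the 150 ms guideline, which shows why the cellular hop dominates the budget.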
Another source of delay is the link to a centralized server, which is common to existing voice and video communication platforms. Tango has developed an architecture that uses the centralized server only for call setup. The two devices then talk directly to each other, thereby eliminating a hop through the central server.
Cellular telephone routers also present unique challenges. For example, AT&T routes all its mobile calls through five gateway routers with incredibly complex port-mapping schemes. "Making sure you can identify the right router port for a phone is like finding a needle in a haystack," said Setton. Tango has developed a set of tools for measuring the performance of different ports and automatically routing calls through the faster ones.
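Tango has not published how its tools work, so the following Python sketch only illustrates the general idea of measuring candidate ports and preferring the faster ones. The function names, the port numbers, and the choice of median round-trip time as the score are all assumptions made for this example.

```python
from statistics import median


def rank_ports(samples):
    """Order candidate ports from fastest to slowest.

    `samples` maps a port number to a list of measured round-trip
    times in milliseconds. Ports with no measurements are skipped.
    The median is used so a single slow probe does not dominate.
    """
    scored = {port: median(rtts) for port, rtts in samples.items() if rtts}
    return sorted(scored, key=scored.get)


def pick_port(samples):
    """Return the port with the lowest median RTT, or None if none was measured."""
    ranked = rank_ports(samples)
    return ranked[0] if ranked else None


# Hypothetical probe results for three candidate ports.
probes = {
    5060: [120, 95, 110],
    40001: [60, 300, 70],
    40002: [],  # no successful probes
}

print("preferred port:", pick_port(probes))
```

A production system would also need to rerun the probes periodically, since the port mappings can change during a session.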
Getting up to Speed
Video conferencing on mobile devices became possible only with the processing power of the iPhone 3GS, which can render 320 × 240 pixels at 10–15 frames per second in software. Other hardware vendors are introducing dual-core and quad-core chips that could further improve performance because video processing is easy to execute across parallel processes.
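To put those figures in perspective, a rough pixel-throughput comparison can be worked out as follows. The 320 × 240 resolution and the 10–15 frames per second come from the text above; the use of 1280 × 720 at 30 frames per second as a stand-in for "HD" is an assumption made only for this comparison.

```python
def pixels_per_second(width, height, fps):
    """Raw pixel throughput a renderer must sustain, ignoring codec overhead."""
    return width * height * fps


qvga_low = pixels_per_second(320, 240, 10)   # low end of the cited frame rate
qvga_high = pixels_per_second(320, 240, 15)  # high end of the cited frame rate
hd_720p = pixels_per_second(1280, 720, 30)   # assumed HD target, not from the article

ratio = hd_720p / qvga_high

print(f"320x240 video: {qvga_low:,} to {qvga_high:,} pixels per second")
print(f"720p at 30 fps: {hd_720p:,} pixels per second ({ratio:.0f}x the high end)")
```

Under these assumptions, HD video requires roughly 24 times the raw pixel throughput of the software rendering described above, which is why multicore chips and hardware acceleration matter.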
But perhaps the largest performance gains will come from leveraging native video-processing features built into some of the newer chips. For example, Tango is the first company to gain access to Qualcomm's proprietary hardware accelerators, which are built into the chips used across multiple smartphone models. Setton said his company's early experience demonstrates a 10-fold boost using this native hardware accelerator, as opposed to relying solely on the device's main CPU.
With further improvements in processing power and latency, Setton expects these chat services to evolve into real-time HD platforms. "Today, you're seeing most consumer services with quality comparable to normal phone calls," he said. "But in the years to come, we will see applications that come with high-definition voice and video, particularly with the introduction of 4G."
George Lawton is a freelance technical journalist. Contact him at glawton@glawton.com.