2011 IEEE International Conference on Multimedia and Expo (2011)
July 11, 2011 to July 15, 2011
Yunlong Feng, The Graduate University for Advanced Studies, Japan
Gene Cheung, National Institute of Informatics, Japan
Wai-tian Tan, Hewlett-Packard Laboratories, USA
Yusheng Ji, National Institute of Informatics, Japan
With the advent of eye gaze tracking technology, eye gaze is increasingly being used as a media interaction trigger in a variety of applications, such as eye typing, video content customization, and network video streaming based on region-of-interest (ROI). The reaction time of a gaze-based networked system, however, is in practice lower-bounded by the round trip time (RTT) of today's networks, which can be large. To improve the efficacy of gaze-based networked systems, in this paper we propose a Hidden Markov Model (HMM)-based gaze prediction strategy that predicts future gaze locations to lower end-to-end reaction delay. We first design an HMM with three states corresponding to the three major types of intrinsic human eye movement. HMM parameters are obtained offline on a per-video basis during the training phase. During the testing phase, a window of noisy gaze observations is collected in real time as input to a forward algorithm, which computes the most likely HMM state. Given the deduced HMM state, linear prediction is used to predict the gaze location RTT seconds into the future. We demonstrate the applicability of our gaze prediction strategy by focusing on ROI-based bit allocation for network video streaming. To reduce the transmission rate of a video stream without degrading the viewer's perceived visual quality, we allocate more bits to encode the viewer's current spatial ROI, while devoting fewer bits to other spatial regions. The challenge lies in overcoming the delay between the time a viewer's ROI is detected by gaze tracking and the time the affected video is encoded, delivered and displayed at the viewer's terminal. To this end, we use our proposed gaze prediction strategy to predict future eye gaze locations, so that optimized bit allocation can be performed for future frames. Our experiments show that bit rate can be reduced by 21% without noticeable visual quality degradation when end-to-end network delay is as high as 200 ms.
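The pipeline described in the abstract (a window of gaze observations, a forward algorithm that picks the most likely of three HMM states, then linear prediction RTT seconds ahead) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the state names (fixation, pursuit, saccade), the quantization of gaze speed into three symbols, all probabilities and thresholds, and the per-state prediction rules are assumptions made for this example. In the paper, HMM parameters are trained offline per video.

```python
# Illustrative sketch only. State names, probabilities, thresholds, and the
# per-state prediction rules are hypothetical placeholders, not paper values.
STATES = ("fixation", "pursuit", "saccade")

# Hypothetical HMM parameters (the paper trains these offline per video).
INITIAL = [0.6, 0.3, 0.1]
TRANSITION = [
    [0.90, 0.07, 0.03],  # from fixation
    [0.10, 0.85, 0.05],  # from pursuit
    [0.50, 0.20, 0.30],  # from saccade
]
# Emission probabilities over quantized gaze speed: 0 = slow, 1 = medium, 2 = fast.
EMISSION = [
    [0.85, 0.13, 0.02],  # fixation
    [0.15, 0.75, 0.10],  # pursuit
    [0.02, 0.18, 0.80],  # saccade
]


def quantize_speed(speed, slow=30.0, fast=300.0):
    """Map a gaze speed in pixels/s to a discrete symbol (assumed thresholds)."""
    if speed < slow:
        return 0
    if speed < fast:
        return 1
    return 2


def forward_filter(symbols):
    """Forward algorithm, normalized at each step.

    Returns P(current state | all observed symbols) as a list of three floats.
    """
    n = len(STATES)
    alpha = [INITIAL[i] * EMISSION[i][symbols[0]] for i in range(n)]
    total = sum(alpha)
    alpha = [a / total for a in alpha]
    for sym in symbols[1:]:
        alpha = [
            EMISSION[j][sym] * sum(alpha[i] * TRANSITION[i][j] for i in range(n))
            for j in range(n)
        ]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
    return alpha


def predict_gaze(samples, rtt):
    """Predict the gaze location `rtt` seconds after the last sample.

    samples: list of (t, x, y) with strictly increasing t in seconds,
             at least two entries.
    Returns (state_name, (x, y)).
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dist = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        speeds.append(dist / (t1 - t0))
    posterior = forward_filter([quantize_speed(s) for s in speeds])
    state = STATES[posterior.index(max(posterior))]

    t_last, x_last, y_last = samples[-1]
    if state == "pursuit":
        # Assumed rule: extrapolate linearly with the window's mean velocity.
        t_first, x_first, y_first = samples[0]
        span = t_last - t_first
        vx = (x_last - x_first) / span
        vy = (y_last - y_first) / span
        return state, (x_last + vx * rtt, y_last + vy * rtt)
    # Assumed rule for fixation and saccade: hold the last observed position.
    return state, (x_last, y_last)
```

For example, calling `predict_gaze(samples, 0.2)` on a short window sampled every 20 ms returns the inferred state together with the extrapolated position 200 ms ahead, which an encoder could then use as the ROI center for future frames.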
Hidden Markov models, Indexes, Predictive models, Training, Software
Yunlong Feng, Gene Cheung, Wai-tian Tan and Yusheng Ji, "Hidden Markov Model for eye gaze prediction in networked video streaming," 2011 IEEE International Conference on Multimedia and Expo (ICME), Barcelona, pp. 1-6.