Issue No. 2, Feb. 2013 (vol. 35)
pp. 329-341
Y. Sugano , Sato Lab., Univ. of Tokyo, Tokyo, Japan
Y. Matsushita , Microsoft Res. Asia, Beijing, China
Y. Sato , Sato Lab., Univ. of Tokyo, Tokyo, Japan
ABSTRACT
We propose a gaze sensing method that uses visual saliency maps and requires no explicit personal calibration. Our goal is to build a gaze estimator using only the eye images captured while a person watches a video clip. Our method treats the saliency maps of the video frames as probability distributions of the gaze points. To identify gaze points from the saliency maps efficiently, we aggregate the maps based on the similarity of the corresponding eye images. We then establish a mapping from eye images to gaze points using Gaussian process regression. In addition, a feedback loop from the gaze estimator refines the gaze probability maps, further improving estimation accuracy. Experimental results show that the proposed method works well across different people and video clips, achieving 3.5-degree accuracy, which is sufficient for estimating a user's attention on a display.
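The two core steps described above can be sketched in code: aggregating per-frame saliency maps weighted by eye-image similarity to obtain training gaze points, then fitting a Gaussian process regressor from eye-appearance features to 2D gaze coordinates. This is a minimal illustration under assumed representations (a flat feature vector per eye image, saliency maps normalized as probability maps, an RBF kernel with hand-picked hyperparameters); the function names and parameters are ours, not the authors' implementation.

```python
import numpy as np

def aggregate_gaze_targets(eye_feats, saliency_maps, sigma=0.5):
    """Average the saliency maps of frames whose eye images look similar
    (Gaussian-weighted in feature space), then take the expectation of
    the aggregated map as that frame's training gaze point.
    eye_feats: (n, d) eye-appearance features; saliency_maps: (n, H, W),
    each map treated as a gaze probability map."""
    n, H, W = saliency_maps.shape
    ys, xs = np.mgrid[0:H, 0:W]
    targets = np.zeros((n, 2))
    for i in range(n):
        d2 = np.sum((eye_feats - eye_feats[i]) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * sigma ** 2))          # appearance similarity
        agg = np.tensordot(w, saliency_maps, axes=1)  # weighted map average
        agg /= agg.sum()                              # renormalize to a pdf
        targets[i] = (np.sum(agg * xs), np.sum(agg * ys))  # expected (x, y)
    return targets

def gp_fit_predict(X_train, y_train, X_test, length=1.0, noise=1e-3):
    """Minimal Gaussian process regression (RBF kernel, posterior mean
    only) mapping eye-appearance features to 2D gaze coordinates."""
    def rbf(A, B):
        d2 = (np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-d2 / (2.0 * length ** 2))
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    return rbf(X_test, X_train) @ np.linalg.solve(K, y_train)
```

In a full pipeline, the fitted estimator's predictions could then be fed back to re-weight the saliency maps, in the spirit of the feedback loop the abstract describes.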
INDEX TERMS
Visualization, Estimation, Calibration, Feature extraction, Accuracy, Face, Humans, Face and gesture recognition, Gaze estimation, Visual attention
CITATION
Y. Sugano, Y. Matsushita, Y. Sato, "Appearance-Based Gaze Estimation Using Visual Saliency," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 2, pp. 329-341, Feb. 2013, doi:10.1109/TPAMI.2012.101