Issue No. 12 - Dec. 2012 (vol. 34), pp. 2327-2340
M. Cohen , Comput. Sci. Dept., Technion - Israel Inst. of Technol., Haifa, Israel
I. Shimshoni , Dept. of Inf. Syst., Univ. of Haifa, Haifa, Israel
E. Rivlin , Comput. Sci. Dept., Technion - Israel Inst. of Technol., Haifa, Israel
A. Adam , Microsoft, Haifa, Israel
ABSTRACT
It is quite common for multiple human observers to attend to a single static interest point. This is known as a mutual awareness event (MAWE). A preferred way to monitor these situations is with a camera that captures the observers, combined with existing face detection and head pose estimation algorithms. The current work studies the underlying geometric constraints of MAWEs and reformulates them in terms of image measurements. The constraints are then used in a method that 1) detects whether such an interest point exists, 2) determines where it is located, 3) identifies who was attending to it, and 4) reports where and when each observer was located while attending to it. The method is also applied to a related event, in which a single moving human observer fixates on a single static interest point. The method handles the general case of an uncalibrated camera in an unknown environment, in contrast to other work on similar problems that inherently assumes a known environment or a calibrated camera. The method was tested on about 75 images from various scenes, most of them found by searching the Internet, and robustly detects MAWEs and estimates their related attributes.
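To make the geometric idea concrete, the sketch below shows one possible (hypothetical) formulation, not the authors' actual algorithm: each detected observer contributes a gaze ray in the image plane, derived from the face location and an estimated head pose, and a mutual awareness event is declared when the rays nearly meet at a common point. The least-squares ray intersection, the pixel threshold tol, and the 2D simplification are illustrative assumptions only; the paper itself works with 3D constraints from an uncalibrated camera.

import numpy as np

def estimate_interest_point(origins, directions):
    # Least-squares point closest to all 2D gaze rays.
    # origins:    (N, 2) face centres in image coordinates.
    # directions: (N, 2) gaze directions (need not be unit length).
    origins = np.asarray(origins, float)
    dirs = [d / np.linalg.norm(d) for d in np.asarray(directions, float)]
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for o, d in zip(origins, dirs):
        P = np.eye(2) - np.outer(d, d)   # projector orthogonal to the ray
        A += P
        b += P @ o
    p = np.linalg.solve(A, b)            # point minimising summed squared ray distances
    residuals = np.array([np.linalg.norm((np.eye(2) - np.outer(d, d)) @ (p - o))
                          for o, d in zip(origins, dirs)])
    return p, residuals

def detect_mawe(origins, directions, tol=15.0):
    # Declare a MAWE when at least two gaze rays pass within `tol` pixels
    # of the common least-squares point, and report who is attending.
    p, res = estimate_interest_point(origins, directions)
    attending = np.where(res < tol)[0]
    return len(attending) >= 2, p, attending

# Toy example: three observers whose gaze rays nearly meet near (255, 320).
origins = [(100, 200), (400, 180), (250, 400)]
directions = [(0.8, 0.6), (-0.7, 0.7), (0.0, -1.0)]
found, point, who = detect_mawe(origins, directions)
print(found, point, who)

In practice a robust estimator would be preferred over plain least squares, since a single observer looking elsewhere can bias the estimated point; the robust step would separate attending from non-attending observers before the final estimate.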
INDEX TERMS
Observers, cameras, three-dimensional displays, magnetic heads, face recognition, signal processing, sparse 3D structure, head pose, mutual awareness, social signal processing
CITATION
M. Cohen, I. Shimshoni, E. Rivlin, A. Adam, "Detecting Mutual Awareness Events," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 12, pp. 2327-2340, Dec. 2012, doi:10.1109/TPAMI.2012.49