IEEE Transactions on Pattern Analysis & Machine Intelligence, Issue No. 03, March (2012, vol. 34)
Zhang Zhang , Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Dacheng Tao , Centre for Quantum Comput. & Intell. Syst., Univ. of Technol., Sydney, NSW, Australia
Slow Feature Analysis (SFA) extracts slowly varying features from a quickly varying input signal. It has been successfully applied to modeling the visual receptive fields of cortical neurons. Ample experimental evidence in neuroscience suggests that the temporal slowness principle is a general learning principle in visual perception. In this paper, we introduce the SFA framework to the problem of human action recognition by incorporating discriminative information into SFA learning and considering the spatial relationship of body parts. In particular, we consider four SFA learning strategies, namely the original unsupervised SFA (U-SFA), the supervised SFA (S-SFA), the discriminative SFA (D-SFA), and the spatial discriminative SFA (SD-SFA), to extract slow feature functions from a large number of training cuboids obtained by random sampling in motion boundaries. Afterward, to represent action sequences, the squared first-order temporal derivatives are accumulated over all transformed cuboids into one feature vector, which is termed the Accumulated Squared Derivative (ASD) feature. The ASD feature encodes the statistical distribution of slow features in an action sequence. Finally, a linear support vector machine (SVM) is trained to classify actions represented by ASD features. We conduct extensive experiments, including two sets of control experiments, two sets of large-scale experiments on the KTH and Weizmann databases, and two sets of experiments on the CASIA and UT-interaction databases, to demonstrate the effectiveness of SFA for human action recognition. Experimental results suggest that the SFA-based approach (1) is able to extract useful motion patterns and improves recognition performance, (2) requires fewer intermediate processing steps while achieving comparable or even better performance, and (3) has good potential to recognize complex multiperson activities.
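To make the pipeline in the abstract concrete, the following is a minimal sketch of the unsupervised case only: plain linear SFA followed by accumulation of squared first-order temporal derivatives into an ASD-style vector. It is not the authors' implementation. The function names (`fit_linear_sfa`, `asd_feature`), the restriction to a linear transform, and the toy two-channel signal are all assumptions made for illustration; the supervised, discriminative, and spatial discriminative variants, as well as cuboid sampling from motion boundaries, are not shown.

```python
import numpy as np


def fit_linear_sfa(x, n_features):
    """Fit linear SFA on a (T, D) signal. Returns (mean, W), W of shape (D, n_features).

    Hypothetical helper: whiten the input, then keep the directions whose
    temporal derivative has the smallest variance (the "slowest" ones).
    """
    mean = x.mean(axis=0)
    xc = x - mean
    # Whitening enforces unit variance and decorrelation of the outputs.
    cov = xc.T @ xc / (len(xc) - 1)
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-10 * evals.max()  # drop numerically degenerate directions
    whiten = evecs[:, keep] / np.sqrt(evals[keep])
    z = xc @ whiten
    # Finite differences approximate the first-order temporal derivative.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)
    _, dvecs = np.linalg.eigh(dcov)  # eigenvalues in ascending order
    return mean, whiten @ dvecs[:, :n_features]


def asd_feature(cuboids, mean, w):
    """Sum squared first-order temporal derivatives over all transformed cuboids."""
    asd = np.zeros(w.shape[1])
    for c in cuboids:  # each cuboid is flattened to shape (T_c, D)
        y = (c - mean) @ w
        asd += np.sum(np.diff(y, axis=0) ** 2, axis=0)
    return asd


# Toy data (assumption): a slow and a fast sinusoid, linearly mixed into two channels.
t = np.linspace(0.0, 2.0 * np.pi, 500)
slow, fast = np.sin(t), np.sin(11.0 * t)
x = np.stack([slow + fast, slow - 0.5 * fast], axis=1)

mean, w = fit_linear_sfa(x, n_features=2)
y = (x - mean) @ w                      # y[:, 0] is the slowest output
asd = asd_feature([x[:250], x[250:]], mean, w)
```

On this toy signal the first output should track the slow sinusoid up to sign and scale, and the first ASD entry should be the smaller of the two. In the setting the abstract describes, one such vector per action sequence would then be passed to a linear SVM.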
visual databases, feature extraction, image coding, image motion analysis, image recognition, learning (artificial intelligence), statistical distributions, support vector machines, complex multiperson activity recognition, human action recognition performance, slowly varying feature analysis, visual receptive field, cortical neuron, temporal slowness principle, learning principle, visual perception, spatial relationship, original unsupervised SFA learning strategy, supervised SFA-based approach, spatial discriminative SFA, feature function extraction, motion boundary, accumulated squared derivative feature vector, ASD feature encoding, statistical distribution, action sequence, linear support vector machine, Weizmann database, UT-interaction database, motion pattern, Feature extraction, Humans, Visualization, Neurons, Vectors, Spatiotemporal phenomena, Pattern recognition, slow feature analysis, Human action recognition
Zhang Zhang, Dacheng Tao, "Slow Feature Analysis for Human Action Recognition", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.34, no. 3, pp. 436-450, March 2012, doi:10.1109/TPAMI.2011.157