2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2010)
San Francisco, CA, USA
June 13–18, 2010
Yi Li, Computer Vision Lab, University of Maryland, College Park, MD 20742
Cornelia Fermuller, Computer Vision Lab, University of Maryland, College Park, MD 20742
Yiannis Aloimonos, Computer Vision Lab, University of Maryland, College Park, MD 20742
Hui Ji, Department of Mathematics, National University of Singapore
A central problem in the analysis of motion capture (MoCap) data is how to decompose motion sequences into primitives. Ideally, a description in terms of primitives should facilitate the recognition, synthesis, and characterization of actions. We propose an unsupervised learning algorithm for automatically decomposing joint movements in human MoCap sequences into shift-invariant basis functions. Our formulation models the time-series data of joint movements as a sparse linear combination of short basis functions (snippets), which are executed (or “activated”) at different positions in time. Given a set of MoCap sequences of different actions, our algorithm finds the decomposition of the sequences in terms of basis functions and their activations in time. Using the tools of L1 minimization, the procedure alternately solves two large convex minimizations: given the basis functions, a variant of Orthogonal Matching Pursuit solves for the activations, and given the activations, the Split Bregman algorithm solves for the basis functions. Experiments demonstrate the power of the decomposition in a number of applications, including action recognition, retrieval, MoCap data compression, and classification for the diagnosis of Parkinson's disease (a movement disorder).
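The alternating scheme sketched in the abstract can be illustrated on a toy 1-D signal. The snippet below is not the paper's solver — it implements only the activation step, using a plain greedy matching pursuit over all (basis, shift) pairs on synthetic data; the basis functions are fixed unit-norm snippets rather than learned via Split Bregman, and all names and signal parameters are illustrative assumptions.

```python
import numpy as np

# Two short unit-norm "snippets" standing in for learned basis functions
# (illustrative choices, not the paper's learned bases).
L = 15
b1 = np.hanning(L); b1 /= np.linalg.norm(b1)
b2 = np.sin(np.linspace(0.0, 3.0 * np.pi, L)); b2 /= np.linalg.norm(b2)
bases = [b1, b2]

# Synthetic "joint trajectory": two snippets activated at different times.
n = 100
signal = np.zeros(n)
signal[10:10 + L] += 2.0 * b1
signal[50:50 + L] += -1.5 * b2

def conv_matching_pursuit(x, bases, n_atoms):
    """Greedy shift-invariant sparse coding: repeatedly pick the
    (basis, shift) pair most correlated with the residual."""
    residual = x.copy()
    activations = []  # list of (basis index, shift, coefficient)
    for _ in range(n_atoms):
        best = None
        for k, b in enumerate(bases):
            # Correlation of the residual with basis b at every valid shift.
            corr = np.correlate(residual, b, mode="valid")
            t = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[t]) > abs(best[2]):
                best = (k, t, corr[t])
        k, t, c = best
        # For a unit-norm basis the optimal coefficient is the correlation.
        residual[t:t + len(bases[k])] -= c * bases[k]
        activations.append((k, t, c))
    return activations, residual

def reconstruct(activations, bases, n):
    """Superpose the activated, shifted snippets."""
    y = np.zeros(n)
    for k, t, c in activations:
        y[t:t + len(bases[k])] += c * bases[k]
    return y

acts, res = conv_matching_pursuit(signal, bases, n_atoms=2)
recon = reconstruct(acts, bases, n)
```

On this well-separated example, two pursuit steps recover both activations exactly; in the full alternating algorithm, a basis-update step (Split Bregman in the paper) would follow, refining the snippets given these activations.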
H. Ji, C. Fermuller, Y. Aloimonos and Y. Li, "Learning shift-invariant sparse representation of actions," 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 2010, pp. 2630-2637.