IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008)
Anchorage, AK, USA
June 23, 2008 to June 28, 2008
Michael Harville, CastTV Inc., San Francisco, California, USA
Feng Tang, University of California, Santa Cruz, USA
Ian N. Robinson, HP Labs, Palo Alto, California, USA
Object tracking methods based on stereo cameras, which provide both color and depth data at each pixel, find advantage in separating objects from each other and from background, determining the 3D size and location of objects, and modeling object shape. However, stereo tracking methods to date sometimes fail due to depth image noise, and discard much useful appearance information. We propose augmenting stereo-based models of tracked objects with sparse local appearance features, which have recently been applied with great success to object recognition under pose variation and partial occlusion. Depth data complements sparse local features by informing correct assignment of features to objects, while tracking of stable local appearance features helps overcome distortion of object shape models due to depth noise and partial occlusion. To speed up tracking of many local features, we also use a “binary Gabor” representation that is highly descriptive yet efficiently computed using integral images. In addition, a novel online feature selection and pruning technique is described to focus tracking onto the best localized and most consistent features. A tracking framework fusing all of these aspects is provided, and results for challenging video sequences are discussed.
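The abstract notes that the feature representation is computed efficiently using integral images. As a rough illustration of that building block only, the sketch below builds a summed-area table and uses it to evaluate rectangular sums in constant time. It is not the authors' "binary Gabor" descriptor: the function names and the two-box contrast filter are hypothetical, chosen only to show why box-shaped filters become cheap once the table exists.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros.

    sat[y][x] holds the sum of all img values above and to the left of (y, x).
    """
    h = len(img)
    w = len(img[0]) if h else 0
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of img rows top..bottom-1 and columns left..right-1, in four lookups."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def two_box_response(sat, top, left, height, width):
    """Hypothetical contrast filter: sum of the left half minus sum of the right half."""
    mid = left + width // 2
    left_sum = box_sum(sat, top, left, top + height, mid)
    right_sum = box_sum(sat, top, mid, top + height, left + width)
    return left_sum - right_sum


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    sat = integral_image(img)
    print(box_sum(sat, 0, 0, 3, 3))           # whole image
    print(two_box_response(sat, 0, 0, 2, 2))  # left column minus right column of the top-left 2x2 patch
```

Each box sum costs four table lookups whatever the box size, so a filter response assembled from a few boxes can be evaluated at many locations and scales at low cost.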
Michael Harville, Feng Tang, Ian N. Robinson, "Fusion of local appearance with stereo depth for object tracking", IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2008, pp. 1-8, doi:10.1109/CVPRW.2008.4563036