2017 International Conference on 3D Vision (3DV) (2017)
Oct 10, 2017 to Oct 12, 2017
Monocular visual localization and mapping algorithms are able to estimate the environment only up to scale, a degree of freedom which leads to scale drift, difficulty closing loops, and eventual failure. This paper describes an image-driven approach to scale-drift correction which uses a convolutional neural network to infer the speed of the camera from successive monocular video frames. We obtain continuous drift correction, avoiding the need for explicit higher-level representations of the map to resolve scale. We also propose a novel method of including speed estimates as a regularizer in bundle adjustment which avoids the pitfalls of the sudden imposition of scale knowledge. We demonstrate our approach on long-distance sequences for which ground truth is available, and find output that is essentially free of scale drift. We compare the performance with a number of other methods for scale-drift correction from monocular data, and show that our solution achieves more accurate results.
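The abstract's idea of folding a learned speed estimate into bundle adjustment as a soft penalty, rather than imposing scale abruptly, can be illustrated with a minimal sketch. The function name, weighting scheme, and toy values below are illustrative assumptions, not the paper's actual formulation: the residual compares the speed implied by two estimated camera centres against a network-predicted speed, and would be appended to the usual reprojection residuals in the optimizer.

```python
import numpy as np

def speed_regularizer(t_prev, t_curr, dt, v_pred, weight=1.0):
    """Soft scale constraint for bundle adjustment (illustrative sketch).

    t_prev, t_curr : 3-vectors, camera centres of consecutive keyframes
    dt             : time elapsed between the frames (seconds)
    v_pred         : speed predicted by the CNN for this frame pair (m/s)
    weight         : relative strength versus the reprojection residuals

    Returns a scalar residual that is zero when the map's scale agrees
    with the predicted speed, and grows as the scale drifts.
    """
    v_est = np.linalg.norm(t_curr - t_prev) / dt
    return weight * (v_est - v_pred)

# Toy usage: camera centres 0.5 m apart over 0.1 s imply 5 m/s,
# so a matching prediction yields a (near-)zero residual.
r_ok = speed_regularizer(np.zeros(3), np.array([0.5, 0.0, 0.0]), 0.1, v_pred=5.0)
# If the map has drifted to half scale, the residual becomes negative,
# pulling the optimizer back toward the predicted speed.
r_drift = speed_regularizer(np.zeros(3), np.array([0.25, 0.0, 0.0]), 0.1, v_pred=5.0)
```

Because the constraint enters as a weighted residual rather than a hard rescaling of the map, the optimizer can trade it off against reprojection error, which is one way to read the abstract's claim of avoiding a "sudden imposition of scale knowledge".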
cameras, feature extraction, image reconstruction, image representation, image sequences, learning (artificial intelligence), mobile robots, neural nets, pose estimation, robot vision, SLAM (robots), video signal processing
D. Frost, D. Murray and V. Prisacariu, "Using Learning of Speed to Stabilize Scale in Monocular Localization and Mapping," 2017 International Conference on 3D Vision (3DV), Qingdao, China, 2018, pp. 527-536.