Issue No. 02 - Feb. (2015 vol. 37)
Marc Peter Deisenroth, Department of Computing, Imperial College London, 180 Queen’s Gate, London SW7 2AZ, United Kingdom
Dieter Fox, Department of Computer Science & Engineering, University of Washington, Box 352350, Seattle,
Carl Edward Rasmussen, Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, United Kingdom
Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning reduces the amount of engineering knowledge that is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.
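To make the idea of a probabilistic Gaussian process transition model concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the one-dimensional toy dynamics, the fixed kernel hyperparameters, and the names `rbf_kernel` and `gp_predict` are assumptions chosen for illustration, and the hyperparameters are fixed by hand rather than learned from data. The sketch fits a zero-mean GP to observed state differences and shows that the predictive variance grows for state-action pairs far from the training data, which is the model uncertainty the abstract refers to.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between inputs a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_predict(x_train, y_train, x_test, noise_var=2.5e-3):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    chol = np.linalg.cholesky(k_tt)  # k_tt = chol @ chol.T
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, var

# Hypothetical toy system (an assumption, not from the paper):
# x_{t+1} = x_t + 0.1 * sin(x_t) + u_t + noise
rng = np.random.default_rng(0)
states = rng.uniform(-2.0, 2.0, size=(30, 1))
actions = rng.uniform(-0.5, 0.5, size=(30, 1))
inputs = np.hstack([states, actions])  # GP inputs: (state, action) pairs
# GP targets: noisy state differences x_{t+1} - x_t
targets = (0.1 * np.sin(states) + actions).ravel() + 0.05 * rng.standard_normal(30)

near = np.array([[0.0, 0.0]])  # inside the region covered by the data
far = np.array([[6.0, 0.0]])   # far outside the region covered by the data
mean_near, var_near = gp_predict(inputs, targets, near)
mean_far, var_far = gp_predict(inputs, targets, far)

print("near data: mean", mean_near[0], "variance", var_near[0])
print("far from data: mean", mean_far[0], "variance", var_far[0])
```

Far from the data the prediction falls back to the prior, with variance close to the signal variance, so a planner that propagates this variance can account for regions where the learned model is unreliable.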
Computational modeling, Probabilistic logic, Approximation methods, Robots, Uncertainty, Data models, Predictive models, Reinforcement learning, Policy search, Robotics, Control, Gaussian processes, Bayesian inference
Marc Peter Deisenroth, Dieter Fox, Carl Edward Rasmussen, "Gaussian Processes for Data-Efficient Learning in Robotics and Control", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 37, no. 2, pp. 408-423, Feb. 2015, doi:10.1109/TPAMI.2013.218