5th Brazilian Symposium on Neural Networks
Safe Inclusion of Information about Rates of Variation in a Reinforcement Learning Algorithm
Belo Horizonte, MG, Brazil
December 09-December 11
ISBN: 0-8186-8629-4
There is a need to enhance reinforcement learning techniques by using prior knowledge built into the agent at its inception. The information crudeness upon which those algorithms operate may be interesting from a theoretical point of view, but large-scale problems are made too difficult and unrealistic by considering the learning agent as a "tabula rasa". Nonetheless, knowledge must be embedded in such a way that the structural, well-studied characteristics of the fundamental algorithms are maintained. A more general formulation of a classical reinforcement learning method is investigated in this article. It allows for a spreading of information derived from single updates towards a neighbourhood of the instantly visited state, and converges to optimality. We show how this new formulation can be used as a mechanism to safely embed prior knowledge about expected rates of variation of action values, and practical studies demonstrate an application of the proposed algorithm.
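The abstract does not give the update rule, so the following is only a rough sketch of what "spreading" a single update to a neighbourhood of the visited state could look like, assuming the classical method is tabular Q-learning on a 1-D chain of integer states. The names `spreading` and `spread_q_update`, the exponential weighting, and the `width` parameter are hypothetical and are not taken from the paper; the sketch also makes no attempt to reproduce the conditions under which the article shows convergence to optimality.

```python
import math


def spreading(visited, state, width):
    """Hypothetical spreading weight: 1 at the visited state, decaying with distance.

    With width <= 0 only the visited state gets a non-zero weight, which
    reduces the rule below to an ordinary one-step Q-learning update.
    """
    if width <= 0.0:
        return 1.0 if state == visited else 0.0
    return math.exp(-abs(state - visited) / width)


def spread_q_update(q, visited, action, reward, next_state, alpha, gamma, width):
    """Apply one Q-learning update, spread over all states of a 1-D chain.

    q is a list of per-state lists of action values and is modified in place.
    The target is computed once from the observed transition and then shared
    with neighbouring states, scaled by the spreading weight (an assumption).
    """
    target = reward + gamma * max(q[next_state])
    for state in range(len(q)):
        weight = alpha * spreading(visited, state, width)
        q[state][action] += weight * (target - q[state][action])
    return q


# Example: five states, two actions, one observed transition from state 2.
q = [[0.0, 0.0] for _ in range(5)]
spread_q_update(q, visited=2, action=1, reward=1.0, next_state=3,
                alpha=0.5, gamma=0.9, width=1.0)
```

In this sketch, a larger `width` corresponds to assuming that action values vary slowly across neighbouring states, while a `width` of zero assumes nothing and leaves the standard update untouched. That mapping is one possible reading of "prior knowledge about expected rates of variation", not a statement of the article's construction.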
Citation:
Carlos H.C. Ribeiro, "Safe Inclusion of Information about Rates of Variation in a Reinforcement Learning Algorithm," sbrn, pp.2, 5th Brazilian Symposium on Neural Networks, 1998