IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00)-Volume 3
On-Line EM Reinforcement Learning
Como, Italy
July 24-July 27
ISBN: 0-7695-0619-4
Junichiro Yoshimoto, Nara Institute of Science and Technology
Shin Ishii, Nara Institute of Science and Technology and CREST, Japan Science and Technology Corporation
Masa-aki Sato, ATR Human Information Processing Research Laboratories and CREST, Japan Science and Technology Corporation
In this article, we propose a new reinforcement learning (RL) method for systems with continuous state and action spaces. The method has an architecture similar to the actor-critic model. The critic approximates the Q-function, which is the expected future return for the current state-action pair. The actor approximates a stochastic soft-max policy defined by the Q-function; under this policy, an action with a higher Q-function value is more likely to be selected. Both the critic and the actor are trained with the on-line EM algorithm. We apply the method to two control problems. Computer simulations show that it acquires good control in both tasks after a few learning trials.
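The soft-max policy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formula, so the Boltzmann form with a `temperature` parameter, the discretized `candidate_actions` grid (the paper itself treats continuous actions), and the toy Q-values are all assumptions made for illustration.

```python
import numpy as np

def softmax_policy(q_values, temperature=1.0):
    """Return action probabilities proportional to exp(Q / temperature).

    Assumption: a Boltzmann-style soft-max; the abstract does not give
    the exact form used in the paper.
    """
    q = np.asarray(q_values, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    z = (q - q.max()) / temperature
    weights = np.exp(z)
    return weights / weights.sum()

def sample_action(candidate_actions, q_values, rng, temperature=1.0):
    """Draw one action from the soft-max distribution over the candidates."""
    probs = softmax_policy(q_values, temperature)
    index = rng.choice(len(candidate_actions), p=probs)
    return candidate_actions[index]

# Hypothetical example: five candidate actions on a grid, with a toy
# Q-function that peaks at action 0.5 (illustrative values only).
candidate_actions = np.linspace(-1.0, 1.0, 5)
q_values = -(candidate_actions - 0.5) ** 2
probs = softmax_policy(q_values)

rng = np.random.default_rng(0)
action = sample_action(candidate_actions, q_values, rng)
```

In this sketch, lowering `temperature` concentrates probability on the highest-valued action, while raising it moves the policy toward uniform exploration.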
Citation:
Junichiro Yoshimoto, Shin Ishii, Masa-aki Sato, "On-Line EM Reinforcement Learning," IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00), vol. 3, p. 3163, 2000