2006 First International Multi-Symposiums on Computer and Computational Sciences
Historical Temporal Difference Learning: Some Initial Results
Hangzhou, Zhejiang, China
June 20-24, 2006
ISBN: 0-7695-2581-4
In this paper, we develop a multi-step prediction algorithm that is guaranteed to converge when used with general function approximation. The new algorithm should also satisfy two requirements. First, it need not be faster than TD(0) in the look-up table representation, but it should be faster than the residual gradient method. Second, it should learn optimally.
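For orientation, the two baselines named in the abstract, TD(0) and the residual gradient method, can be sketched as single-step weight updates. This is a minimal illustration and not the algorithm proposed in the paper: it assumes a linear value function V(s) = w . phi(s) and deterministic transitions, and the function names (`td_error`, `td0_update`, `residual_gradient_update`) and the step-size and discount parameters are hypothetical choices made here.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def td_error(w: Vector, phi_s: Vector, phi_next: Vector,
             reward: float, gamma: float) -> float:
    """One-step TD error: r + gamma * V(s') - V(s), with V(s) = w . phi(s)."""
    return reward + gamma * dot(w, phi_next) - dot(w, phi_s)


def td0_update(w: Vector, phi_s: Vector, phi_next: Vector,
               reward: float, gamma: float, alpha: float) -> Vector:
    """TD(0): move w along phi(s) only (a semi-gradient step)."""
    delta = td_error(w, phi_s, phi_next, reward, gamma)
    return [wi + alpha * delta * fs for wi, fs in zip(w, phi_s)]


def residual_gradient_update(w: Vector, phi_s: Vector, phi_next: Vector,
                             reward: float, gamma: float,
                             alpha: float) -> Vector:
    """Residual gradient: gradient descent on the squared TD error, so the
    step direction is phi(s) - gamma * phi(s').
    Assumes deterministic transitions (illustrative simplification)."""
    delta = td_error(w, phi_s, phi_next, reward, gamma)
    return [wi + alpha * delta * (fs - gamma * fn)
            for wi, fs, fn in zip(w, phi_s, phi_next)]


if __name__ == "__main__":
    # Hypothetical two-feature example: one transition s -> s' with reward 1.
    w0 = [0.0, 0.0]
    phi_s, phi_next = [1.0, 0.0], [0.0, 1.0]
    print(td0_update(w0, phi_s, phi_next, 1.0, 0.5, 0.1))
    print(residual_gradient_update(w0, phi_s, phi_next, 1.0, 0.5, 0.1))
```

The only difference between the two updates is the step direction: TD(0) ignores the dependence of the target on w, while the residual gradient method includes it. That is the source of the latter's convergence guarantee and, typically, its slower learning.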
Index Terms:
Multi-step Prediction, Reinforcement Learning, Temporal Difference Learning
Citation:
Hengshuai Yao, Diao Dongcui, Zengqi Sun, "Historical Temporal Difference Learning: Some Initial Results," IMSCCS, vol. 2, pp. 678-685, 2006 First International Multi-Symposiums on Computer and Computational Sciences, 2006.