Learning to Reach Optimal Equilibrium by Influence of Other Agents Opinion
7th International Conference on Hybrid Intelligent Systems (HIS 2007)
Kaiserslautern, Germany
September 17-September 19
ISBN: 0-7695-2946-1
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/HIS.2007.61
In this work, the authors extend the reinforcement learning paradigm for multi-agent systems called "Influence Value Reinforcement Learning" (IVRL). In previous work, an algorithm for repeated games was proposed, and it outperformed traditional paradigms. Here, the authors define an algorithm based on this paradigm for cases where agents have to learn from delayed rewards, that is, an influence value reinforcement learning algorithm for two-agent stochastic games. The IVRL paradigm is based on the social interaction of people, especially the fact that people tell each other what they think about their actions and that this opinion influences each other's behavior. A modified version of the Q-Learning algorithm using this paradigm was constructed. The resulting IVQ-Learning algorithm was implemented and compared with versions of Q-Learning for independent learning (IQ-Learning) and joint action learning (JAQ-Learning). The approach is shown to have a higher probability of converging to an optimal equilibrium than the IQ-Learning and JAQ-Learning algorithms, especially when exploration increases.
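The abstract does not state the IVQ-Learning update rule, so the following is only a minimal sketch of the general idea: a tabular Q-Learning update with an extra term for the other agent's opinion. Everything here is an assumption made for illustration and not the authors' formulation: the class `IVQAgent`, the helper `joint_step`, the influence coefficient `beta`, and the choice to use the other agent's own temporal-difference error as its "opinion".

```python
from collections import defaultdict


class IVQAgent:
    """Hypothetical influence-value Q-learner (illustrative, not the paper's exact rule)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, beta=0.5):
        self.alpha = alpha    # learning rate
        self.gamma = gamma    # discount factor for delayed rewards
        self.beta = beta      # assumed weight given to the other agent's opinion
        # Q-table: state -> list of action values, initialised to zero
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def opinion(self, s, a, r, s_next):
        # Assumption: an agent's opinion about the joint outcome is its own
        # temporal-difference error (positive = better than expected).
        return r + self.gamma * max(self.q[s_next]) - self.q[s][a]

    def update(self, s, a, r, s_next, other_opinion):
        # Standard Q-Learning target plus the weighted opinion of the other agent.
        target = r + self.gamma * max(self.q[s_next]) + self.beta * other_opinion
        self.q[s][a] += self.alpha * (target - self.q[s][a])


def joint_step(agent1, agent2, s, a1, a2, r1, r2, s_next):
    """Exchange opinions first, then update both agents for one transition."""
    op1 = agent1.opinion(s, a1, r1, s_next)
    op2 = agent2.opinion(s, a2, r2, s_next)
    agent1.update(s, a1, r1, s_next, op2)
    agent2.update(s, a2, r2, s_next, op1)
```

With `beta=0` the update reduces to ordinary independent Q-Learning, which makes it easy to see what the opinion term adds. Both opinions are computed before either table is changed, so neither agent's opinion depends on the other's update within the same step.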
Citation:
Dennis Barrios-Aranibar, Luiz Marcos Garcia Goncalves, "Learning to Reach Optimal Equilibrium by Influence of Other Agents Opinion," his, pp. 198-203, 7th International Conference on Hybrid Intelligent Systems (HIS 2007), 2007