مرکز اطلاعات علمی Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Journal Paper

Title

A MODEL BASED ON ENTROPY AND LEARNING AUTOMATA FOR SOLVING STOCHASTIC GAMES

Pages

  97-106

Abstract

Stochastic games, the generalization of Markov decision processes to the multi-agent case, have long been used to model multi-agent systems and provide a suitable framework for multi-agent reinforcement learning. Learning automata (LA) were recently shown to be valuable tools for designing multi-agent reinforcement learning algorithms. This paper proposes a model based on learning automata and the concept of entropy for finding optimal policies in stochastic games. In the proposed model, an S-model variable-structure learning automaton is placed in each state of the game environment for each agent, and it tries to learn the optimal action probabilities in that state. The number of actions of each learning automaton equals the number of states adjacent to its state, and every joint action corresponds to a transition to an adjacent state. The entropy of the probability vector of the next state's learning automaton is used to help the learning process and improve learning performance, and it also serves as a quantitative, problem-independent measure of learning progress. We have also implemented a new version of the proposed algorithm that balances exploration with exploitation, yielding improved performance. The experimental results show that the proposed algorithm has better learning performance than the other learning algorithms in terms of cost and the speed of reaching the optimal policy.
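As a rough illustration of the two ingredients the abstract names, the sketch below shows one S-model variable-structure learning automaton together with the Shannon entropy of its action-probability vector. The abstract does not give the paper's update rule, so the code assumes the textbook S-model linear reward-inaction scheme; the class name `SModelAutomaton`, the `learning_rate` parameter, and the convention that a reward of 1 is the most favorable response are all assumptions made here for illustration, not details taken from the paper.

```python
import math
import random


class SModelAutomaton:
    """One variable-structure learning automaton with a continuous response.

    Assumption: linear reward-inaction update (not confirmed by the source).
    """

    def __init__(self, num_actions, learning_rate=0.1, rng=None):
        if num_actions < 2:
            raise ValueError("an automaton needs at least two actions")
        # Start from the uniform distribution (maximum entropy).
        self.p = [1.0 / num_actions] * num_actions
        self.a = learning_rate
        self.rng = rng or random.Random()

    def choose(self):
        """Sample an action index according to the current probabilities."""
        return self.rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, reward):
        """Shift probability toward `action` in proportion to `reward`.

        `reward` is assumed to lie in [0, 1], with 1 the most favorable.
        """
        if not 0.0 <= reward <= 1.0:
            raise ValueError("reward must lie in [0, 1]")
        step = self.a * reward
        for j in range(len(self.p)):
            if j == action:
                self.p[j] += step * (1.0 - self.p[j])
            else:
                self.p[j] -= step * self.p[j]

    def entropy(self):
        """Shannon entropy (in bits) of the action-probability vector."""
        return -sum(q * math.log2(q) for q in self.p if q > 0.0)
```

Under these assumptions, the entropy starts at its maximum (log2 of the number of actions) and falls as the automaton commits to an action, which is the sense in which entropy can act as a problem-independent indicator of learning progress. How the paper feeds the next state's entropy back into the reinforcement signal is not stated in the abstract and is not modeled here.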

Cites

  • No record.

References

  • No record.

Cite

    APA:

    MASOUMI, BEHROUZ, & MEYBODI, M.R. (2010). A MODEL BASED ON ENTROPY AND LEARNING AUTOMATA FOR SOLVING STOCHASTIC GAMES. NASHRIYYAH-I MUHANDESI-I BARQ VA MUHANDESI-I KAMPYUTAR-I IRAN (PERSIAN), 8(2), 97-106. SID. https://sid.ir/paper/53820/en

    Vancouver:

    MASOUMI BEHROUZ, MEYBODI M.R. A MODEL BASED ON ENTROPY AND LEARNING AUTOMATA FOR SOLVING STOCHASTIC GAMES. NASHRIYYAH-I MUHANDESI-I BARQ VA MUHANDESI-I KAMPYUTAR-I IRAN (PERSIAN) [Internet]. 2010;8(2):97-106. Available from: https://sid.ir/paper/53820/en

    IEEE:

    BEHROUZ MASOUMI and M.R. MEYBODI, "A MODEL BASED ON ENTROPY AND LEARNING AUTOMATA FOR SOLVING STOCHASTIC GAMES," NASHRIYYAH-I MUHANDESI-I BARQ VA MUHANDESI-I KAMPYUTAR-I IRAN (PERSIAN), vol. 8, no. 2, pp. 97-106, 2010. [Online]. Available: https://sid.ir/paper/53820/en
