Model-building adaptive critics for semi-Markov control

Details

Title: Model-building adaptive critics for semi-Markov control
Authors: Gosavi, A.; Murray, S.; Hu, J.; Ghosh, S.
Publication date: 2012
Keywords: adaptive critics; learning algorithm; semi-Markov process; decision process
Language: English
Content provider: BazTech
Type: Article

Abstract

Adaptive (or actor) critics are a class of reinforcement learning algorithms. Generally, in adaptive critics, one starts with randomized policies and gradually updates the probability of selecting actions until a deterministic policy is obtained. Classically, these algorithms have been studied for Markov decision processes under model-free updates. Algorithms that build the model are often more stable and require less training than their model-free counterparts. We propose a new model-building adaptive critic, which builds the model during learning, for a discounted-reward semi-Markov decision process under some assumptions on the structure of the process. We illustrate the use of our algorithm with numerical results on a system with 10 states and a real-world case study from management science.
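
The general idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the update rules, step sizes, softmax policy, preference bound, and the toy two-state process below are all assumptions chosen for illustration. The sketch only shows the ingredients the abstract names: a randomized policy whose action probabilities are updated gradually, a model (transition frequencies, mean rewards, mean discount factors) estimated during learning, and discounting over random sojourn times as in a discounted-reward semi-Markov decision process.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(step, n_states, n_actions, gamma=0.1, n_iters=20000,
          alpha=0.05, beta=0.01, h_max=20.0, seed=0):
    """Hypothetical model-building actor-critic loop for a discounted SMDP.

    `step(s, a, rng)` is an assumed simulator interface returning
    (next_state, lump_sum_reward, sojourn_time).
    """
    rng = random.Random(seed)
    S, A = range(n_states), range(n_actions)
    # Learned model: transition counts, mean reward, mean discount factor.
    n_sas = [[[0] * n_states for _ in A] for _ in S]
    n_sa = [[0] * n_actions for _ in S]
    r_hat = [[0.0] * n_actions for _ in S]
    d_hat = [[[0.0] * n_states for _ in A] for _ in S]
    value = [0.0] * n_states                # critic: state values
    prefs = [[0.0] * n_actions for _ in S]  # actor: action preferences

    s = 0
    for _ in range(n_iters):
        probs = softmax(prefs[s])
        a = rng.choices(list(A), weights=probs)[0]
        s2, reward, tau = step(s, a, rng)

        # Model-building step: update running estimates from the sample.
        n_sa[s][a] += 1
        n_sas[s][a][s2] += 1
        r_hat[s][a] += (reward - r_hat[s][a]) / n_sa[s][a]
        disc = math.exp(-gamma * tau)  # continuous-time discounting
        d_hat[s][a][s2] += (disc - d_hat[s][a][s2]) / n_sas[s][a][s2]

        # Model-based estimate of the action value under the current critic.
        q = r_hat[s][a] + sum(
            n_sas[s][a][j] / n_sa[s][a] * d_hat[s][a][j] * value[j] for j in S
        )
        delta = q - value[s]
        value[s] += alpha * delta
        # Keep preferences bounded (an assumed safeguard).
        prefs[s][a] = max(-h_max, min(h_max, prefs[s][a] + beta * delta))
        s = s2

    return [softmax(p) for p in prefs], value


def toy_step(s, a, rng):
    """Toy two-state process: action 1 pays 1, action 0 pays 0; same dynamics."""
    s2 = rng.randrange(2)
    tau = rng.uniform(0.5, 1.5)
    return s2, float(a), tau


policy, value = train(toy_step, n_states=2, n_actions=2)
```

In this toy process, action 1 is better in every state, so the selection probabilities returned in `policy` should drift toward action 1 as training proceeds, moving from a randomized policy toward a deterministic one.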
