B1794
Title: Non-parametric recursive regression for $Q$ estimation in actor-critic reinforcement learning
Authors: Leo Grill - University of Poitiers (France) [presenting]
Yousri Slaoui - University of Poitiers (France)
Stephane Le Masson - Orange (France)
David Nortershauser - Orange (France)
Abstract: An algorithm is presented for estimating the $Q$ value in actor-critic reinforcement learning. The non-parametric estimator is updated recursively as interactions between the actor and the environment generate the data. By controlling bias and variance, the estimator helps the model converge.
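
To illustrate what a recursively updated non-parametric estimate of $Q$ might look like, the following is a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a Gaussian kernel, a bandwidth that shrinks as h_n = h0 * n^(-decay), and a fixed set of query points in state-action space. The class name `RecursiveKernelQ`, its parameters, and the choice of regression target (for example an observed return or a temporal-difference target) are all hypothetical.

```python
import math


class RecursiveKernelQ:
    """Recursive Nadaraya-Watson-style estimate of Q at fixed query points.

    Illustrative assumption only: the kernel, the bandwidth schedule and the
    target definition are not taken from the abstract.
    """

    def __init__(self, query_points, h0=1.0, decay=0.2):
        # Each query point is a flattened (state, action) vector.
        self.query_points = [tuple(p) for p in query_points]
        self.h0 = h0
        self.decay = decay
        self.n = 0
        # Running kernel-weighted sums; no past sample is stored.
        self.num = [0.0] * len(self.query_points)
        self.den = [0.0] * len(self.query_points)

    def update(self, x, target):
        """Fold one new (state-action, target) observation into the sums."""
        self.n += 1
        h = self.h0 * self.n ** (-self.decay)  # bandwidth shrinks with n
        for i, q in enumerate(self.query_points):
            sq_dist = sum((a - b) ** 2 for a, b in zip(q, x))
            w = math.exp(-0.5 * sq_dist / h ** 2)  # unnormalised Gaussian weight
            self.num[i] += w * target
            self.den[i] += w

    def estimate(self, i):
        """Current Q estimate at query point i (0.0 before any data)."""
        return self.num[i] / self.den[i] if self.den[i] > 0 else 0.0


# Usage: feed observations one at a time, as the actor would generate them.
q_hat = RecursiveKernelQ(query_points=[(0.0, 0.0)])
q_hat.update((0.0, 0.0), target=1.0)
q_hat.update((0.0, 0.0), target=3.0)
print(q_hat.estimate(0))
```

In a sketch of this kind the bandwidth schedule is what trades bias against variance: a slower decay averages over more neighbours (lower variance, higher bias), while a faster decay does the opposite.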