Optimization Online


A Q-Learning Algorithm with Continuous State Space

Kengy Barty (kengy.barty***at***edf.fr)
Pierre Girardeau (pierre.girardeau***at***ensta.fr)
Jean-Sebastien Roy (jean-sebastien.roy***at***edf.fr)
Cyrille Strugarek (cyrille.strugarek***at***edf.fr)

Abstract: In this paper we study a Markov Decision Problem (MDP) with continuous state space and discrete decision variables. We propose an extension to this setting of the Q-learning algorithm, introduced by Watkins in 1989 for completely discrete MDPs. Our algorithm relies on stochastic approximation and functional estimation, and uses kernels to locally update the Q-functions. We give a convergence proof for this algorithm under usual assumptions. Finally, we illustrate our algorithm by solving the classical mountain car task with continuous state space.

Keywords: Q-Learning, Continuous state space, kernels

Category 1: Other Topics (Dynamic Programming )

Category 2: Stochastic Programming


Entry Submitted: 09/23/2006
Entry Accepted: 10/01/2006
Entry Last Modified: 09/23/2006
