Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization

Somayeh Moazeni (smoazeni***at***stevens.edu)
Warren Powell (powell***at***princeton.edu)
Boris Defourny (defourny***at***lehigh.edu)
Belgacem Bouzaiene-Ayari (belgacem***at***princeton.edu)

Abstract: This paper presents an approach to non-stationary policy search for finite-horizon, discrete-time Markovian decision problems with large state spaces, constrained action sets, and a risk-sensitive optimality criterion. The methodology models time-variant policy parameters with a non-parametric response surface model for an indirect parametrized policy motivated by the Bellman equation. The interpolating approximation makes it possible to adjust the level of non-stationarity of the policy and, consequently, the size of the resulting search problem. The computational tractability and the generality of the approach follow from a nested parallel implementation of derivative-free optimization in conjunction with Monte Carlo simulation. We illustrate the efficiency of the approach on an optimal energy storage charging problem in which a risk functional of the cost is minimized. We observe that the improvement achieved from non-stationarity depends on the risk functional and is particularly significant for the Value-at-Risk.
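
Illustrative sketch (not the authors' implementation): the idea of tuning a small number of knots that are interpolated into time-varying policy parameters, and searching over those knots by simulation, can be shown on a toy storage problem. Everything below is an assumption made for illustration: the threshold charging rule, the synthetic price model, the function names, linear interpolation standing in for the non-parametric response surface, and plain random search standing in for the nested parallel derivative-free optimizer.

```python
import numpy as np

T = 24  # hypothetical horizon (hourly steps over one day)

def sample_prices(n_paths, seed=0):
    """Synthetic price paths: a daily sine pattern plus Gaussian noise (assumed model)."""
    rng = np.random.default_rng(seed)
    base = 30.0 + 10.0 * np.sin(2.0 * np.pi * np.arange(T) / T)
    return base + 5.0 * rng.standard_normal((n_paths, T))

def expand(knots):
    """Interpolate K knot values into T time-varying parameters.
    K = 1 gives a stationary policy; larger K allows more non-stationarity."""
    knots = np.asarray(knots, dtype=float)
    if len(knots) == 1:
        return np.full(T, knots[0])
    knot_times = np.linspace(0, T - 1, len(knots))
    return np.interp(np.arange(T), knot_times, knots)

def simulate_costs(theta, prices, capacity=5.0, rate=1.0, demand=1.0):
    """Total cost per price path under the rule: charge when price < theta[t],
    otherwise serve demand from storage as far as the stored energy allows."""
    n_paths = prices.shape[0]
    level = np.zeros(n_paths)
    cost = np.zeros(n_paths)
    for t in range(T):
        charge = prices[:, t] < theta[t]
        # Charging is limited by the rate and by the remaining capacity.
        buy = np.where(charge, np.minimum(rate, capacity - level), 0.0)
        # Discharging is limited by demand and by the stored energy.
        use = np.where(charge, 0.0, np.minimum(demand, level))
        level = level + buy - use
        cost += prices[:, t] * (buy + demand - use)
    return cost

def value_at_risk(costs, alpha=0.95):
    """Empirical Value-at-Risk: the alpha-quantile of the simulated costs."""
    return float(np.quantile(costs, alpha))

def search(prices, n_knots, iters=200, step=2.0, seed=0):
    """Random search over the knots, reusing the same price paths for every
    candidate (common random numbers) so comparisons are not noise-driven."""
    rng = np.random.default_rng(seed)
    best = np.full(n_knots, prices.mean())
    best_val = value_at_risk(simulate_costs(expand(best), prices))
    for _ in range(iters):
        cand = best + step * rng.standard_normal(n_knots)
        val = value_at_risk(simulate_costs(expand(cand), prices))
        if val < best_val:
            best, best_val = cand, val
    return best, best_val
```

Calling `search(sample_prices(500), n_knots=1)` and `search(sample_prices(500), n_knots=4)` compares a stationary threshold with a mildly non-stationary one; the number of knots controls the dimension of the search problem, which is the trade-off the abstract describes.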

Keywords: dynamic optimization, cost function approximation (CFA) policy, stochastic optimization

Category 1: Stochastic Programming

Category 2: Optimization Software and Modeling Systems (Parallel Algorithms)

Category 3: Other Topics (Dynamic Programming)


Entry Submitted: 09/12/2015
Entry Accepted: 09/12/2015
Entry Last Modified: 05/06/2017

