-

 

 

 




Optimization Online





 

Information Relaxation Bounds for Infinite Horizon Markov Decision Processes

David Brown (dbbrown***at***duke.edu)
Martin Haugh (mh2078***at***columbia.edu)

Abstract: We consider the information relaxation approach for calculating performance bounds for stochastic dynamic programs (DPs), following Brown, Smith, and Sun (2010). This approach generates performance bounds by solving problems with relaxed nonanticipativity constraints and a penalty that punishes violations of these constraints. In this paper, we study infinite horizon DPs with discounted costs and consider applying information relaxations to reformulations of the DP. These reformulations use different state transition functions and correct for the change in state transition probabilities by multiplying by likelihood ratio factors. These reformulations can greatly simplify solution of the information relaxations, both in leading to finite horizon subproblems and by reducing the number of states that need to be considered in these subproblems. We show that any reformulation leads to a lower bound on the optimal cost of the DP when used with an information relaxation and a penalty built from a broad class of approximate value functions. We refer to this class of approximate value functions as \emph{subsolutions}, and this includes approximate value functions based on Lagrangian relaxations as well as those based on approximate linear programs. We show that the method, in theory, recovers a tight lower bound using any reformulation, and is guaranteed to improve upon the lower bounds from subsolutions. Finally, we apply the method to an inventory control application with an autoregressive demand process, as well as dynamic service allocation in a multiclass queue. In our examples, we find that the information relaxation lower bounds are easy to calculate and are very close to the expected cost using simple heuristic policies, thereby showing that these heuristic policies are nearly optimal.

Keywords: dynamic programming, Markov decision processes, multiclass queues

Category 1: Other Topics (Dynamic Programming )

Category 2: Stochastic Programming

Category 3: Applications -- OR and Management Sciences (Scheduling )

Citation: Duke University, September, 2014. Brown, D.B., J.E. Smith, and P. Sun. 2010. Information relaxations and duality in stochastic dynamic programs. Operations Research 58(4) 785-801.

Download: [PDF]

Entry Submitted: 09/06/2014
Entry Accepted: 09/08/2014
Entry Last Modified: 12/30/2016

Modify/Update this entry


  Visitors Authors More about us Links
  Subscribe, Unsubscribe
Digest Archive
Search, Browse the Repository

 

Submit
Update
Policies
Coordinator's Board
Classification Scheme
Credits
Give us feedback
Optimization Journals, Sites, Societies
Mathematical Optimization Society