Kybernetika 48 no. 2, 268-286, 2012

An unbounded Berge's minimum theorem with applications to discounted Markov decision processes

Raúl Montes-de-Oca and Enrique Lemus-Rodríguez

Abstract:

This paper deals with a certain class of unbounded optimization problems that depend on a parameter. First, conditions are established that guarantee the continuity, with respect to the parameter, of the minimum of the optimization problems under consideration, as well as the upper semicontinuity of the multifunction that maps each parameter to its set of minimizers. Moreover, under the additional assumption that the minimizer is unique, its continuity is obtained. Some examples of nonconvex optimization problems satisfying the conditions of the article are supplied. Second, the theory developed is applied to discounted Markov decision processes with unbounded cost functions and possibly noncompact action sets in order to obtain continuous optimal policies. This part of the paper is illustrated with two examples of the controlled Lindley random walk, one of which has nonconstant action sets.
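For context, the following is a minimal statement of the classical, compact-valued Berge minimum theorem, which the article extends to unbounded objective functions and possibly noncompact constraint sets; the notation ($\Theta$ for the parameter space, $\Gamma$ for the constraint multifunction, $u$ for the objective) is illustrative background and is not taken from the article itself.

```latex
% Classical (compact-valued) Berge minimum theorem -- background sketch only;
% the paper relaxes compactness and boundedness assumptions.
\begin{theorem}[Berge, compact case]
Let $X$ and $\Theta$ be metric spaces, let $u\colon X\times\Theta\to\mathbb{R}$
be continuous, and let $\Gamma\colon\Theta\rightrightarrows X$ be a continuous,
compact-valued multifunction. Then the value function
\[
  m(\theta) \;=\; \min_{x\in\Gamma(\theta)} u(x,\theta)
\]
is continuous on $\Theta$, and the minimizer multifunction
\[
  \Gamma^{*}(\theta) \;=\; \{\, x\in\Gamma(\theta) : u(x,\theta)=m(\theta) \,\}
\]
is upper semicontinuous with nonempty compact values. If, in addition,
$\Gamma^{*}(\theta)$ is a singleton for every $\theta$, then the selector
$\theta\mapsto x^{*}(\theta)$ with $\Gamma^{*}(\theta)=\{x^{*}(\theta)\}$
is continuous.
\end{theorem}
```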

Keywords:

Berge's minimum theorem, moment function, discounted Markov decision process, uniqueness of the optimal policy, continuous optimal policy

Classification:

90C40, 93E20, 90A16

References:

  1. C. D. Aliprantis and K. C. Border: Infinite Dimensional Analysis. Third Edition. Springer-Verlag, Berlin 2006.
  2. R. B. Ash: Real Variables with Basic Metric Space Topology. IEEE Press, New York 1993.
  3. J. P. Aubin and I. Ekeland: Applied Nonlinear Analysis. John Wiley, New York 1984.
  4. L. M. Ausubel and R. J. Deneckere: A generalized theorem of the maximum. Econom. Theory 3 (1993), 99-107.
  5. C. Berge: Topological Spaces. Oliver and Boyd, Edinburgh and London 1963 (reprinted by Dover Publications, Inc., Mineola, New York 1997).
  6. D. Cruz-Suárez, R. Montes-de-Oca and F. Salem-Silva: Conditions for the uniqueness of optimal policies of discounted Markov decision processes. Math. Methods Oper. Res. 60 (2004), 415-436.
  7. J. Dugundji: Topology. Allyn and Bacon, Inc., Boston 1966.
  8. P. K. Dutta, M. K. Majumdar and R. K. Sundaram: Parametric continuity in dynamic programming problems. J. Econom. Dynam. Control 18 (1994), 1069-1092.
  9. P. K. Dutta and T. Mitra: Maximum theorems for convex structures with an application to the theory of optimal intertemporal allocations. J. Math. Econom. 18 (1989), 77-86.
  10. O. Hernández-Lerma and J. B. Lasserre: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer-Verlag, New York 1996.
  11. O. Hernández-Lerma and W. J. Runggaldier: Monotone approximations for convex stochastic control problems. J. Math. Systems Estim. Control 4 (1994), 99-140.
  12. K. Hinderer: Lipschitz continuity of value functions in Markovian decision processes. Math. Methods Oper. Res. 60 (2005), 3-22.
  13. K. Hinderer and M. Stieglitz: Increasing and Lipschitz continuous minimizers in one-dimensional linear-convex systems without constraints: the continuous and the discrete case. Math. Methods Oper. Res. 44 (1996), 189-204.
  14. A. Horsley, A. J. Wrobel and T. Van Zandt: Berge's maximum theorem with two topologies on the action set. Econom. Lett. 61 (1998), 285-291.
  15. J. S. Jordan: The continuity of optimal dynamic decision rules. Econometrica 45 (1977), 1365-1376.
  16. T. Kamihigashi: Stochastic optimal growth with bounded or unbounded utility and with bounded or unbounded shocks. J. Math. Econom. 43 (2007), 477-500.
  17. T. Kamihigashi and S. Roy: A nonsmooth, nonconvex model of optimal growth. J. Econom. Theory 132 (2007), 435-460.
  18. R. B. King: Beyond the Quartic Equation. Birkhäuser, Boston 1996.
  19. M. Kitayev: Semi-Markov and jump Markov control models: average cost criterion. Theory Probab. Appl. 30 (1985), 272-288.
  20. D. V. Lindley: The theory of queues with a single server. Proc. Cambridge Philos. Soc. 48 (1952), 277-289.
  21. M. Majumdar and R. Radner: Stationary optimal policies with discounting in a stochastic activity analysis model. Econometrica 51 (1983), 1821-1837.
  22. S. P. Meyn: Ergodic theorems for discrete time stochastic systems using a stochastic Lyapunov function. SIAM J. Control Optim. 27 (1989), 1409-1439.
  23. E. A. Ok: Real Analysis with Economic Applications. Princeton University Press, Princeton 2007.
  24. A. L. Peressini, F. E. Sullivan and J. J. Uhl: The Mathematics of Nonlinear Programming. Springer-Verlag, New York 1988.
  25. M. L. Puterman: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York 1994.
  26. U. Rieder: Measurable selection theorems for optimization problems. Manuscripta Math. 24 (1978), 115-131.
  27. H. L. Royden: Real Analysis. Third Edition. Macmillan, New York 1988.
  28. R. H. Stockbridge: Time-average control of martingale problems: a linear programming formulation. Ann. Probab. 18 (1990), 291-314.
  29. R. Sundaram: A First Course in Optimization Theory. Cambridge University Press, Cambridge 1996.
  30. G. Tian and J. Zhou: The maximum theorem and the existence of Nash equilibrium of (generalized) games without lower semicontinuities. J. Math. Anal. Appl. 166 (1992), 351-364.
  31. G. Tian and J. Zhou: Transfer continuities, generalizations of the Weierstrass and maximum theorem: a full characterization. J. Math. Econom. 24 (1995), 281-303.
  32. M. Walker: A generalization of the maximum theorem. Internat. Econom. Rev. 20 (1979), 267-272.
  33. A. Yushkevich: Blackwell optimality in Borelian continuous-in-action Markov decision processes. SIAM J. Control Optim. 35 (1997), 2157-2182.