Kybernetika 58 no. 1, 101-122, 2022

Risk-sensitive Markov stopping games with an absorbing state

Jaicer López-Rivero, Rolando Cavazos-Cadena and Hugo Cruz-Suárez

DOI: 10.14736/kyb-2022-1-0101

Abstract:

This work is concerned with discrete-time Markov stopping games with two players. At each decision time player II can stop the game, paying a terminal reward to player I, or can let the system continue its evolution. In the latter case player I applies an action affecting the transitions and is entitled to receive a running reward from player II. It is supposed that player I has a nonnull and constant risk-sensitivity coefficient, and that player II tries to minimize the utility of player I. The performance of a pair of decision strategies is measured by the risk-sensitive (expected) total reward of player I and, besides mild continuity-compactness conditions, the main structural assumption on the model is the existence of an absorbing state that is accessible from every starting point. In this context, it is shown that the value function of the game is characterized by an equilibrium equation, and the existence of a Nash equilibrium is established.
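The equilibrium equation itself is not reproduced on this page. Purely as an illustration, the following sketch assumes the standard exponential-utility (certainty-equivalent) formulation of risk-sensitive total rewards: writing W(x) = exp(λV(x)), the dynamic-programming operator takes the form W(x) = min{ e^{λG(x)}, max_a e^{λr(x,a)} Σ_y p(y|x,a) W(y) }, where player II chooses between the terminal reward G and continuation, and player I maximizes over actions. All model data below (states, rewards, transition law) are hypothetical numbers, not taken from the paper.

```python
import math

# Hypothetical three-state model; state 2 is absorbing and reached
# with positive probability from every state-action pair.
states = [0, 1, 2]
ABSORB = 2
actions = [0, 1]
lam = 0.5  # player I's nonnull, constant risk-sensitivity coefficient

# r[x][a]: running reward paid by player II if the game continues at x under a
r = {0: [1.0, 0.5], 1: [0.8, 1.2]}
# G[x]: terminal reward paid by player II on stopping at x
G = {0: 5.0, 1: 1.5}
# p[x][a][y]: transition law (illustrative numbers only)
p = {0: [[0.5, 0.2, 0.3], [0.3, 0.4, 0.3]],
     1: [[0.2, 0.5, 0.3], [0.4, 0.3, 0.3]]}

def solve(tol=1e-12, max_iter=100_000):
    # Work with W(x) = exp(lam * V(x)); rewards cease at the absorbing
    # state, so W(ABSORB) = 1, i.e. V(ABSORB) = 0.
    W = {x: 1.0 for x in states}
    for _ in range(max_iter):
        newW = dict(W)
        for x in (0, 1):  # transient states only
            cont = max(math.exp(lam * r[x][a]) *
                       sum(p[x][a][y] * W[y] for y in states)
                       for a in actions)              # player I maximizes
            newW[x] = min(math.exp(lam * G[x]), cont)  # player II stops or lets it run
        if max(abs(newW[x] - W[x]) for x in states) < tol:
            W = newW
            break
        W = newW
    # Recover the value function via the certainty equivalent
    return {x: math.log(W[x]) / lam for x in states}

V = solve()
```

With these numbers the fixed point has player II stopping immediately at state 1 (so V(1) = G(1) = 1.5) and continuing at state 0, where the large terminal reward makes stopping unattractive for the minimizer; the absorption probability of 0.3 per step keeps the iteration bounded and convergent.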

Keywords:

hitting time, fixed point, equilibrium equation, bounded rewards, monotone operator, certainty equivalent

Classification:

93E20, 93C55, 60J05

References:

  1. A. Alanís-Durán and R. Cavazos-Cadena: An optimality system for finite average Markov decision chains under risk-aversion. Kybernetika 48 (2012), 83-104.
  2. E. Altman and A. Shwartz: Constrained Markov games: Nash equilibria. In: Annals of Dynamic Games (V. Gaitsgory, J. Filar, and K. Mizukami, eds.), Birkhäuser, Boston 2000, pp. 213-221.
  3. R. Atar and A. Budhiraja: A stochastic differential game for the inhomogeneous Laplace equation. Ann. Probab. 38 (2010), 2, 498-531.   DOI:10.1214/09-aop494
  4. S. Balaji and S. P. Meyn: Multiplicative ergodicity and large deviations for an irreducible Markov chain. Stoch. Proc. Appl. 90 (2000), 1, 123-144.   DOI:10.1016/s0304-4149(00)00032-6
  5. N. Bäuerle and U. Rieder: Markov Decision Processes with Applications to Finance. Springer, New York 2011.
  6. N. Bäuerle and U. Rieder: More risk-sensitive Markov decision processes. Math. Oper. Res. 39 (2014), 1, 105-120.   DOI:10.1287/moor.2013.0601
  7. N. Bäuerle and U. Rieder: Zero-sum risk-sensitive stochastic games. Stoch. Proc. Appl. 127 (2017), 2, 622-642.   DOI:10.1016/j.spa.2016.06.020
  8. T. R. Bielecki, D. Hernández-Hernández and S. R. Pliska: Risk sensitive control of finite state Markov chains in discrete time, with applications to portfolio management. Mathematical Methods of OR 50 (1999), 167-188.   DOI:10.1007/s001860050094
  9. V. S. Borkar and S. F. Meyn: Risk-sensitive optimal control for Markov decision process with monotone cost. Math. Oper. Res. 27 (2002), 1, 192-209.   DOI:10.1287/moor.27.1.192.334
  10. R. Cavazos-Cadena and D. Hernández-Hernández: A system of Poisson equations for a non-constant Varadhan functional on a finite state space. Appl. Math. Optim. 53 (2006), 101-119.   DOI:10.1007/s00245-005-0840-3
  11. R. Cavazos-Cadena and D. Hernández-Hernández: Nash equilibria in a class of Markov stopping games. Kybernetika 48 (2012), 5, 1027-1044.
  12. R. Cavazos-Cadena, L. Rodríguez-Gutiérrez and D. M. Sánchez-Guillermo: Markov stopping games with an absorbing state and total reward criterion. Kybernetika 57 (2021), 474-492.   DOI:10.14736/kyb-2021-3-0474
  13. E. V. Denardo and U. G. Rothblum: A turnpike theorem for a risk-sensitive Markov decision process with stopping. SIAM J. Control Optim. 45 (2006), 2, 414-431.   DOI:10.1137/S0363012904442616
  14. G. B. Di Masi and L. Stettner: Risk-sensitive control of discrete time Markov processes with infinite horizon. SIAM J. Control Optim. 38 (1999), 1, 61-78.   DOI:10.1137/S0363012997320614
  15. G. B. Di Masi and L. Stettner: Infinite horizon risk sensitive control of discrete time Markov processes with small risk. Syst. Control Lett. 40 (2000), 15-20.   DOI:10.1016/S0167-6911(99)00118-8
  16. G. B. Di Masi and L. Stettner: Infinite horizon risk sensitive control of discrete time Markov processes under minorization property. SIAM J. Control Optim. 46 (2007), 1, 231-252.   DOI:10.1137/040618631
  17. J. A. Filar and O. J. Vrieze: Competitive Markov Decision Processes. Springer, New York 1996.
  18. O. Hernández-Lerma: Adaptive Markov Control Processes. Springer, New York 1989.
  19. R. A. Howard and J. E. Matheson: Risk-sensitive Markov decision processes. Manage. Sci. 18 (1972), 7, 356-369.   DOI:10.1287/mnsc.18.7.356
  20. A. Jaśkiewicz: Average optimality for risk sensitive control with general state space. Ann. Appl. Probab. 17 (2007), 2, 654-675.   DOI:10.1214/105051606000000790
  21. V. N. Kolokoltsov and O. A. Malafeyev: Understanding Game Theory. World Scientific, Singapore 2010.
  22. I. Kontoyiannis and S. P. Meyn: Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab. 13 (2003), 1, 304-362.   DOI:10.1214/aoap/1042765670
  23. V. M. Martínez-Cortés: Bi-personal stochastic transient Markov games with stopping times and total reward criterion. Kybernetika 57 (2021), 1, 1-14.   DOI:10.14736/kyb-2021-1-0001
  24. G. Peskir: On the American option problem. Math. Finance 15 (2005), 1, 169-181.   DOI:10.1111/j.0960-1627.2005.00214.x
  25. G. Peskir and A. Shiryaev: Optimal Stopping and Free-Boundary Problems. Birkhäuser, Boston 2006.
  26. M. Pitera and L. Stettner: Long run risk sensitive portfolio with general factors. Math. Meth. Oper. Res. 82 (2016), 2, 265-293.   DOI:10.1007/s00186-015-0514-0
  27. M. L. Puterman: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York 1994.
  28. L. S. Shapley: Stochastic games. Proc. Nat. Acad. Sci. USA 39 (1953), 10, 1095-1100.
  29. A. Shiryaev: Optimal Stopping Rules. Springer, New York 2008.
  30. K. Sladký: Growth rates and average optimality in risk-sensitive Markov decision chains. Kybernetika 44 (2008), 2, 205-226.
  31. K. Sladký: Risk-sensitive average optimality in Markov decision processes. Kybernetika 54 (2018), 6, 1218-1230.   DOI:10.14736/kyb-2018-6-1218
  32. L. Stettner: Risk sensitive portfolio optimization. Math. Meth. Oper. Res. 50 (1999), 3, 463-474.   DOI:10.1007/s001860050081
  33. L. E. Zachrisson: Markov Games. Princeton University Press 12, Princeton 1964.