Kybernetika 60 no. 1, 1-18, 2024

Denumerable Markov stopping games with risk-sensitive total reward criterion

Manuel A. Torres-Gomar, Rolando Cavazos-Cadena and Hugo Cruz-Suárez
DOI: 10.14736/kyb-2024-1-0001

Abstract:

This paper studies Markov stopping games with two players on a denumerable state space. At each decision time, player II has two actions: to stop the game, paying a terminal reward to player I, or to let the system continue its evolution. In the latter case, player I selects an action affecting the transitions and charges a running reward to player II. The performance of each pair of strategies is measured by the risk-sensitive total expected reward of player I. Under mild continuity and compactness conditions on the components of the model, it is proved that the value of the game satisfies an equilibrium equation, and the existence of a Nash equilibrium is established.
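For orientation only, the criterion and the equilibrium property described above can be sketched as follows; the notation is an illustrative assumption (risk-sensitivity coefficient $\lambda \neq 0$, running reward $C$, terminal reward $R$, strategy $\pi$ of player I, stopping time $\tau$ chosen by player II) and need not coincide with the paper's. Under these assumptions, the risk-sensitive total reward of player I starting at state $x$ can be written as

$$ V_\lambda(x,\pi,\tau) \;=\; \frac{1}{\lambda}\,\log E_x^{\pi}\!\left[\exp\!\left(\lambda\left(\sum_{t=0}^{\tau-1} C(X_t,A_t) + R(X_\tau)\right)\right)\right], $$

and, since player I collects the rewards that player II pays, a Nash equilibrium $(\pi^*,\tau^*)$ is a saddle point in the sense that

$$ V_\lambda(x,\pi,\tau^*) \;\le\; V_\lambda(x,\pi^*,\tau^*) \;\le\; V_\lambda(x,\pi^*,\tau) $$

for every strategy $\pi$, stopping time $\tau$ and state $x$.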

Keywords:

Nash equilibrium, hitting time, fixed point, equilibrium equation, bounded rewards, monotone operator

Classification:

91A10, 91A15

References:

  1. A. Alanís-Durán and R. Cavazos-Cadena: An optimality system for finite average Markov decision chains under risk-aversion. Kybernetika 48 (2012), 1, 83-104.   CrossRef
  2. N. Bäuerle and U. Rieder: Markov Decision Processes with Applications to Finance. Springer-Verlag, New York 2011.   CrossRef
  3. N. Bäuerle and U. Rieder: More risk-sensitive Markov decision processes. Math. Oper. Res. 39 (2014), 1, 105-120.   DOI:10.1287/moor.2013.0601
  4. S. Balaji and S. P. Meyn: Multiplicative ergodicity and large deviations for an irreducible Markov chain. Stoch. Proc. Appl. 90 (2000), 1, 123-144.   DOI:10.1016/S0304-4149(00)00032-6
  5. T. Bielecki, D. Hernández-Hernández and S. R. Pliska: Risk sensitive control of finite state Markov chains in discrete time, with applications to portfolio management. Math. Methods Oper. Res. 50 (1999), 167-188.   DOI:10.1007/s001860050094
  6. V. S. Borkar and S. P. Meyn: Risk-sensitive optimal control for Markov decision process with monotone cost. Math. Oper. Res. 27 (2002), 1, 192-209.   DOI:10.1287/moor.27.1.192.334
  7. R. Cavazos-Cadena and D. Hernández-Hernández: A system of Poisson equations for a non-constant Varadhan functional on a finite state space. Appl. Math. Optim. 53 (2006), 101-119.   DOI:10.1007/s00245-005-0840-3
  8. R. Cavazos-Cadena and D. Hernández-Hernández: Nash equilibrium in a class of Markov stopping games. Kybernetika 48 (2012), 1027-1044.   CrossRef
  9. R. Cavazos-Cadena, L. Rodríguez-Gutiérrez and D. M. Sánchez-Guillermo: Markov stopping game with an absorbing state. Kybernetika 57 (2021), 3, 474-492.   DOI:10.14736/kyb-2021-3-0474
  10. R. Cavazos-Cadena, M. Cantú-Sifuentes and I. Cerda-Delgado: Nash equilibria in a class of Markov stopping games with total reward criterion. Math. Methods Oper. Res. 94 (2021), 319-340.   DOI:10.1007/s00186-021-00759-5
  11. E. V. Denardo and U. G. Rothblum: A turnpike theorem for a risk-sensitive Markov decision process with stopping. SIAM J. Control Optim. 45 (2006), 2, 414-431.   DOI:10.1137/S0363012904442616
  12. G. B. Di Masi and L. Stettner: Risk-sensitive control of discrete-time Markov processes with infinite horizon. SIAM J. Control Optim. 38 (1999), 1, 61-78.   DOI:10.1137/S0363012997320614
  13. G. B. Di Masi and L. Stettner: Infinite horizon risk sensitive control of discrete time Markov processes with small risk. Systems Control Lett. 40 (2000), 1, 305-321.   DOI:10.1016/S0167-6911(00)00018-9
  14. G. B. Di Masi and L. Stettner: Infinite horizon risk sensitive control of discrete time Markov processes under minorization property. SIAM J. Control Optim. 46 (2007), 1, 231-252.   DOI:10.1137/040618631
  15. O. Hernández-Lerma: Adaptive Markov Control Processes. Springer, New York 1988.   CrossRef
  16. R. Howard and J. Matheson: Risk-sensitive Markov decision processes. Management Science 18 (1972), 356-369.   DOI:10.1287/mnsc.18.7.356
  17. A. Jaśkiewicz: Average optimality for risk sensitive control with general state space. Ann. Appl. Probab. 17 (2007), 2, 654-675.   DOI:10.1214/105051606000000790
  18. I. Kontoyiannis and S. P. Meyn: Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab. 13 (2003), 1, 304-362.   DOI:10.1214/aoap/1042765670
  19. J. López-Rivero, R. Cavazos-Cadena and H. Cruz-Suárez: Risk-sensitive Markov stopping games with an absorbing state. Kybernetika 58 (2022), 1, 101-122.   DOI:10.14736/kyb-2022-1-0101
  20. V. M. Martínez-Cortés: Bipersonal stochastic transient Markov games with stopping times and total reward criteria. Kybernetika 57 (2021), 1, 1-14.   DOI:10.14736/kyb-2021-1-0001
  21. M. Pitera and L. Stettner: Long run risk sensitive portfolio with general factors. Math. Methods Oper. Res. 82 (2016), 2, 265-293.   DOI:10.1007/s00186-015-0514-0
  22. M. Puterman: Markov Decision Processes. Wiley, New York 1994.   CrossRef
  23. K. Sladký: Ramsey growth model under uncertainty. In: Proc. 27th International Conference Mathematical Methods in Economics 2009 (H. Brožová, ed.), Kostelec nad Černými lesy 2009, pp. 296-300.   CrossRef
  24. K. Sladký: Risk-sensitive Ramsey growth model. In: Proc. 28th International Conference Mathematical Methods in Economics 2010 (M. Houda and J. Friebelová, eds.), České Budějovice 2010, pp. 1-6.   CrossRef
  25. K. Sladký: Risk-sensitive average optimality in Markov decision processes. Kybernetika 54 (2018), 1218-1230.   DOI:10.14736/kyb-2018-6-1218
  26. L. Stettner: Risk sensitive portfolio optimization. Math. Methods Oper. Res. 50 (1999), 3, 463-474.   DOI:10.1007/s001860050081