Kybernetika 55 no. 1, 152-165, 2019

Nash ε-equilibria for stochastic games with total reward functions: an approach through Markov decision processes

Francisco J. González-Padilla and Raúl Montes-de-Oca

DOI: 10.14736/kyb-2019-1-0152

Abstract:

The main objective of this paper is to find structural conditions under which a two-player stochastic game with total reward functions has an $\epsilon$-equilibrium. To this end, results from the theory of Markov decision processes are used to find $\epsilon$-optimal strategies for each player; the correspondence of a better answer, together with a more general version of Kakutani's Fixed Point Theorem, is then used to obtain the $\epsilon$-equilibrium. Two examples illustrating the theory are also presented.

Keywords:

Nash equilibrium, Markov decision processes, stochastic games, total rewards

Classification:

91A15, 91A50, 90C40
