This work deals with a class of discrete-time zero-sum Markov games whose state process $\left\{ x_{t}\right\} $ evolves according to the equation $x_{t+1}=F(x_{t},a_{t},b_{t},\xi _{t})$, where $a_{t}$ and $b_{t}$ represent the actions of players 1 and 2, respectively, and $\left\{ \xi _{t}\right\} $ is a sequence of independent and identically distributed random variables with unknown distribution $\theta$. Assuming a possibly unbounded payoff, and using the empirical distribution to estimate $\theta$, we introduce approximation schemes for the value of the game as well as for optimal strategies, under both the discounted and the average criteria.
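For reference, a standard form of the estimator alluded to above (the notation $\theta_{t}$ and the disturbance space $S$ are introduced here only for illustration and are not taken from the abstract) is the empirical distribution built from the observed disturbances $\xi_{0},\ldots ,\xi_{t-1}$:
\[
\theta_{t}(B)\;=\;\frac{1}{t}\sum_{i=0}^{t-1}\mathbf{1}_{B}(\xi _{i}),\qquad B\in \mathcal{B}(S),
\]
where $\mathcal{B}(S)$ denotes the Borel $\sigma$-algebra on $S$; under i.i.d. sampling, $\theta_{t}$ converges to $\theta$ in a suitable sense, which is what makes estimation-based approximation schemes of this kind feasible.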
Markov games, empirical estimation, discounted and average criteria
91A15, 62G07