Kybernetika 36 no. 2, 195-210, 2000

Estimates of stability of Markov control processes with unbounded costs

Evgueni I. Gordienko and Francisco Salem

Abstract:

For a discrete-time Markov control process with transition probability $p$, we compare the total discounted costs $V_\beta(\pi_\beta)$ and $V_\beta(\tilde{\pi}_\beta)$ obtained when applying the optimal control policy $\pi_\beta$ and its approximation $\tilde{\pi}_\beta$, respectively. The policy $\tilde{\pi}_\beta$ is optimal for an approximating process with transition probability $\tilde{p}$. The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index $[V_\beta(\tilde{\pi}_\beta)-V_\beta(\pi_\beta)]/V_\beta(\pi_\beta)$. This bound does not depend on the discount factor $\beta\in (0,1)$ and is given in terms of the total variation distance between $p$ and $\tilde{p}$.
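
As a rough illustration of the quantities named in the abstract, the following sketch computes the relative stability index for a small finite model. It is not the paper's method: the paper treats general processes with possibly unbounded costs, whereas this toy example assumes finitely many states and actions, strictly positive bounded costs, and plain value iteration. All function names, the example data, and the normalization of the total variation distance (half the $\ell_1$ distance, maximized over state-action pairs) are assumptions made for illustration.

```python
# Toy finite model (an assumption; the paper covers far more general settings).
# cost[s][a]      : cost per stage in state s under action a (positive here)
# trans[s][a][t]  : probability of moving from s to t under action a

def q_value(cost, trans, beta, v, s, a):
    """One-step lookahead cost of action a in state s, given value estimate v."""
    return cost[s][a] + beta * sum(trans[s][a][t] * v[t] for t in range(len(v)))

def value_iteration(cost, trans, beta, iters=2000):
    """Return (approximately) the optimal discounted cost and a greedy policy."""
    n = len(cost)
    v = [0.0] * n
    for _ in range(iters):
        v = [min(q_value(cost, trans, beta, v, s, a) for a in range(len(cost[s])))
             for s in range(n)]
    policy = [min(range(len(cost[s])),
                  key=lambda a: q_value(cost, trans, beta, v, s, a))
              for s in range(n)]
    return v, policy

def evaluate_policy(cost, trans, beta, policy, iters=2000):
    """Discounted cost of a fixed stationary policy, by successive approximation."""
    n = len(cost)
    v = [0.0] * n
    for _ in range(iters):
        v = [q_value(cost, trans, beta, v, s, policy[s]) for s in range(n)]
    return v

def stability_index(cost, p, p_tilde, beta):
    """[V(pi_tilde) - V(pi)] / V(pi) per initial state, both costs evaluated under p."""
    v_opt, _ = value_iteration(cost, p, beta)            # V_beta(pi_beta)
    _, pi_tilde = value_iteration(cost, p_tilde, beta)   # optimal for p_tilde
    v_tilde = evaluate_policy(cost, p, beta, pi_tilde)   # V_beta(pi_tilde_beta)
    return [(v_tilde[s] - v_opt[s]) / v_opt[s] for s in range(len(cost))]

def tv_distance(p, p_tilde):
    """Max over (s, a) of half the l1 distance (normalization is an assumption)."""
    return max(0.5 * sum(abs(x - y) for x, y in zip(p[s][a], p_tilde[s][a]))
               for s in range(len(p)) for a in range(len(p[s])))

# Hypothetical example data: 2 states, 2 actions.
cost = [[1.0, 2.0], [3.0, 0.5]]
p = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
p_tilde = [[[0.8, 0.2], [0.2, 0.8]], [[0.5, 0.5], [0.3, 0.7]]]

idx = stability_index(cost, p, p_tilde, beta=0.9)
tv = tv_distance(p, p_tilde)
```

Because $\pi_\beta$ is optimal under $p$, each entry of `idx` is nonnegative up to rounding, and it vanishes when `p_tilde` coincides with `p`.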

