In this paper we are concerned with a class of time-varying discounted Markov decision models $\mathcal{M}_n$ with unbounded costs $c_n$ and state-action dependent discount factors. Specifically, we study controlled systems whose state process evolves according to the equation $x_{n+1}=G_n(x_n,a_n,\xi_n),\ n=0,1,\ldots$, where $a_n$ and $\xi_n$ are the control and the random disturbance at time $n$, respectively, and whose discount factors are of the form $\alpha_n(x_n,a_n)$. Assuming that the sequences of functions $\lbrace\alpha_n\rbrace$, $\lbrace c_n\rbrace$ and $\lbrace G_n\rbrace$ converge, in a certain sense, to $\alpha_\infty$, $c_\infty$ and $G_\infty$, respectively, our objective is to introduce a suitable control model for this class of systems and then to show the existence of optimal policies for the limit system $\mathcal{M}_\infty$ corresponding to $\alpha_\infty$, $c_\infty$ and $G_\infty$. Finally, we illustrate our results and their applicability in a class of semi-Markov control models.
discounted optimality, non-constant discount factor, time-varying Markov decision processes
93E20, 90C40