Kybernetika 50 no. 3, 378-392, 2014

About stability of risk-seeking optimal stopping

Raúl Montes-de-Oca and Elena Zaitseva

DOI: 10.14736/kyb-2014-3-0378

Abstract:

We offer a quantitative estimate of the stability of risk-sensitive cost optimization in the problem of optimal stopping of a Markov chain on a Borel space $X$. It is supposed that the transition probability $p(\cdot |x)$, $x\in X$, is approximated by the transition probability $\widetilde{p}(\cdot |x)$, $x\in X$, and that the stopping rule $\widetilde{f}_*$, which is optimal for the process with the transition probability $\widetilde{p}$, is applied to the process with the transition probability $p$. We give an upper bound (expressed in terms of the total variation distance $\sup_{x\in X}\|p(\cdot |x)-\widetilde{p}(\cdot |x)\|$) for the additional cost paid for using the rule $\widetilde{f}_*$ instead of the (unknown) stopping rule $f_*$ optimal for $p$.
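As an illustration only, the quantity $\sup_{x\in X}\|p(\cdot |x)-\widetilde{p}(\cdot |x)\|$ can be computed directly when the state space is finite. The sketch below is not taken from the paper: it assumes a finite $X$, transition probabilities stored as row-stochastic nested lists, and the convention that the total variation norm of a signed measure is the sum of the absolute values of its point masses (some authors use half of this value). The function name `sup_total_variation` and the example matrices are hypothetical.

```python
def sup_total_variation(p, p_tilde):
    """Return max over states x of sum_y |p(y|x) - p_tilde(y|x)|.

    Assumption: finite state space, so the supremum is a maximum and
    each transition probability is one row of a row-stochastic matrix.
    """
    if len(p) != len(p_tilde):
        raise ValueError("both kernels must have the same number of states")
    return max(
        sum(abs(a - b) for a, b in zip(row, row_tilde))
        for row, row_tilde in zip(p, p_tilde)
    )


# Hypothetical two-state example: p is the "true" kernel,
# p_tilde is its approximation.
p = [[0.5, 0.5], [0.25, 0.75]]
p_tilde = [[0.5, 0.5], [0.5, 0.5]]

distance = sup_total_variation(p, p_tilde)
print(distance)
```

In this example the two kernels agree in the first state and differ only in the second, so the supremum is attained there.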

Keywords:

discrete-time Markov process, optimal stopping rule, stability index, total variation metric, risk-seeking expected total cost

Classification:

60G40, 62L15


