Kybernetika 55 no. 4, 618-640, 2019

Handling a Kullback--Leibler divergence random walk for scheduling effective patrol strategies in Stackelberg security games

César U. S. Solis, Julio B. Clempner and Alexander S. PoznyakDOI: 10.14736/kyb-2019-4-0618

Abstract:

This paper presents a new model for computing optimal randomized security policies in non-cooperative Stackelberg Security Games (SSGs) for multiple players. Our framework rests upon the extraproximal method and its extension to Markov chains, within which we explicitly compute the unique Stackelberg/Nash equilibrium of the game by employing the Lagrange method and introducing the Tikhonov regularization method. We also consider a game-theory realization of the problem that involves defenders and attackers performing a discrete-time random walk over a finite state space. Following the Kullback--Leibler divergence the players' actions are fixed and, then the next-state distribution is computed. The player's goal at each time step is to specify the probability distribution for the next state. We present an explicit construction of a computationally efficient strategy under mild defenders and attackers conditions and demonstrate the performance of the proposed method on a simulated target tracking problem.

Keywords:

Markov chains, security, Stackelberg games, patrolling

Classification:

91A10, 91A35, 91A80, 91B06, 91B70, 91B74

paper.pdf

References:

N. Agmon, G. A. Kaminka and S. Kraus: Multi-robot adversarial patrolling: facing a full-knowledge opponent. J. Artif. Intell. Res. 42 (2011), 1, 887-916. CrossRef
S. Albarran and J. B. Clempner: A Stackelberg security Markov game based on partial information for strategic decision making against unexpected attacks. Engrg. Appl. Artif. Intell. 81 (2019), 408-419. DOI:10.1016/j.engappai.2019.03.010
A. S. Antipin: An extraproximal method for solving equilibrium programming problems and games. Comput. Mathematics and Math. Phys. 45 (2005), 11, 1893-1914. CrossRef
A. Blum, N and; Haghtalab and A. D. Procaccia: Lazy defenders are almost optimal against diligent attackers. In: Proc. 28th AAAI Conference on Artificial Intelligence, Québec 2014, pp. 573-579. CrossRef
J. B. Clempner: A continuous-time Markov Stackelberg security game approach for reasoning about real patrol strategies. Int. J. Control 91 (2018), 2494-2510. DOI:10.1080/00207179.2017.1371853
J. B. Clempner and A. S. Poznyak: Simple computing of the customer lifetime value: A fixed local-optimal policy approach. J. Systems Sci. Systems Engrg. 23 (2014), 4, 439-459. DOI:10.1007/s11518-014-5260-y
J. B. Clempner and A. S. Poznyak: Stackelberg security games: Computing the shortest-path equilibrium. Expert Syst. Appl. 42 (2015), 8, 3967-3979. DOI:10.1016/j.eswa.2014.12.034
J. B. Clempner and A. S. Poznyak: Analyzing an optimistic attitude for the leader firm in duopoly models: A strong Stackelberg equilibrium based on a lyapunov game theory approach. Econ. Comput. Econ. Cybern. Stud. Res. 4 (2016), 50, 41-60. CrossRef
J. B. Clempner and A. S. Poznyak: Conforming coalitions in Stackelberg security games: Setting max cooperative defenders vs. non-cooperative attackers. Appl. Soft Comput. 47 (2016), 1-11. DOI:10.1016/j.asoc.2016.05.037
J. B. Clempner and A. S. Poznyak: Conforming coalitions in Stackelberg security games: Setting max cooperative defenders vs. non-cooperative attackers. Appl. Soft Comput. 47 (2016), 1-11. DOI:10.1016/j.asoc.2016.05.037
J. B. Clempner and A. S. Poznyak: Convergence analysis for pure and stationary strategies in repeated potential games: Nash, lyapunov and correlated equilibria. Expert Systems Appl. 46 (2016), 474-484. DOI:10.1016/j.eswa.2015.11.006
J. B. Clempner and A. S. Poznyak: Using the extraproximal method for computing the shortest-path mixed lyapunov equilibrium in Stackelberg security games. Math. Comput. Simul. 138 (2017), 14-30. DOI:10.1016/j.matcom.2016.12.010
J. B. Clempner and A. S. Poznyak: A Tikhonov regularization parameter approach for solving lagrange constrained optimization problems. Engrg. Optim. 50 (2018), 11, 1996-2012. DOI:10.1080/0305215x.2017.1418866
J. B. Clempner and A. S. Poznyak: A Tikhonov regularized penalty function approach for solving polylinear programming problems. J. Comput. Appl. Math. 328 (2018), 267-286. DOI:10.1016/j.cam.2017.07.032
V. Conitzer and T. Sandholm: Computing the optimal strategy to commit to. In: Seventh ACM Conference on Electronic Commerce, Ann Arbor 2006, pp. 82-90. CrossRef
D. Guerrero, A. A. Carsteanu, R. Huerta and J. B. Clempner: An iterative method for solving Stackelberg security games: A Markov games approach. In: 14th International Conference on Electrical Engineering, Computing Science and Automatic Control, Mexico City 2017, pp. 1-6. DOI:10.1109/iceee.2017.8108857
D. Guerrero, A. A. Carsteanu, R. Huerta and J. B. Clempner: Solving Stackelberg security Markov games employing the bargaining nash approach: Convergence analysis. Computers Security 74 (2018), 240-257. DOI:10.1016/j.cose.2018.01.005
M. Jain, E. Kardes, C. Kiekintveld, F. Ordoñez and M. Tambe: Security games with arbitrary schedules: A branch and price approach. In: Proc. National Conference on Artificial Intelligence (AAAI), Atlanta 2010. DOI:10.1016/j.cose.2018.01.005
C. Kiekintveld, M. Jain, J. Tsai, J. Pita, F. Ordñez and M. Tambe: Computing optimal randomized resource allocations for massive security games. In: Proc. Eighth International Conference on Autonomous Agents and Multiagent Systems, volume 1, Budapest 2009, pp. 689-696. DOI:10.1017/cbo9780511973031.008
D. Korzhyk, Z. Yin, C. Kiekintveld, V. Conitzer and M. Tambe: Stackelberg vs. nash in security games: An extended investigation of interchangeability equivalence, and uniqueness. J. Artif. Intell. Res. 41 (2011), 297-327. DOI:10.1613/jair.3269
J. Letchford, L. MacDermed, V. Conitzer, R. Parr and C. L. Isbell: Computing optimal strategies to commit to in stochastic games. In: Proc. Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI), Toronto 2012, pp. 1380-1386. DOI:10.1145/2509002.2509011
J. Letchford and Y. Vorobeychik: Optimal interdiction of attack plans. In: Proc. Twelfth International Conference of Autonomous Agents and Multi-agent Systems (AAMAS) CrossRef
P. Paruchuri, J. P. Pearce, J. Marecki, M. Tambe, F. Ordoñez and S. Kraus: Playing games with security: An efficient exact algorithm for bayesian stackelberg games. In: Proc. Seventh International Conference on Autonomous Agents and Multiagent Systems, Estoril 2008, pp. 895-902. CrossRef
A. S. Poznyak: Advance Mathematical Tools for Automatic Control Engineers. Vol. 2 Deterministic Techniques. Elsevier, Amsterdam 2008. DOI:10.1016/b978-008044674-5.50015-8
A. S. Poznyak, K. Najim and E. Gomez-Ramirez: Self-learning Control of Finite Markov Chains. Marcel Dekker, New York 2000. CrossRef
M. Salgado and J. B. Clempner: Measuring the emotional distance using game theory via reinforcement learning: A kullback-leibler divergence approach. Expert Systems Appl. 97 (2018), 266-275. DOI:10.1016/j.eswa.2017.12.036
E. Shieh, B. An, R. Yang, M. Tambe, C. Baldwin, J. DiRenzo, B. Maule and G. Meyer: Protect: A deployed game theoretic system to protect the ports of the united states. In: Proc. 11th International Conference on Autonomous Agents and Multiagent Systems, 2012. DOI:10.1609/aimag.v33i4.2401
M. Skerker: Binary Bullets: The Ethics of Cyberwarfare, chapter Moral Concerns with Cyberspionage: Automated Keyword Searches and Data Mining, pp. 251-276. Oxford University Press, NY 2016. CrossRef
C. Solis, J. B. Clempner and A. S. Poznyak: Modeling multi-leader-follower non-cooperative Stackelberg games. Cybernetics Systems 47 (2016), 8, 650-673. DOI:10.1080/01969722.2016.1232121
K. K. Trejo, J. B. Clempner and A. S. Poznyak: Computing the stackelberg/nash equilibria using the extraproximal method: Convergence analysis and implementation details for Markov chains games. Int. J. Appl. Math. Computer Sci. 25 (2015), 2, 337-351. DOI:10.1515/amcs-2015-0026
K. K. Trejo, J. B. Clempner and A. S. Poznyak: A Stackelberg security game with random strategies based on the extraproximal theoretic approach. Engrg. Appl. Artif. Intell. 37 (2015), 145-153. DOI:10.1016/j.engappai.2014.09.002
K. K. Trejo, J. B. Clempner and A. S. Poznyak: Adapting strategies to dynamic environments in controllable stackelberg security games. In: IEEE 55th Conference on Decision and Control (CDC), Las Vegas 2016, pp. 5484-5489. DOI:10.1109/cdc.2016.7799111
K. K. Trejo, J. B. Clempner and A. S. Poznyak: An optimal strong equilibrium solution for cooperative multi-leader-follower Stackelberg Markov chains games. Kybernetika 52 (2016), 2, 258-279. DOI:10.14736/kyb-2016-2-0258
K. K. Trejo, J. B. Clempner and A. S. Poznyak: Computing the lp-strong nash equilibrium for Markov chains games. Appl. Math. Modell. 41 (2017), 399-418. DOI:10.1016/j.apm.2016.09.001
K. K. Trejo, J. B. Clempner and A. S. Poznyak: Adapting attackers and defenders preferred strategies: A reinforcement learning approach in stackelberg security games. J. Comput. System Sci. 95 (2018), 35-54. DOI:10.1016/j.jcss.2017.12.004
R. Yang, C. Kiekintveld, F. Ordonez, M. Tambe and R. John: Improving resource allocation strategy against human adversaries in security games. In: Proc. International Joint Conference on Artificial Intelligence (IJCAI), Barcelona 2011, pp. 458-464. CrossRef
Z. Yin, M. Jain, M. Tambe and F. Ordonez: Risk-averse strategies for security games with execution and observational uncertainty. In: Proc. AAAI Conference on Artificial Intelligence (AAAI), San Francisco 2011, pp. 758-763. CrossRef
Z. Yin and M. Tambe: A unified method for handling discrete and continuous uncertainty in bayesian stackelberg games. In: Proc. Eleventh International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Valencia 2012, pp. 234-242. CrossRef

Kybernetika

Journal