Kybernetika 57 no. 3, 546-566, 2021

Neural network optimal control for nonlinear system based on zero-sum differential game

Fu Xingjian and Li Zizheng

DOI: 10.14736/kyb-2021-3-0546

Abstract:

This paper addresses the constrained optimal control problem for a class of complex nonlinear systems with unknown system functions and unknown time-varying disturbances. The problem is formulated as a two-person zero-sum game and solved with approximate dynamic programming (ADP). To obtain an approximate optimal solution of the zero-sum game, multilayer neural networks serve as the evaluation network, the execution network, and the disturbance network of ADP. Lyapunov stability theory is used to prove uniform convergence, and the system control output is shown to converge to a neighborhood of the target reference value. Finally, a simulation example verifies the effectiveness of the algorithm.
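The zero-sum structure behind the abstract can be illustrated on a far simpler problem than the one the paper treats. The sketch below is not the authors' algorithm: it assumes a known, scalar, linear, discrete-time system x[k+1] = a*x[k] + b*u[k] + c*w[k] with the unconstrained stage cost q*x^2 + r*u^2 - gamma^2*w^2, where the control u minimizes and the disturbance w maximizes. Under these assumptions the value function is exactly V(x) = p*x^2, so plain value iteration on the single number p stands in for the evaluation network, and the two linear gains stand in for the execution and disturbance networks. The function name and all parameter values are hypothetical.

```python
def solve_scalar_zero_sum_game(a, b, c, q, r, gamma, tol=1e-12, max_iter=10_000):
    """Value iteration for a scalar linear-quadratic zero-sum game.

    Assumed model (illustrative only, not taken from the paper):
        x[k+1] = a*x[k] + b*u[k] + c*w[k]
        stage cost = q*x^2 + r*u^2 - gamma^2*w^2
    Returns (p, k_u, k_w) with V(x) = p*x^2, u = -k_u*x, w = k_w*x.
    """
    # Combined effect of the minimizing and maximizing players.
    s = b * b / r - c * c / gamma**2
    p = 0.0
    for _ in range(max_iter):
        # The maximization over w is only well posed while this stays positive.
        if gamma**2 - c * c * p <= 0.0:
            raise ValueError("gamma too small: no saddle point for this p")
        # Bellman (game Riccati) update for the quadratic value coefficient.
        p_next = q + a * a * p / (1.0 + s * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    denom = 1.0 + s * p
    k_u = a * b * p / (r * denom)         # minimizing player's feedback gain
    k_w = a * c * p / (gamma**2 * denom)  # maximizing player's feedback gain
    return p, k_u, k_w


if __name__ == "__main__":
    p, k_u, k_w = solve_scalar_zero_sum_game(0.9, 1.0, 0.5, 1.0, 1.0, 2.0)
    print(p, k_u, k_w)
```

In the setting the abstract describes, the system functions are unknown and the control is constrained, so no closed-form update like the one above is available; that is the gap the three neural networks are meant to fill.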

Keywords:

neural network, nonlinear system, approximate dynamic programming, zero-sum game

Classification:

93C10, 93D21, 91A80

