
Event-triggered adaptive dynamic programming for multi-player zero-sum games with unknown dynamics. (English) Zbl 1492.91015

Summary: In this paper, a novel event-triggered optimal control approach is developed to solve zero-sum game problems for continuous-time multi-player nonlinear systems with unknown dynamics. First, a model neural network (NN) is employed to reconstruct the unknown multi-player nonlinear system from measured input and output data. Then, a critic NN is used to solve the event-triggered Hamilton-Jacobi-Isaacs (HJI) equation for the multi-player zero-sum game. The optimal control law and the worst-case disturbance law are both approximated using the critic NN alone. In contrast to time-triggered methods, the developed control law and disturbance law are updated only when the triggering condition is violated, which reduces the computational and communication burden. A Lyapunov stability analysis shows that the closed-loop system is guaranteed to be stable. Finally, two simulation examples are provided to validate the effectiveness of the proposed method.
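The event-triggered update described in the summary can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar linear plant, the fixed gains `k` and `l` (standing in for the critic-NN-approximated control and disturbance laws), the relative-error triggering condition, and the threshold `sigma`. The sketch only shows the mechanism: both laws are held constant between events and recomputed when the gap between the current state and the last sampled state violates the condition.

```python
def simulate(steps=2000, dt=0.005, sigma=0.2):
    """Event-triggered feedback on a hypothetical scalar plant x' = a*x + b*u + c*w."""
    a, b, c = 1.0, 1.0, 0.5   # assumed plant parameters (illustrative only)
    k, l = 3.0, 0.5           # assumed fixed gains replacing the learned laws
    x = 1.0                   # current state
    x_hat = x                 # state sampled at the most recent event
    u = -k * x_hat            # control law, held between events
    w = l * x_hat             # disturbance law, held between events
    events = 1                # the initial sample counts as one update
    for _ in range(steps):
        gap = abs(x - x_hat)
        # Assumed triggering condition: gap <= sigma * |x|.
        # Both laws are recomputed only when it is violated.
        if gap > sigma * abs(x):
            x_hat = x
            u = -k * x_hat
            w = l * x_hat
            events += 1
        # Forward-Euler integration of the plant over one time step.
        x += dt * (a * x + b * u + c * w)
    return x, events


if __name__ == "__main__":
    x_final, events = simulate()
    print("final state:", x_final)
    print("law updates:", events, "out of 2000 time steps")
```

A time-triggered scheme would recompute `u` and `w` at every one of the 2000 steps. In this sketch the number of updates is much smaller, which is the saving in computation and communication that the summary refers to.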

MSC:

91A06 \(n\)-person games, \(n>2\)
90C39 Dynamic programming
