Discounted near-optimal regulation of constrained nonlinear systems via generalized value iteration. (English) Zbl 1527.93258

Summary: In this article, a generalized value iteration algorithm is developed to address the discounted near-optimal control problem for discrete-time systems with control constraints. The initial cost function may be an arbitrary positive semi-definite function and need not be zero. First, a nonquadratic performance functional is used to handle the difficulty caused by saturating actuators. Then, the monotonicity and convergence of the iterative cost function sequence under the discount factor are analyzed. To facilitate implementation of the iterative algorithm, two neural networks trained with the Levenberg-Marquardt algorithm are constructed to approximate the cost function and the control law. Furthermore, the initial control law is obtained by a fixed-point iteration approach. Finally, two simulation examples are provided to validate the feasibility of the proposed strategy. The resulting control laws remain within the constraints for randomly chosen initial state vectors.
{© 2021 John Wiley & Sons Ltd.}
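
The iteration described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar plant `dynamics`, the bound `U_BAR`, the weights `Q` and `R`, the discount `GAMMA`, and the grids are all hypothetical choices, and a tabular grid with linear interpolation stands in for the two neural networks. The nonquadratic control cost is assumed to take the form commonly used for saturating actuators, W(u) = 2R ∫_0^u ū·artanh(s/ū) ds, and the update is V_{i+1}(x) = min_u { Q x² + W(u) + γ V_i(x⁺) }, started from a nonzero positive semi-definite V_0.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the article.
U_BAR = 0.5        # actuator bound: |u| < U_BAR
GAMMA = 0.95       # discount factor
Q, R = 1.0, 1.0    # state and control weights


def dynamics(x, u):
    """Hypothetical scalar discrete-time plant x_{k+1} = 0.8 sin(x_k) + u_k."""
    return 0.8 * np.sin(x) + u


def utility(x, u):
    """Stage cost Q x^2 + W(u), with W(u) = 2R * int_0^u U_BAR*artanh(s/U_BAR) ds.

    The integral has the closed form used here:
    W(u) = 2 R U_BAR u artanh(u/U_BAR) + R U_BAR^2 ln(1 - (u/U_BAR)^2).
    """
    r = u / U_BAR
    w = 2.0 * R * U_BAR * u * np.arctanh(r) + R * U_BAR**2 * np.log1p(-r**2)
    return Q * x**2 + w


def generalized_value_iteration(v0, xs, us, n_iter=300):
    """Tabular value iteration started from an arbitrary function v0 >= 0."""
    X, U = np.meshgrid(xs, us, indexing="ij")
    stage = utility(X, U)
    # Successor states, clipped so they stay on the state grid.
    x_next = np.clip(dynamics(X, U), xs[0], xs[-1])

    def q_values(v):
        # Linear interpolation of the current cost function at successor states.
        v_next = np.interp(x_next.ravel(), xs, v).reshape(x_next.shape)
        return stage + GAMMA * v_next

    v = np.asarray(v0(xs), dtype=float)
    history = [v.copy()]
    for _ in range(n_iter):
        v = q_values(v).min(axis=1)
        history.append(v.copy())
    # Greedy control law with respect to the final cost function.
    policy = us[q_values(v).argmin(axis=1)]
    return v, policy, history


xs = np.linspace(-2.0, 2.0, 101)                   # state grid
us = np.linspace(-0.99 * U_BAR, 0.99 * U_BAR, 81)  # controls strictly inside the bound
v, policy, history = generalized_value_iteration(lambda x: 2.0 * x**2, xs, us)
v_zero, _, history_zero = generalized_value_iteration(lambda x: np.zeros_like(x), xs, us)
```

Because the control grid lies strictly inside (-U_BAR, U_BAR), every control the sketch returns satisfies the constraint by construction. Starting from the zero function instead of `2 x²` gives a nondecreasing sequence of iterates in this tabular setting, which mirrors the kind of monotonicity property the summary mentions.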

MSC:

93C55 Discrete-time control/observation systems
93C10 Nonlinear systems in control theory
