2024 AIChE Annual Meeting
(578f) Lyapunov Neural ODE Control (L-NODEC) for Robust Policy Search in Nonlinear Systems
[2]. Solving continuous-time optimal control problems (OCPs) is generally challenging, especially when nonlinear dynamics and path constraints are present, since they involve infinitely many time-varying decision variables. Various optimization techniques for solving OCPs have been developed, including direct methods that discretize the time-varying functions and indirect methods that solve the necessary conditions of optimality; there is also growing interest in learning-based methods that parameterize control policies using neural networks (NNs) [3]. NN policies are particularly attractive due to their scalability to higher-dimensional problems, as well as their representation capacity owing to the universal approximation theorem [4].
We solve a continuous-time OCP with a NN control policy via neural ordinary differential equations (NODEs) [5], which replace the discrete nature of hidden layers with a parameterized ODE, yielding continuous-depth models. The perspective of treating function learning as a dynamical system offers significant advantages in time-series modeling (e.g., [6], [7]). Although NODEs have been used in systems with unknown dynamics to simultaneously learn and control the dynamics (e.g., [8], [9]), we take a different approach by leveraging known physics. Consequently, our NODE structure applies the NN representation only to the control policy, which is embedded into known differential equations describing the temporal evolution of the states [10]; this can be viewed as an instance of the universal differential equation framework [11].
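As a rough illustration of this physics-embedded structure (our own sketch, not the authors' implementation), the code below wraps a feedforward NN policy u = pi_theta(t, x) inside known dynamics dx/dt = f(x, u) and rolls the coupled system forward with a simple explicit Euler scheme. The double integrator dynamics, network sizes, horizon, and learning rate are illustrative placeholders.

```python
import torch
import torch.nn as nn

# Illustrative "known physics": a double integrator with state x = (position, velocity).
def known_dynamics(x, u):
    return torch.stack([x[1], u.squeeze()])

class PolicyNODE(nn.Module):
    """NN control policy embedded in known ODE dynamics (sketch, not the paper's code)."""

    def __init__(self, state_dim=2, hidden=32):
        super().__init__()
        self.policy = nn.Sequential(
            nn.Linear(state_dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x0, t_final=1.0, n_steps=100):
        """Roll out dx/dt = f(x, pi_theta(t, x)) with an explicit Euler scheme."""
        dt = t_final / n_steps
        x, traj = x0, [x0]
        for k in range(n_steps):
            t = torch.tensor([k * dt])
            u = self.policy(torch.cat([t, x]))    # only the policy is learned
            x = x + dt * known_dynamics(x, u)     # the known physics stays fixed
            traj.append(x)
        return torch.stack(traj)

# Mayer (terminal-cost) training: steer the state to a target equilibrium x_star.
model = PolicyNODE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x0 = torch.tensor([2.0, 0.0])
x_star = torch.tensor([0.0, 0.0])
for epoch in range(200):
    optimizer.zero_grad()
    trajectory = model(x0)
    terminal_cost = torch.sum((trajectory[-1] - x_star) ** 2)
    terminal_cost.backward()
    optimizer.step()
```

Because the dynamics are differentiable, the terminal cost can be backpropagated through the entire rollout to update only the policy parameters.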
In particular, we are interested in the Mayer problem, a terminal-cost continuous-time OCP, with the goal of steering the system to a desired equilibrium point [12]. We build upon the aforementioned “physics-embedded” NODE control structure by incorporating Lyapunov theory into the policy's loss during learning [13], hence the name Lyapunov NODE control (L-NODEC). This is achieved by defining the deviation of the system states from the desired equilibrium as an exponentially stable control Lyapunov function (ES-CLF). The policy is learnt such that the closed-loop dynamics satisfy the condition of exponential stability, which is achieved by quantifying the violation of the local invariance property [14]. To address constraints, we explicitly enforce the input constraints by appropriately parameterizing the policy via a sigmoid function in the output layer. Path constraints are formulated as nonlinear constraints and, in practice, enforced as soft constraints via quadratic penalty functions [15].
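As a concrete, purely illustrative reading of these ingredients, the sketch below takes V(x) = ||x - x*||^2 as the ES-CLF, penalizes pointwise violations of the exponential-stability condition dV/dt + kappa*V <= 0 along a rollout, bounds the input with a scaled sigmoid output layer, and adds a quadratic penalty for a path constraint g(x) <= 0. The decay rate kappa, penalty weight rho, constraint function g, and the finite-difference approximation of dV/dt are our own assumptions rather than the paper's exact formulation.

```python
import torch

def V(x, x_star):
    """Candidate ES-CLF: squared deviation from the desired equilibrium."""
    return torch.sum((x - x_star) ** 2)

def bounded_control(z, u_min=-1.0, u_max=1.0):
    """Sigmoid output layer enforcing the input constraint u_min <= u <= u_max."""
    return u_min + (u_max - u_min) * torch.sigmoid(z)

def lyapunov_loss(traj, x_star, dt, kappa=2.0):
    """Average violation of the exponential-stability condition dV/dt + kappa*V <= 0.

    dV/dt is approximated by finite differences along the rollout; the exact
    violation term used in L-NODEC may differ.
    """
    loss = 0.0
    for k in range(len(traj) - 1):
        v_now = V(traj[k], x_star)
        v_dot = (V(traj[k + 1], x_star) - v_now) / dt
        loss = loss + torch.relu(v_dot + kappa * v_now)   # local-invariance violation
    return loss / (len(traj) - 1)

def path_penalty(traj, g, rho=10.0):
    """Quadratic soft penalty for a path constraint g(x) <= 0 (g is a placeholder)."""
    return rho * sum(torch.relu(g(x)) ** 2 for x in traj) / len(traj)

# Composite training objective for one rollout `traj` of the policy NODE:
#   terminal Mayer cost + Lyapunov-violation term + path-constraint penalty
# loss = torch.sum((traj[-1] - x_star) ** 2) \
#        + lyapunov_loss(traj, x_star, dt) \
#        + path_penalty(traj, g)
```

Weighting the Lyapunov-violation term against the constraint penalty is precisely what gives rise to the stability-versus-constraint-satisfaction tradeoff discussed below.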
We guarantee that, for the unconstrained OCP, L-NODEC is exponentially stable and capable of converging to the equilibrium even without complete information on the terminal states. Furthermore, as a consequence of stability, a theoretical upper bound for adversarial robustness can be established with respect to uncertainty in the initial conditions.
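For intuition only (the precise statement and constants in L-NODEC may differ), a bound of this flavor follows directly from exponential decay of the chosen Lyapunov function: if the learned closed loop satisfies $\dot{V}(x) \le -\kappa V(x)$ with $V(x) = \|x - x^*\|^2$, then $V(x(T)) \le e^{-\kappa T} V(x(0))$, i.e., $\|x(T) - x^*\| \le e^{-\kappa T/2}\,\|x(0) - x^*\|$. Hence, for two initial conditions $x_0$ and $x_0 + \delta$, the triangle inequality gives
$$\|x(T;\, x_0 + \delta) - x(T;\, x_0)\| \;\le\; e^{-\kappa T/2}\big(\|x_0 + \delta - x^*\| + \|x_0 - x^*\|\big),$$
so the deviation in the final states shrinks exponentially with the horizon and grows at most linearly with the perturbation $\|\delta\|$.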
We demonstrate that L-NODEC outperforms NODEC on a benchmark continuous-time double integrator, where the policy recommends alternative trajectories that reach the desired terminal equilibrium state in lower inference time. Furthermore, its robustness to adversarial attacks confirms the theoretical upper bound on deviations in the final states due to variance in the initial conditions. In the case of a constrained OCP, we also demonstrate the systematic tradeoff between stability and constraint satisfaction, though constraint enforcement via penalty functions is still an active area of research [16].
[1] M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and Its Applications. Courier Corporation, 2007.
[2] F. L. Lewis, D. Vrabie, and V. L. Syrmos, Optimal Control. John Wiley & Sons, 2012.
[3] M. Hertneck, J. Köhler, S. Trimpe, and F. Allgöwer, “Learning an approximate model predictive controller with guarantees,” IEEE Control Systems Letters, vol. 2, no. 3, pp. 543–548, 2018. [Online]. Available: http://dx.doi.org/10.1109/LCSYS.2018.2843682
[4] A. R. Barron, “Universal approximation bounds for superpositions of a sigmoidal function,” IEEE Transactions on Information Theory, vol. 39, no. 3, pp. 930–945, 1993.
[5] R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, “Neural ordinary differential equations,” Advances in Neural Information Processing Systems, vol. 31, 2018.
[6] A. Rahman, J. Drgoňa, A. Tuor, and J. Strube, “Neural ordinary differential equations for nonlinear system identification,” in Proceedings of the American Control Conference, 2022, pp. 3979–3984.
[7] A. J. Linot, J. W. Burby, Q. Tang, P. Balaprakash, M. D. Graham, and R. Maulik, “Stabilized neural ordinary differential equations for long-time forecasting of dynamical systems,” Journal of Computational Physics, vol. 474, p. 111838, 2023.
[8] S. Bachhuber, I. Weygers, and T. Seel, “Neural ODEs for data-driven automatic self-design of finite-time output feedback control for unknown nonlinear dynamics,” IEEE Control Systems Letters, 2023.
[9] C. Chi, “NODEC: Neural ODE for optimal control of unknown dynamical systems,” arXiv preprint arXiv:2401.01836, 2024.
[10] I. O. Sandoval, P. Petsagkourakis, and E. A. del Rio-Chanona, “Neural ODEs as feedback policies for nonlinear optimal control,” IFAC-PapersOnLine, vol. 56, no. 2, pp. 4816–4821, 2023.
[11] C. Rackauckas, Y. Ma, J. Martensen, C. Warner, K. Zubov, R. Supekar, D. Skinner, A. Ramadhan, and A. Edelman, “Universal differential equations for scientific machine learning,” arXiv preprint arXiv:2001.04385, 2020.
[12] A. E. Bryson, Applied Optimal Control: Optimization, Estimation and Control. Routledge, 2018.
[13] I. D. J. Rodriguez, A. D. Ames, and Y. Yue, “LyaNet: A Lyapunov framework for training neural ODEs,” 2022.
[14] A. D. Ames, K. Galloway, K. Sreenath, and J. W. Grizzle, “Rapidly exponentially stabilizing control Lyapunov functions and hybrid zero dynamics,” IEEE Transactions on Automatic Control, vol. 59, no. 4, pp. 876–891, 2014.
[15] R. M. Freund, “Penalty and barrier methods for constrained optimization,” 2004.
[16] T. Antony and M. J. Grant, “Path constraint regularization in optimal control problems using saturation functions,” AIAA Atmospheric Flight Mechanics Conference, 2018.