2018 AIChE Annual Meeting
(393e) Distributed Approximate Dynamic Programming (dADP) for Data-Driven Optimal Control of Nonlinear Systems
Approximate dynamic programming (ADP) is a data-driven optimal control strategy [1, 2]: historical datasets are exploited to iteratively train the control policy and the value function toward the optimal solution that satisfies Bellman's principle of optimality. For systems with continuous (infinite) state spaces, the principle of optimality takes the form of the Hamilton-Jacobi-Bellman (HJB) equation. For nonlinear input-affine systems, the HJB equation can be transformed so that the model functions are replaced by information extracted from data; by choosing suitable basis functions for the optimal control policy and the value function, the policy and value iterations can then be solved approximately as a regression problem [3].
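To make the regression step concrete, the following is a minimal sketch (not the authors' code, and only one possible data-based formulation in the spirit of [3]): it assumes a value-function approximation V(x) ≈ wᵀφ(x) with a hypothetical quadratic basis φ, and a dataset of trajectory segments (x_t, x_{t+T}, c_t), where c_t is the stage cost integrated over the segment under the current policy, so that the Bellman relation wᵀ(φ(x_{t+T}) − φ(x_t)) = −c_t can be fitted by least squares.

```python
import numpy as np

def phi(x):
    """Hypothetical polynomial basis for a 2-state system (an assumption)."""
    x1, x2 = x
    return np.array([x1**2, x1 * x2, x2**2])

def fit_value_weights(segments):
    """Least-squares fit of w from the integral Bellman relation
       w^T (phi(x_{t+T}) - phi(x_t)) = -c_t  for each trajectory segment."""
    A = np.array([phi(x_next) - phi(x_now) for x_now, x_next, _ in segments])
    b = np.array([-c for _, _, c in segments])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

# Usage with synthetic placeholder data (stand-ins for measured segments):
rng = np.random.default_rng(0)
segments = [(rng.standard_normal(2), rng.standard_normal(2),
             float(rng.random())) for _ in range(50)]
print(fit_value_weights(segments))  # fitted value-function parameters w
```

No model functions appear in the fit; only trajectory data and the chosen basis enter the regression, which is what makes the approach data-driven.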
In this work, we propose a novel approach, different from that of [3], which formulates the HJB equation directly as a nonlinear regression problem, so that the approximate control policy and value function can be obtained directly. The framework is also extended to cases where input constraints are present. This formulation is suited to solving ADP in a big-data setting, where a centralized optimization exploiting all the data in the regression procedure is infeasible. Specifically, we employ the alternating direction method of multipliers (ADMM) [4], one of the most widely used distributed optimization algorithms, as well as its accelerated version [5], to regress the parameters of the optimal control policy and value function. We call the resulting framework distributed approximate dynamic programming (dADP), since it adaptively updates the parameters toward the optimum throughout the distributed optimization iterations, and we illustrate the method on a chemical reactor example.
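As an illustration of the distributed regression step, here is a minimal sketch of consensus ADMM in the generic form given in [4] (not the authors' exact algorithm, and shown for a linear least-squares surrogate of the nonlinear regression): each data block i holds its own slice (A_i, b_i) of the regression data, solves a local subproblem in parallel, and all blocks are driven to agree on one parameter vector z.

```python
import numpy as np

def consensus_admm(blocks, rho=1.0, iters=100):
    """Solve  min_w sum_i ||A_i w - b_i||^2  via consensus ADMM [4]."""
    n = blocks[0][0].shape[1]
    z = np.zeros(n)
    u = [np.zeros(n) for _ in blocks]            # scaled dual variables
    # Precompute each block's local normal-equation data.
    facs = [(A.T @ A + rho * np.eye(n), A.T @ b) for A, b in blocks]
    for _ in range(iters):
        # Local (parallelizable) least-squares updates.
        w = [np.linalg.solve(H, Atb + rho * (z - ui))
             for (H, Atb), ui in zip(facs, u)]
        # Global averaging (consensus) update.
        z = np.mean([wi + ui for wi, ui in zip(w, u)], axis=0)
        # Dual ascent on the consensus constraint w_i = z.
        u = [ui + wi - z for ui, wi in zip(u, w)]
    return z

# Usage with synthetic placeholder data split into 4 blocks:
rng = np.random.default_rng(1)
w_true = rng.standard_normal(3)
blocks = []
for _ in range(4):
    A = rng.standard_normal((25, 3))
    blocks.append((A, A @ w_true + 0.01 * rng.standard_normal(25)))
print(consensus_admm(blocks))  # approximately recovers w_true
```

The accelerated variant of [5] adds a Nesterov-type extrapolation step on the consensus and dual variables; it is omitted here for brevity.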
References
[1] Hou, Z. S., & Wang, Z. (2013). From model-based control to data-driven control: Survey, classification and perspective. Inf. Sci., 235, 3-35.
[2] Lee, J. H., & Wong, W. (2010). Approximate dynamic programming approach for process control. J. Process Control, 20(9), 1038-1048.
[3] Luo, B., et al. (2014). Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design. Automatica, 50(12), 3281-3290.
[4] Boyd, S., et al. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn., 3(1), 1-122.
[5] Goldstein, T., et al. (2014). Fast alternating direction optimization methods. SIAM J. Imaging Sci., 7(3), 1588-1623.