Reinforcement Learning (RL) is a machine learning technology in which a computer agent learns, through trial and error, the best way to accomplish a particular task [1]. The recent development of Deep Reinforcement Learning (DRL) technology, in which deep learning networks [2] are used to parameterize the RL agent's policy and value functions, has enabled superhuman performance for some tasks, most notably games such as chess and Go [3]. Researchers have devised DRL algorithms specifically for continuous control problems [4,5], and this has led many in the process control community to wonder what impact DRL technology will have on our industry. Two recent papers provide an introduction to DRL technology and an analysis of its appropriateness for industrial process control applications [6,7]. This presentation builds on these previous analysis efforts by proposing that an industrial process control technology must be:
- Intelligent
- Consistent
- Offset-free
- Nominally stable
- Flexible
DRL technology will be assessed with respect to each requirement, exposing existing deficiencies, useful research directions, and potential solutions. It will be shown that DRL technology is not likely to replace the currently used Proportional-Integral-Derivative (PID) or Model Predictive Control (MPC) algorithms; however, it may prove useful in managing control systems by helping to tune controllers [8,9,10], advising operators during upsets or transient operations [6], and managing plant-wide disturbances such as diurnal changes and weather events [6].
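As a rough illustration of the controller-tuning role mentioned above, the following minimal sketch tunes a PI controller by trial and error. It is not the method of [8,9,10]: a simple random-perturbation hill climb stands in for a DRL agent, and the first-order plant, its coefficients `a` and `b`, the actuator limits, and the function names `run_episode` and `tune_by_trial_and_error` are all assumptions made for illustration. The sketch also assumes that "offset-free" carries its usual meaning of zero steady-state error, which it contrasts with a proportional-only loop.

```python
import random


def run_episode(kp, ki, setpoint=1.0, steps=200, a=0.9, b=0.1):
    """Simulate one setpoint step on a hypothetical first-order plant
    y[k+1] = a*y[k] + b*u[k] under PI control.

    Returns (sum of squared errors, final plant output).
    """
    y = 0.0
    integral = 0.0
    cost = 0.0
    for _ in range(steps):
        error = setpoint - y
        integral += error
        u = kp * error + ki * integral
        u = max(-10.0, min(10.0, u))  # assumed actuator limits
        y = a * y + b * u
        cost += error ** 2
    return cost, y


def tune_by_trial_and_error(episodes=300, seed=0):
    """Hill-climb the PI gains: perturb the best gains found so far, run an
    episode, and keep the perturbation only if the cost improves.

    This is a crude stand-in for an RL agent, not a DRL algorithm.
    """
    rng = random.Random(seed)
    best = (0.5, 0.05)  # assumed initial gains (kp, ki)
    best_cost, _ = run_episode(*best)
    for _ in range(episodes):
        kp = max(0.0, best[0] + rng.gauss(0.0, 0.2))
        ki = max(0.0, best[1] + rng.gauss(0.0, 0.05))
        cost, _ = run_episode(kp, ki)
        if cost < best_cost:
            best, best_cost = (kp, ki), cost
    return best, best_cost


if __name__ == "__main__":
    # Proportional-only control settles at 1/3 on this plant, leaving an offset.
    _, y_p_only = run_episode(0.5, 0.0)
    print("P-only final output:", round(y_p_only, 4))

    gains, cost = tune_by_trial_and_error()
    _, y_tuned = run_episode(*gains)
    print("tuned gains:", gains, "cost:", cost, "final output:", y_tuned)
```

Because each trial is a full closed-loop episode, a search of this kind would in practice be run against a simulator or under tight safety constraints rather than on the plant itself.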
A final section will outline several promising research directions for DRL technology [11].
References
[1] RS Sutton, AG Barto, "Reinforcement Learning: An Introduction", The MIT Press, (2018).
[2] I Goodfellow, Y Bengio, A Courville, "Deep Learning", The MIT Press, (2016).
[3] D Silver, A Huang, CJ Maddison, A Guez, L Sifre, G Van Den Driessche, J Schrittwieser, I Antonoglou, V Panneershelvam, M Lanctot, et al., "Mastering the game of Go with deep neural networks and tree search", Nature 529, 484-489 (2016).
[4] TP Lillicrap, JJ Hunt, A Pritzel, N Heess, T Erez, Y Tassa, D Silver, D Wierstra, "Continuous control with deep reinforcement learning", arXiv:1509.02971 (2015).
[5] L Busoniu, T de Bruin, D Tolic, J Kober, I Palunko, "Reinforcement learning for control: Performance, stability, and deep approximators", Annual Reviews in Control, 46, 8-28, (2018).
[6] J Shin, TA Badgwell, KH Liu, JH Lee, "Reinforcement Learning – Overview of recent progress and implications for process control", Computers & Chemical Engineering, 127, 282-294 (2019).
[7] R Nian, J Liu, B Huang, "A review on reinforcement learning: Introduction and applications in industrial process control", Computers & Chemical Engineering, 139, 106886 (2020).
[8] T Badgwell, K Liu, N Subrahmanya, M Kovalski, "Adaptive PID controller tuning via deep reinforcement learning", US Patent 10,915,073, (2021).
[9] O Dogru, K Velswamy, F Ibrahim, Y Wu, AS Sundaramoorthy, B Huang, S Xu, M Nixon, N Bell, "Reinforcement learning approach to autonomous PID tuning", Computers & Chemical Engineering, 161, 107760 (2022).
[10] NP Lawrence, MG Forbes, PD Loewen, DG McClement, JU Backstrom, RB Gopaluni, "Deep Reinforcement Learning with Shallow Controllers: An Experimental Application to PID Tuning", arXiv:2111.07171v1 (2021).
[11] A Mesbah, KP Wabersich, AP Schoellig, MN Zeilinger, S Lucia, TA Badgwell, JP Paulson, "Fusion of Machine Learning and MPC under Uncertainty: What Advances Are on the Horizon?", American Control Conference, June 8-10, Atlanta, GA (2022).