2019 AIChE Annual Meeting

(443c) Symbolic Regression for the Automated Physical Model Identification in Reaction Engineering

Authors

Cao, L. - Presenter, Cambridge Centre for Advanced Research and Education in Singapore (CARES) Ltd
Neumann, P., Aachener Verfahrenstechnik - Process Systems Engineering
Russo, D., University of Cambridge
Vassiliadis, V. S., University of Cambridge
Lapkin, A. A., Cambridge Centre for Advanced Research and Education in Singapore Ltd

Liwei Cao (a,b), Pascal Neumann (a,c), Danilo Russo (a), Vassilios S. Vassiliadis (a), Alexei A. Lapkin (a,b,*)

a Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.

b Cambridge Centre for Advanced Research and Education in Singapore (CARES) Ltd., Campus for Research Excellence and Technological Enterprise (CREATE), CREATE Tower, 1 CREATE Way, Singapore, 138602

c Aachener Verfahrenstechnik - Process Systems Engineering, RWTH Aachen University, Aachen 52062, Germany

* Corresponding author; email: aal@cam.ac.uk

Abstract

Understanding a complex reaction system at a fundamental level is crucial because it reduces the time and resources required for process development and implementation at scale. Fundamental knowledge of a chemical system is developed under two distinct paradigms, starting either from experimental observations (data-driven modeling) or from mechanistic a priori knowledge (physical models). With the rise of automation and recent advances in data science, the two approaches are gradually merging, although model identification for complex multivariable systems remains challenging in practice. This work targets the identification of interpretable and generalizable physical models by automatable, data-driven methods that require no a priori knowledge. A revised mixed-integer nonlinear programming (MINLP) formulation of symbolic regression (SR) is proposed to identify physical models from noisy experimental data. Assessing model complexity and extrapolation capability enables the identification of interpretable and generalizable models. The method is demonstrated by successfully identifying a kinetic model of the 4-nitrophenyl acetate (PNPA) hydrolysis reaction.
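To illustrate the idea of selecting a model by both complexity and extrapolation capability, the following minimal Python sketch may help. It is not the MINLP formulation proposed in this work: it replaces the optimization over expression structures with exhaustive enumeration of a small, hand-written candidate library and a grid search over a single rate parameter. The candidate expressions, the node counts used as a complexity measure, the penalty weight, and the synthetic first-order decay data (rate constant 0.5, Gaussian noise) are all assumptions made for illustration and are not PNPA measurements.

```python
import math
import random

# Hypothetical candidate library (an assumption, not the authors' search space):
# (expression, model(t, k), complexity as a hand-counted number of tree nodes).
CANDIDATES = [
    ("k", lambda t, k: k, 1),
    ("1 - k*t", lambda t, k: 1.0 - k * t, 5),
    ("exp(-k*t)", lambda t, k: math.exp(-k * t), 5),
    ("1/(1 + k*t)", lambda t, k: 1.0 / (1.0 + k * t), 7),
]


def mse(model, k, data):
    """Mean squared error of model(t, k) against (t, y) pairs."""
    return sum((model(t, k) - y) ** 2 for t, y in data) / len(data)


def fit_k(model, data):
    """Fit the single parameter k by grid search over (0, 3]."""
    grid = [i / 1000.0 for i in range(1, 3001)]
    return min(grid, key=lambda k: mse(model, k, data))


def select_model(train, holdout, penalty=1e-3):
    """Fit each candidate on the training range, then rank candidates by
    extrapolation error on the held-out range plus a complexity penalty."""
    results = []
    for name, model, nodes in CANDIDATES:
        k = fit_k(model, train)
        score = mse(model, k, holdout) + penalty * nodes
        results.append((score, name, k))
    return min(results)  # lowest score wins


def make_data(times, rng, k_true=0.5, sigma=0.01):
    """Synthetic noisy first-order decay, normalized concentration c/c0."""
    return [(t, math.exp(-k_true * t) + rng.gauss(0.0, sigma)) for t in times]


rng = random.Random(0)  # fixed seed so the run is reproducible
train = make_data([0.2 * i for i in range(11)], rng)        # t = 0 .. 2
holdout = make_data([3.0 + 0.5 * i for i in range(7)], rng)  # t = 3 .. 6

best_score, best_name, best_k = select_model(train, holdout)
print(best_name, best_k)
```

Scoring on a time range outside the fitted one is what separates the candidates here: the rational and exponential forms fit the early data comparably well, but they diverge at later times. A real symbolic regression method would search over expression structures rather than a fixed list.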