2023 AIChE Annual Meeting

Predicting Vapor-Liquid Equilibria Using Physics-Inspired Machine Learning Models

Predicting vapor-liquid equilibrium (VLE) properties remains an ongoing challenge in modeling the complex systems used to produce desired products. Traditional methods of predicting VLE often need a significant amount of data to capture irregular molecular interactions, which constrains their predictive scope. In our method, we use Chemprop, a state-of-the-art open-source software package that implements the directed message-passing neural network (D-MPNN) architecture for fast and versatile prediction of complex chemical properties. We apply this D-MPNN architecture to predict VLE properties of binary mixtures. Our models take in the Simplified Molecular-Input Line-Entry System (SMILES) representation of each molecule along with the temperature (T), liquid composition (xi), and pure-component vapor pressures in order to predict the vapor-phase composition (y1 and y2) and pressure (P).

Three experimental models have been developed to demonstrate this approach. All models take in individual SMILES strings for components 1 and 2 of a mixture and process them in separate D-MPNNs, yielding an individual vector for each molecule. The temperature and compositions (T, x1, and x2) are appended to these molecule vectors before they are fed to the feed-forward network (FFN) that produces the target properties. The base model predicts the target properties directly as model outputs. The second and third models predict intermediate values used in traditional activity coefficient models, from which the target properties are then calculated. Specifically, the second model predicts activity coefficients (γ1 and γ2), which, combined with the input properties (T, xi, and the pure-component vapor pressures), can be used to calculate the target properties. Instead of predicting activity coefficients directly, the third model predicts the parameters of the Wohl activity coefficient model (A1, A2, and A3), which are subsequently used to calculate activity coefficients and predict more accurate VLE data. The three models were trained on approximately 28,000 data points, covering isobaric and isothermal systems, sourced from the Dortmund Databank explorer version and a high-quality benchmark paper by Jean-Noël Jaubert et al. With our models, we show how including traditional activity coefficient structures leads to improved representation of equilibrium behavior. These layered methodologies set the stage for a more robust understanding of VLE predictions via machine learning.
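
The step in which predicted activity coefficients are combined with the inputs to obtain y1, y2, and P can be illustrated with modified Raoult's law, y_i P = x_i γ_i P_i^sat. The abstract does not name the exact relation used, so the sketch below rests on assumptions: an ideal vapor phase, no Poynting correction, and activity coefficients that are already available (for example, as the output of a model). The function name and the numerical values are hypothetical and are not taken from the work itself.

```python
from typing import List, Sequence, Tuple


def bubble_point_from_gammas(
    x: Sequence[float],
    gamma: Sequence[float],
    p_sat: Sequence[float],
) -> Tuple[float, List[float]]:
    """Return (P, y) from modified Raoult's law (assumed relation).

    x      liquid mole fractions of each component
    gamma  activity coefficients of each component
    p_sat  pure-component vapor pressures at the system temperature
           (any pressure unit; P is returned in the same unit)
    """
    if not (len(x) == len(gamma) == len(p_sat)):
        raise ValueError("x, gamma, and p_sat must have the same length")

    # Partial pressure of each component: p_i = x_i * gamma_i * P_i^sat
    partial = [xi * gi * pi for xi, gi, pi in zip(x, gamma, p_sat)]

    # Total pressure is the sum of the partial pressures.
    total_pressure = sum(partial)
    if total_pressure <= 0.0:
        raise ValueError("total pressure must be positive")

    # Vapor mole fractions follow from y_i = p_i / P.
    y = [p / total_pressure for p in partial]
    return total_pressure, y


# Hypothetical binary mixture: values chosen only for illustration.
P, y = bubble_point_from_gammas(
    x=[0.3, 0.7], gamma=[2.0, 1.1], p_sat=[80.0, 40.0]
)
```

Setting both activity coefficients to 1 recovers ideal Raoult's law, which gives a quick sanity check on any predicted values.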