2024 AIChE Annual Meeting
(364af) Integrating Data-Driven and Knowledge-Based Methodologies: Designing a Framework for Chemical Process Modeling
Author
My interests include process modeling, process optimization, chemometrics, data science, hybrid modeling (e.g., physics-informed neural networks and artificial intelligence/machine learning), membrane separation, electrochemical processes, inverse design, and the synthesis, characterization, and testing of catalysts.
Bio:
The central theme of my research is the use of computational methods to address challenges in process and materials science. Recent work has centered on data-driven approaches, including molecular simulations, numerical modeling, and machine learning, to tackle several problems: (a) elucidating the fundamental mechanisms of ion transport in membranes, (b) optimizing molecular properties for specific applications in surfactant formulation, and (c) constructing robust multiscale models for electrochemical systems such as electrodialysis, electrodeionization, capacitive deionization, and supercapacitors. I have also applied this expertise during internships at Dow, where I contributed to developing data-driven models to expedite chemical formulations, and at Dhahran Techno Valley, where I used molecular simulations to investigate water adsorption on shale surfaces and to design novel polymers for high-temperature, high-salinity environments. Additionally, I have experience in catalysis, particularly in synthesizing and evaluating supported catalysts for petrochemical applications. I am currently seeking full-time opportunities starting in either Fall 2024 or Spring 2025.
Abstract
Developing accurate models has become crucial in various engineering domains due to limitations faced by traditional experimental methods, such as time, cost, and resource constraints. In sectors like the chemical industry, meeting demands for reduced energy consumption while maintaining high performance requires optimized processes and materials. Similarly, in experiment formulation, there's a growing need to tailor molecules with specific properties. Machine learning (ML) techniques offer promise in determining optimal attribute ranges, but they often necessitate extensive datasets. My doctoral research addresses this challenge by integrating data-driven and knowledge-based approaches to enhance chemical modeling in data-limited scenarios.
In the first project, I formulated a hybrid modeling framework that integrates compositional modeling and machine learning for electrodialysis (ED) and electrodeionization (EDI) units used in brackish water desalination. The approach uses a physics-based compositional model to characterize the behavior of the two devices and then generates synthetic data to train an ML-based, multi-output surrogate model. This surrogate is fine-tuned using limited experimental data. Its ability to reproduce experimental data indicates that it represents the system's behavior accurately. Feature importance analysis on the ML-based model reveals the interplay between the chosen ion-exchange resin wafer type and the ED/EDI operating parameters. Notably, the applied cell voltage is the dominant influence on separation efficiency and energy consumption in both devices. Multi-objective optimization then identifies experimental conditions that achieve 99% separation efficiency with energy consumption below 1 kWh/kg.
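The multi-objective step can be illustrated with a minimal Python sketch of Pareto filtering over the two objectives named above (maximize separation efficiency, minimize energy consumption). The candidate operating points and their values are invented for illustration, and `pareto_front` is a hypothetical helper, not the optimizer used in the study.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is (label, efficiency, energy_kwh_per_kg).
    Efficiency is maximized and energy is minimized.
    """
    front = []
    for c in candidates:
        dominated = any(
            o[1] >= c[1] and o[2] <= c[2] and (o[1] > c[1] or o[2] < c[2])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Invented surrogate predictions for five hypothetical operating points.
candidates = [
    ("A", 0.90, 0.6),
    ("B", 0.99, 0.9),
    ("C", 0.95, 1.2),
    ("D", 0.99, 1.4),
    ("E", 0.80, 0.5),
]

front = pareto_front(candidates)

# Keep only the trade-off points that meet the targets stated in the abstract:
# at least 99% separation efficiency and less than 1 kWh/kg.
meets_targets = [c for c in front if c[1] >= 0.99 and c[2] < 1.0]
```

In this toy set, points C and D are dropped because B is at least as efficient and uses less energy; the remaining points represent the efficiency/energy trade-off from which target-meeting conditions are selected.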
In the second project, I used transfer learning to address incomplete data, which hampers the development of successful data-driven models in chemical modeling. Previous efforts have employed strategies such as ML-based imputation or data augmentation, and they highlight the value of data augmentation for improving ML model performance. Nonetheless, mixing generated and real data can disrupt the correlations within the original dataset and, if not done carefully, lead to inaccurate models. To address this, I proposed a two-step data-driven modeling framework for critical parameters such as salt adsorption capacity (SAC) in capacitive deionization and specific capacitance (CAP) in supercapacitors. The approach, named ‘ImputeNet’, trains first on ML-imputed datasets and then on the clean dataset. Through data imputation and transfer learning, it is possible to develop a data-driven model with acceptable metrics whose predictions mirror experimental measurements. Using the model, optimization studies were conducted to analyze the Pareto solutions, yielding predicted values of 200 mg/g for SAC and 6000 F/g for CAP. These values surpass the maximum reported experimental data threefold for adsorption capacity and tenfold for specific capacitance. This early insight can be used at the initial stage of experimental work to rapidly identify conditions worthy of further investigation.
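The two-step idea (train on an imputed dataset, then continue training on the clean rows only) can be sketched as follows. This is a toy illustration under stated assumptions: simple mean imputation stands in for the ML-based imputation, a linear model trained by gradient descent stands in for the neural network, and all data values and function names are invented.

```python
def mean_impute(rows):
    """Replace None entries in each feature column with that column's observed mean."""
    n_features = len(rows[0])
    means = []
    for j in range(n_features):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_features)] for r in rows]


def predict(row, weights, bias):
    return sum(w * x for w, x in zip(weights, row)) + bias


def train(rows, targets, weights, bias, lr=0.05, epochs=500):
    """Batch gradient descent on squared error, warm-started from (weights, bias)."""
    weights = list(weights)
    n = len(rows)
    for _ in range(epochs):
        errors = [predict(r, weights, bias) - y for r, y in zip(rows, targets)]
        for j in range(len(weights)):
            weights[j] -= lr * 2.0 / n * sum(e * r[j] for e, r in zip(errors, rows))
        bias -= lr * 2.0 / n * sum(errors)
    return weights, bias


def mse(rows, targets, weights, bias):
    return sum((predict(r, weights, bias) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


# Invented dataset: two features per sample, second feature missing in some rows.
features = [[0.1, 0.9], [0.4, None], [0.5, 0.2], [0.8, None],
            [0.9, 0.7], [0.2, 0.4], [0.6, None], [0.3, 0.8]]
targets = [1.9, 1.0, 0.9, 2.0, 2.3, 1.0, 0.8, 1.9]

# Step 1: train on the full dataset after imputing the missing entries.
imputed = mean_impute(features)
w1, b1 = train(imputed, targets, [0.0, 0.0], 0.0)

# Step 2: transfer the learned parameters and fine-tune on complete rows only.
clean_idx = [i for i, r in enumerate(features) if None not in r]
clean_rows = [features[i] for i in clean_idx]
clean_targets = [targets[i] for i in clean_idx]
w2, b2 = train(clean_rows, clean_targets, w1, b1, epochs=200)

mse_before = mse(clean_rows, clean_targets, w1, b1)
mse_after = mse(clean_rows, clean_targets, w2, b2)
```

The point of the second stage is that the imputed rows contribute to the initial fit without being mixed into the final training set, so the clean data have the last word on the model parameters.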
In the third project, I will present an approach for tailored molecular design, which is of significant importance in chemical formulation. Applying generative design to surfactant formulation requires a large experimental dataset of surfactant molecular structures, which is often unavailable and limits its use. In this study, we propose a four-part methodology that combines generative modeling (specifically, a Variational Autoencoder obtained using transfer learning), predictive modeling (a Graph Neural Network), reinforcement learning, and molecular dynamics to generate tailored molecules with desired properties. Experimental data on the critical micelle concentration of surfactants were used to validate the framework. Saliency maps were computed for the generated surfactants to elucidate the features influencing the property values. Finally, molecular dynamics simulations were conducted to verify the stability of the generated molecules. The results demonstrate that the proposed framework effectively generates valid molecules within the predefined property threshold.
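One way the reinforcement learning component could connect the predictor to the generator is through a reward that favors valid molecules whose predicted property falls inside a target window. The sketch below is purely illustrative: the function name, the use of a log-scale critical micelle concentration, and the exponential decay outside the window are all assumptions, not the reward used in the study.

```python
import math


def property_reward(predicted_log_cmc, target_low, target_high, is_valid):
    """Hypothetical reward for a generated molecule.

    Returns 0 for invalid structures, 1 when the predicted property lies
    inside [target_low, target_high], and a value that decays exponentially
    with distance from the window otherwise.
    """
    if not is_valid:
        return 0.0
    if target_low <= predicted_log_cmc <= target_high:
        return 1.0
    distance = min(abs(predicted_log_cmc - target_low),
                   abs(predicted_log_cmc - target_high))
    return math.exp(-distance)


# Invented example values: a target window of [-4, -2] on a log scale.
inside = property_reward(-3.0, -4.0, -2.0, True)
outside = property_reward(-1.0, -4.0, -2.0, True)
invalid = property_reward(-3.0, -4.0, -2.0, False)
```

In a full pipeline, the predicted value would come from the property model and the validity flag from a structure check on the generated molecule; both are supplied directly here to keep the example self-contained.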
This holistic research framework, combining data-driven and knowledge-based approaches, seeks to advance the frontiers of chemical modeling. By integrating computational techniques, machine learning, and experimental validation, my PhD research aims to contribute to the development of sustainable solutions in the field of chemical engineering and materials science.
References
- Olayiwola et al. (2024). A Hybrid Modeling Framework for Electrochemical Separation Systems: Combining Compositional Modeling & Machine Learning. Manuscript in preparation.
- Olayiwola et al. (2024). Empowering Capacitive Devices: Harnessing Transfer Learning for Enhanced Data-Driven Optimization in Energy-Efficient Separation and Generation. ChemRxiv preprint.
- Nnadili et al. (2024). Surfactant-Specific AI-Driven Molecular Design: Integrating Generative Models, Predictive Modeling, and Reinforcement Learning for Tailored Surfactant Synthesis. Ind. Eng. Chem. Res., ASAP.
- Olayiwola et al. (2023). Determining ion activity coefficients in ion-exchange membranes with machine learning and molecular dynamics. Ind. Eng. Chem. Res. 62(24), 9533–9548.
- Olayiwola et al. (2023). Feature Embedding of Molecular Dynamics-Based Descriptors for Modeling Electrochemical Separation Processes. Computer Aided Chemical Engineering 52, 1451–1456.