The rapid growth of industrialization and global consumption has amplified environmental pressures, with the chemical sector emerging as one of the largest contributors to greenhouse gas emissions (1). Life Cycle Assessment (LCA) has become a widely used tool for measuring these environmental impacts, but traditional LCA methods depend heavily on detailed process data, which are often missing for new or developing chemicals. Machine Learning (ML) offers a way to fill this gap by predicting environmental impacts from chemical and process information.
This work presents a computational framework that combines ML models with a technology-based index to predict the environmental impacts of chemicals throughout their life cycle. For the cradle-to-gate phase, we developed Artificial Neural Network (ANN) and Extreme Gradient Boosting (XGBoost) models trained on a dataset of 350 solvents (2). The feature set incorporated both molecular descriptors and key thermodynamic properties, while the target metrics were Global Warming Potential (GWP), Resource Utilization (RU), Ecosystem Quality (EQ), and Human Health Impact (HHI). Data cleaning, removal of unusual values, and model fine-tuning improved the accuracy and reliability of the models. The results also showed that a smaller set of well-chosen features makes the method more practical for estimating the target metrics of new chemicals.
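The removal of unusual values mentioned above can be illustrated with a minimal sketch. It assumes an interquartile-range (IQR) rule, which is one common choice and not necessarily the rule used in this work; the function name `remove_outliers_iqr`, the record keys, and the GWP values are hypothetical and purely illustrative.

```python
import statistics

def remove_outliers_iqr(records, key, k=1.5):
    """Keep records whose `key` value lies within [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: an IQR rule is used here only for illustration.
    """
    values = [r[key] for r in records]
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [r for r in records if lower <= r[key] <= upper]

# Hypothetical solvent records; the GWP values are made up for illustration.
solvents = [
    {"name": "A", "gwp": 1.0},
    {"name": "B", "gwp": 2.0},
    {"name": "C", "gwp": 3.0},
    {"name": "D", "gwp": 4.0},
    {"name": "E", "gwp": 5.0},
    {"name": "F", "gwp": 100.0},  # unusually large value
]

cleaned = remove_outliers_iqr(solvents, key="gwp")
```

In this toy data set only record "F" falls outside the IQR fence, so `cleaned` retains the other five records.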
For the gate-to-gate and end-of-life phases, we introduced a framework to evaluate the environmental impact of chemical manufacturing. Impacts were estimated by normalizing process data obtained from the literature, including raw material usage and energy consumption for each separation technology, and fitting regression models to obtain scaling factors. This approach was inspired by the cost-estimation methodology commonly used in chemical plant design, and it allows systematic comparison of separation technologies based on their environmental burdens.
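The scaling idea can be sketched as follows, assuming the power-law form familiar from capacity-based cost estimation, impact = a * capacity^n, fitted by linear regression in log-log space. The functional form, the function names, and the synthetic data are assumptions for illustration; the regression models used in this work may differ.

```python
import math

def fit_power_law(capacities, impacts):
    """Fit impact = a * capacity**n by ordinary least squares on log-log data."""
    xs = [math.log(c) for c in capacities]
    ys = [math.log(i) for i in impacts]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    exponent = sxy / sxx                          # scaling exponent n
    prefactor = math.exp(y_mean - exponent * x_mean)  # coefficient a
    return prefactor, exponent

def scale_impact(ref_impact, ref_capacity, new_capacity, exponent):
    """Scale a reference impact to a new capacity using the fitted exponent."""
    return ref_impact * (new_capacity / ref_capacity) ** exponent

# Synthetic data generated from impact = 2.0 * capacity**0.6 (not real process data).
capacities = [10.0, 50.0, 100.0, 500.0]
impacts = [2.0 * c ** 0.6 for c in capacities]

prefactor, exponent = fit_power_law(capacities, impacts)
scaled = scale_impact(ref_impact=impacts[2], ref_capacity=100.0,
                      new_capacity=200.0, exponent=exponent)
```

Because the synthetic data follow the power law exactly, the fit recovers the generating coefficient and exponent; with literature data, the residuals would indicate how well a single exponent describes a given separation technology.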
This methodology was validated through a case study on solvent recovery from a mixed waste stream (21.3% dimethoxy ethane, 1.3% 1-ethoxy-1-methoxy ethane, 41.3% toluene, 35.3% water) (3). Eight recovery technologies were screened, and the optimal pathway (sedimentation, pervaporation, and ultrafiltration) was identified for recovering dimethoxy ethane (DME) from the stream. Using these data, the ML models and the regression models were tested for the cradle-to-gate and gate-to-gate phases, respectively. The predictive models and the scaling-based equations provided consistent estimates of process-level GWP. Integrating ML with the Technology Index framework provides a scalable, data-efficient approach to LCA that enables environmental performance estimation where data are limited and supports more informed decisions in sustainable chemical process development.
References:
(1) US EPA. GHGRP Reported Data. https://www.epa.gov/ghgreporting/ghgrp-reported-data (accessed 2025-09-18).
(2) Aboagye, E. A.; Lehr, A. L.; Shumaker, E.; Longo, J.; Pazik, J.; Hesketh, R. P.; Yenkie, K. M. Machine Learning Methods for the Forecasting of Environmental Impacts in Early-Stage Process Design; Breckenridge, Colorado, USA, 2024; pp 621–628. https://doi.org/10.69997/sct.141240.
(3) Aboagye, E. A.; Chea, J. D.; Lehr, A. L.; Stengel, J. P.; Heider, K. L.; Savelski, M. J.; Slater, C. S.; Yenkie, K. M. Systematic Design of Solvent Recovery Pathways: Integrating Economics and Environmental Metrics. ACS Sustain. Chem. Eng. 2022, 10 (33), 10879–10887. https://doi.org/10.1021/acssuschemeng.2c02497.