2025 Spring Meeting and 21st Global Congress on Process Safety

(91a) Machine Learning-assisted virtual technical data generation

Technical data on product grades are frequently requested by internal and external stakeholders at materials companies. Laboratory testing to generate these data is often time- and resource-intensive, leading to long wait times and increased overhead expenses. As a result, there are considerable gaps in the available experimentally measured technical data. To alleviate this problem and make technical data more accessible across the organization, we are implementing an initiative called Virtual Testing. While commonly requested data span mechanical, rheological, thermal, and electrical properties, we focused on stress-strain (SS) curves in the first phase of this project. An SS curve, a function describing the tensile stress at a given strain, is a key mechanical property and one of the most frequently requested. Measured at various temperatures and humidity conditions, SS curves are important inputs for finite-element analysis (FEA) simulations of molded components and are often part of customer product specifications. This makes them important for performing advanced simulations, responding to customer queries, and rapid prototyping in new product development. Measuring the SS curves for a grade in the laboratory over the full range of temperature and humidity conditions costs upwards of a few thousand dollars per product grade and may take weeks because of backlogs in testing laboratories.

We therefore built predictive models for SS curves that use cheap-to-measure properties, such as formulation and tensile strength at room temperature, as inputs. Of special interest is the fact that SS curves are multi-point data. Multi-point data, sometimes also referred to as functional data, pose a special challenge: the data points are not independent of each other, and together they must obey certain physics-based rules. Such data are common in the materials industry; examples include spectroscopic data and time series data. To ensure physical validity, our approach integrates machine learning with domain knowledge-based rules, i.e., hybrid machine learning.
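One way to picture the hybrid idea is sketched below. It is an illustration only, not the model described in this abstract: it assumes a saturating-exponential curve shape, a hypothetical two-feature input (filler fraction and room-temperature tensile strength), and synthetic training data. The learned part predicts two curve parameters from the cheap inputs, while the assumed functional form guarantees that every predicted curve starts at zero stress and rises monotonically.

```python
import numpy as np

def ss_curve(strain, modulus, strength):
    """Assumed curve shape: stress(0) = 0, initial slope = modulus,
    monotone rise towards `strength`. These rules hold by construction."""
    strain = np.asarray(strain, dtype=float)
    return strength * (1.0 - np.exp(-modulus * strain / strength))

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

# Synthetic training set (hypothetical features and made-up relationships).
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([
    rng.uniform(0.0, 0.4, n),     # filler fraction
    rng.uniform(40.0, 120.0, n),  # room-temperature tensile strength, MPa
])
log_modulus = np.log(2000.0) + 2.0 * X[:, 0] + rng.normal(0.0, 0.05, n)
log_strength = np.log(40.0) + 0.008 * X[:, 1] + rng.normal(0.0, 0.05, n)

# Learned part: regress the log of each parameter so predictions stay positive.
coef_modulus = fit_linear(X, log_modulus)
coef_strength = fit_linear(X, log_strength)

def predict_curve(features, strain):
    """Map cheap inputs to curve parameters, then to a rule-abiding curve."""
    features = np.atleast_2d(features)
    modulus = np.exp(predict_linear(coef_modulus, features))[0]
    strength = np.exp(predict_linear(coef_strength, features))[0]
    return ss_curve(strain, modulus, strength)

strain = np.linspace(0.0, 0.1, 51)
curve = predict_curve([0.3, 80.0], strain)
```

Because the rules are built into the curve shape rather than checked afterwards, no prediction can violate them, whatever the regression outputs. Real SS curves may need a richer form (for example, to capture yielding or necking), so the shape used here should be read as a placeholder.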

Our model has achieved an accuracy of over 90% and has been deployed internally as a web application. It is fully interpretable and explainable, allowing us to develop rational design rules. Furthermore, it supports statistical inference, enabling hypothesis testing and confidence interval estimation. Thus, the model predicts not only SS curves but also the precision of those predictions, which lets us quantify confidence in them and increases reliability. Combining domain knowledge and machine learning for SS curve prediction has allowed us to significantly increase technical data availability in the organization, reducing testing costs, speeding up responses to customer queries, and enabling faster new product development.
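As a minimal sketch of what confidence interval estimation can look like for an interpretable model, the example below fits an ordinary least-squares line and reports an approximate 95% interval for the mean prediction. The abstract does not state which model or interval method is used, so everything here is an assumption: the single input feature, the synthetic data, and the normal approximation (z = 1.96) used in place of a Student t quantile.

```python
import numpy as np

def ols_with_interval(X, y, x_new, z=1.96):
    """Fit y = b0 + b1*x by least squares and return the prediction at x_new
    with an approximate confidence interval for the mean response."""
    A = np.column_stack([np.ones(len(X)), X])
    a_new = np.concatenate([[1.0], np.atleast_1d(x_new)])

    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = A.shape[0] - A.shape[1]           # residual degrees of freedom
    sigma2 = resid @ resid / dof            # unbiased noise variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)   # covariance of the coefficients

    pred = a_new @ coef
    se = np.sqrt(a_new @ cov @ a_new)       # standard error of the mean prediction
    return pred, pred - z * se, pred + z * se

# Synthetic data: a hypothetical curve parameter versus one cheap input.
rng = np.random.default_rng(1)
x = rng.uniform(40.0, 120.0, 40)
y = 50.0 + 0.5 * x + rng.normal(0.0, 2.0, 40)

pred, lower, upper = ols_with_interval(x, y, 80.0)
```

The interval widens for inputs far from the training data, which is one way a deployed tool could flag predictions that deserve a laboratory measurement instead.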