2025 AIChE Annual Meeting

(681d) Data-Driven Optimization through a Combined ML/MINLP Approach

Authors

Nikolaos Sahinidis - Presenter, Georgia Institute of Technology
Kaiwen Ma, Carnegie Mellon University
Considerable progress has been made in optimization over the past few decades, leading to increasingly sophisticated algorithms for the solution of algebraic optimization problems. The solution of algebraic NLP and MINLP problems has progressed so much that we can now routinely solve problems with hundreds of thousands or even millions of variables and constraints. Yet, in the absence of algebraic models, optimization problems with even a few dozen variables remain very challenging to solve, especially to global optimality. The need for hyperparameter tuning in machine learning, process optimization over simulators, and experimental design of novel materials and processes motivates further development of algorithms for data-driven optimization that can be used when no algebraic model is available. This area of research, sometimes also referred to as derivative-free optimization, simulation optimization, or black-box optimization, has recently received increased attention [1, 2].

We have recently proposed the branch-and-model (BAM) algorithm for data-driven optimization. Similar to branch-and-bound for MINLP, BAM partitions the search space of a data-driven optimization problem. However, unlike branch-and-bound, BAM relies on a novel domain partitioning scheme that redraws subdomain boundaries in every iteration based on recently collected measurements [3]. BAM uses machine learning (ALAMO [4]) to build local surrogate models, which are solved to global optimality with global MINLP technology (BARON [5]) to identify new points in the domain where measurements should be taken.

The current work addresses a fundamental question in global data-driven optimization, namely when to perform local search. Local search is needed to obtain good solutions. However, repeated local searches in the same basin of a local optimum are wasteful, slow down the search, and increase the number of measurements required to obtain high-quality solutions. Excessive local searches can be detrimental, especially when measurement collection relies on expensive experimentation. We rely on machine learning (clustering) to develop a systematic means for grouping measurements into clusters that correspond to local basins of the objective function. We perform extensive computational experiments on over 500 publicly available test problems. The results demonstrate that the timing of local search is critical in data-driven optimization and that a specific type of clustering accurately reflects BAM’s progress. When BAM is equipped with this cluster learning algorithm, it correctly identifies unique local basins, avoids redundant local searches, and expedites convergence.
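The abstract does not specify which clustering method is used, so the following sketch illustrates the general idea with one plausible choice: nearest-better clustering, which links each sampled point to its nearest point with a strictly better objective value and cuts unusually long links. The connected components that remain approximate basins of local optima, so a local search can be skipped for any basin that already contains a searched point. All names and the cutoff rule are illustrative assumptions, not the authors' method.

```python
import numpy as np

def basin_clusters(X, y, phi=2.0):
    # Nearest-better clustering (illustrative choice, not necessarily the
    # clustering used in the paper): connect each point to its nearest
    # strictly better point; cut links longer than phi * mean link length.
    X, y = np.asarray(X, float), np.asarray(y, float)
    n = len(y)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    edges = []
    for i in range(n):
        better = np.flatnonzero(y < y[i])
        if better.size:  # the incumbent has no better point and roots a basin
            j = int(better[np.argmin(D[i, better])])
            edges.append((i, j, D[i, j]))
    if not edges:
        return [0] * n  # all values equal: a single trivial cluster

    cutoff = phi * np.mean([d for _, _, d in edges])
    parent = list(range(n))

    def find(i):  # union-find root with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, d in edges:
        if d <= cutoff:  # keep short links; long links separate basins
            parent[find(i)] = find(j)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

On a two-well test function, the sole long link, from the shallower local minimizer across the ridge to the deeper basin, gets cut, and the sampled points split into two clusters, one per basin, which is exactly the information needed to decide whether a new local search is warranted.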

References

  1. Conn, A. R., K. Scheinberg and L. N. Vicente, Introduction to derivative-free optimization, SIAM, Philadelphia, 2009.
  2. Rios, L. M. and N. V. Sahinidis, Derivative-free optimization: A review of algorithms and comparison of software implementations, Journal of Global Optimization, 56, 1247-1293, 2013.
  3. Ma, K., L. M. Rios, A. Bhosekar, N. V. Sahinidis and S. Rajagopalan, Branch-and-Model: A derivative-free global optimization algorithm, Computational Optimization and Applications, 85, 337-367, 2023.
  4. Wilson, Z. T. and N. V. Sahinidis, The ALAMO approach to machine learning, Computers & Chemical Engineering, 106, 785-795, 2017.
  5. Zhang, Y. and N. V. Sahinidis, Solving continuous and discrete nonlinear programs with BARON, Computational Optimization and Applications, 2024. https://doi.org/10.1007/s10589-024-00633-0