2025 AIChE Annual Meeting

(260c) Numerical Studies of Decomposition Algorithms for the Global Solution of Two-Stage Stochastic MINLPs

Authors

Joseph Scott, Clemson University
This talk will present results from a detailed numerical study designed to identify the key performance bottlenecks in existing decomposition algorithms for the global solution of nonconvex two-stage stochastic mixed-integer nonlinear programs (SMINLPs). Stochastic MINLPs are a widely used framework for decision-making under uncertainty in domains such as energy systems, supply chains, and finance. Decisions are made in two stages: the first stage occurs before the uncertainty is revealed, and the second stage (recourse) follows after the outcomes are known. SMINLPs are nonconvex and often very large, which makes them difficult to solve. The full-space method, which applies branch-and-bound (B&B) to all variables jointly, scales poorly: its complexity grows exponentially with the number of scenarios.
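In generic notation (the symbols here are illustrative, not taken from any specific reference), a two-stage SMINLP can be sketched as

```latex
\min_{x \in X} \; f(x) + \sum_{s=1}^{S} p_s \, Q_s(x),
\qquad
Q_s(x) \;=\; \min_{y_s \in Y_s} \bigl\{\, g_s(x, y_s) \;:\; h_s(x, y_s) \le 0 \,\bigr\},
```

where x collects the (possibly mixed-integer) first-stage decisions, y_s the recourse decisions for scenario s with probability p_s, and Q_s is the optimal value function of the scenario-s recourse problem. Nonconvexity of f, g_s, or h_s is what necessitates a global optimization approach.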

To address this challenge, decomposition strategies have been developed that project the problem into the first-stage space and execute a B&B search in that space only. The objective of this projected problem is defined implicitly in terms of the optimal value functions of the recourse problems in each scenario, and the lower bounds on these functions required for B&B are obtained through the global solution of appropriate subproblems for each scenario. Two prominent methods following this approach are those proposed by Cao and Zavala (CZ) and by Li and Grossmann (LG) [1,2]. By branching on the first-stage variables only, these methods achieve linear scaling with respect to the number of scenarios, yet still provide a guarantee of global optimality.
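With first-stage cost f, scenario probabilities p_s (summing to one), recourse costs g_s, and recourse feasible sets Y_s (illustrative notation, not that of [1,2]), a scenario-decomposable lower bound of the kind used at a node \bar{X} of the first-stage B&B tree can be sketched as

```latex
\mathrm{LB}(\bar{X})
\;=\;
\sum_{s=1}^{S} \; \min_{x \in \bar{X},\; y_s \in Y_s} \; p_s \bigl[\, f(x) + g_s(x, y_s) \,\bigr]
\;\le\;
\min_{x \in \bar{X}} \Bigl[\, f(x) + \sum_{s=1}^{S} p_s \, Q_s(x) \,\Bigr].
```

The inequality holds because a sum of minima never exceeds the minimum of the sum; equivalently, the non-anticipativity requirement tying x together across scenarios is relaxed within the node. Each term involves only one scenario, so the bound is computed by S independent global subproblems, which is the source of the linear scaling in the number of scenarios.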

Despite these favorable theoretical properties, the CZ and LG methods both have serious practical limitations. Specifically, while both methods have solved problems with impressive numbers of scenarios, successful applications have been limited to problems with relatively few first-stage decisions. Further investigation shows that, for most successful applications, the optimality gap was closed in the root node, or with very little branching, due to cuts or bounds-tightening techniques that were particularly effective for that problem. In contrast, for problems that did not terminate early, both methods often converged very slowly or failed to converge within the allotted time limit. Based on these observations, it has been hypothesized that the performance of the CZ and LG methods is degraded by the cluster problem, which refers to the exploration of a very large number of small B&B nodes in the immediate vicinity of the global solution. Prior studies have shown that the cluster problem, in turn, stems from the use of a weak lower bounding procedure. More specifically, it has been linked to the Hausdorff convergence order of the lower bounding procedure [3].
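As a toy illustration of this link (not the CZ or LG methods themselves), consider B&B on f(x) = x^2 over [-1, 1], whose global minimum lies at a smooth unconstrained point. If the lower bound on a node of width w underestimates the exact minimum by tau * w^beta, i.e., an idealized scheme with Hausdorff convergence order beta, then simply counting the nodes explored before convergence makes the cluster problem visible:

```python
# Toy model of the cluster problem: B&B on f(x) = x^2 over [-1, 1].
# The lower bound on a node [a, b] is modeled as (exact min on node)
# minus tau * width^beta, mimicking a bounding scheme with Hausdorff
# convergence order beta. All names and parameters are illustrative.

def bb_node_count(a, b, beta, tau=1.0, ubd=0.0, eps=1e-4):
    """Count the B&B nodes explored before every node is fathomed."""
    width = b - a
    # Exact minimum of x^2 on [a, b] (0 if the node contains the minimizer).
    exact_min = 0.0 if a <= 0.0 <= b else min(a * a, b * b)
    lower_bound = exact_min - tau * width ** beta
    if lower_bound >= ubd - eps:
        return 1  # node fathomed
    mid = 0.5 * (a + b)  # bisect and recurse on both children
    return (1 + bb_node_count(a, mid, beta, tau, ubd, eps)
              + bb_node_count(mid, b, beta, tau, ubd, eps))

first_order = bb_node_count(-1.0, 1.0, beta=1)   # order-1 bounds
second_order = bb_node_count(-1.0, 1.0, beta=2)  # order-2 bounds
print(first_order, second_order)
```

With first-order bounds, nodes near the minimizer must shrink to width roughly eps before they can be fathomed, whereas second-order bounds fathom them already at width sqrt(eps), so the first count is dramatically larger: a cluster of tiny nodes accumulates around the solution.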

Robertson et al. previously investigated this hypothesis by formally analyzing the Hausdorff convergence orders of the lower bounding procedures used by the CZ and LG methods [4]. That analysis resulted in three key conclusions: (i) Both methods can exhibit less-than-first-order convergence when the optimal value functions are non-Lipschitz. Recall that the value functions are defined implicitly by the optimal solutions of the recourse problems and are only guaranteed to be lower semicontinuous under standard assumptions, so this case is possible. At such a low order, clustering cannot be ruled out and is expected to occur. (ii) Both methods achieve first-order convergence when the value functions are Lipschitz continuous. With first-order convergence, clustering can be ruled out for global solutions that lie on a constraint boundary or a non-smooth point of the objective, but this requires that the objective grows at least linearly in all feasible directions, which is not guaranteed. Moreover, clustering cannot be ruled out at smooth unconstrained minimizers with only first-order convergence. (iii) The LG method attains second-order convergence when the value functions are smooth. This is the same convergence order achieved by all modern relaxation techniques used for B&B without decomposition. Clustering can be ruled out for both constrained and unconstrained minimizers in this case.
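For concreteness, the linear growth condition in (ii) can be stated (again in generic, illustrative notation, with f now denoting the projected objective) as: there exists a constant c > 0 such that

```latex
f(x) \;\ge\; f(x^{*}) + c \,\lVert x - x^{*} \rVert
\quad \text{for all feasible } x \text{ near the global minimizer } x^{*}.
```

When f instead grows only quadratically along some feasible direction, nodes along that direction cannot be fathomed by a first-order bound until they are much smaller, and a cluster of unfathomed nodes accumulates around x*.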

Despite these conclusions, the analysis of Robertson et al. stopped short of answering the question of primary concern for new method developers: is low convergence order really the primary cause of the observed performance limitations of CZ and LG? Several more specific questions immediately arise: (i) Which of the possible convergence orders (less than one, one, or two) most typically occurs in practice? (ii) If first-order convergence is the norm, do solutions typically lie on constraint boundaries or nonsmooth points, where first order might suffice, or at unconstrained smooth points, where it does not? (iii) If solutions typically lie on constraint boundaries or nonsmooth points, does the objective often satisfy the linear growth condition needed to avoid clustering?

To begin answering these questions, we implemented and tested both methods on a curated library of seven practical SMINLP problems drawn from the literature and our own research. All problems were modeled in Pyomo and solved on Georgia Tech's PACE computing cluster. In three of the seven test problems, both CZ and LG failed to terminate within an eight-hour time limit, while the full-space method terminated in under one hour. Thus, this test set clearly demonstrates the computational limitations we aim to investigate. Regarding the specific questions outlined above, our results can be summarized as follows: (i) Both CZ and LG exhibit first-order convergence for all problems. The theoretical possibility of less-than-first-order convergence did not occur and is likely very rare. The possibility of second-order convergence for LG was also not observed, primarily because no test problem had smooth value functions at the minimizer. However, we will present evidence that second-order convergence is also highly unlikely for smooth problems: the Lagrangian dual multipliers used by LG must be optimal to achieve second-order convergence in theory, and obtaining optimal multipliers in practice is problematic. We conclude that both CZ and LG have first-order convergence for all practical purposes. (ii) For all seven problems, the global solution lies on a constraint boundary in the first-stage space. Thus, first-order convergence could be sufficient to avoid clustering if the objective function grows at least linearly in all feasible directions. (iii) In one of the problems with the worst empirical performance, we found feasible directions in which the objective grows only quadratically, not linearly. Thus, clustering cannot be ruled out theoretically, and it is clearly observed empirically. Five of the remaining six problems also appear to exhibit clustering, but a complete study of their objectives along all feasible directions is still underway. We suspect that many of these will also show sub-linear growth in some feasible directions, thereby explaining the observed clustering. If so, the development of methods with second-order convergence will likely be necessary to achieve satisfactory performance in practice. The talk will conclude with a discussion of possible paths forward given these new observations.

References:

1. Cao, Y., and Zavala, V. M. A scalable global optimization algorithm for stochastic nonlinear programs. Journal of Global Optimization 75, 2 (Apr 2019), 393–416.

2. Li, C., and Grossmann, I. E. A generalized Benders decomposition-based branch and cut algorithm for two-stage stochastic programs with nonconvex constraints and mixed-binary first and second stage variables. Journal of Global Optimization 75, 2 (Aug 2019), 247–272.

3. Kannan, R., and Barton, P. I. The cluster problem in constrained global optimization. Journal of Global Optimization 69, 3 (Nov 2017), 629–676.

4. Robertson, D., Cheng, P., and Scott, J. On the convergence order of value function relaxations used in decomposition-based global optimization of nonconvex stochastic programs. Journal of Global Optimization (Dec 2024), 1–42.