2025 AIChE Annual Meeting

(593a) ExaModels: A Nonlinear Optimization Modeling Platform for GPUs

Author

Sungho Shin - Presenter, University of Wisconsin-Madison
Recently, there has been a surge of interest in accelerating classical mathematical optimization solution procedures by leveraging GPUs [1–7]. In particular, recent advancements in algorithms and software have enabled the creation of fully GPU-resident second-order solvers for nonlinear optimization [4,6,8–10]. Key algorithmic developments include condensed-space interior-point methods and lifted/hybrid KKT system strategies [4,6]. cuDSS, a sparse linear solver tailored for GPUs, has enabled the solution of the sparse positive definite condensed KKT systems that these methods produce. Finally, ExaModels, an algebraic modeling and automatic differentiation platform, and MadNLP, a GPU-accelerated nonlinear optimization solver, have been developed to provide complete GPU-resident optimization capabilities [6]. Together, these capabilities have yielded significant speed-ups for various large-scale nonlinear optimization problems; for example, the largest AC optimal power flow instance demonstrated more than a tenfold speedup [4,6].
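The condensed-space idea can be sketched as follows (the notation here is ours, a simplified illustration in the spirit of [4,6], not reproduced from those works): an interior-point step requires solving an indefinite augmented KKT system, and eliminating the dual block leaves a positive definite condensed system amenable to sparse Cholesky factorization on the GPU.

```latex
% W: Lagrangian Hessian, J: constraint Jacobian,
% \Sigma_x, \Sigma_s: diagonal barrier/scaling terms (simplified, illustrative notation)
\begin{aligned}
\begin{bmatrix} W + \Sigma_x & J^\top \\ J & -\Sigma_s^{-1} \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
&= -\begin{bmatrix} r_x \\ r_y \end{bmatrix}
\\
\Longrightarrow\quad
\left( W + \Sigma_x + J^\top \Sigma_s J \right)\Delta x
&= -\left( r_x + J^\top \Sigma_s r_y \right),
\qquad \Delta y = \Sigma_s \left( J\,\Delta x + r_y \right).
\end{aligned}
```

The condensed matrix on the left is positive definite (under standard regularization assumptions), which is what allows a GPU sparse direct solver such as cuDSS to handle it efficiently.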

The computation procedure for nonlinear programming can be segmented into several sub-procedures: derivative evaluation, KKT system resolution, and internal computation within the nonlinear optimization solver. The first two components account for the majority of computational time in large-scale nonlinear programming instances. Derivative evaluation functionality is typically provided by algebraic modeling systems such as the AMPL solver library [11] (the automatic differentiation (AD) backend of Pyomo [12,13]), CasADi [14], JuMP [15–17], Gravity [18], and PyOptInterface [19,20].

ExaModels is an algebraic modeling system that enables efficient derivative evaluations across various hardware platforms, including CPUs (both single-threaded and multi-threaded) and GPUs (from NVIDIA, AMD, and Intel). A distinguishing design feature of ExaModels is a syntax and data structure designed to capture the repeated patterns within the objective and constraint functions that are common to many large-scale nonlinear optimization problems. This design allows the derivative evaluation procedures to be compiled into a small number of GPU-executable kernels. Through just-in-time compilation via the Julia language [21] and KernelAbstractions.jl [22], ExaModels generates these derivative evaluation kernels on the fly, tailored to the target hardware (CPU or accelerator), while providing users with a seamless interface that requires no GPU-specific coding.
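The repeated-pattern syntax can be sketched as follows. This is a minimal, hedged illustration based on the generator-style ExaModels.jl API (the specific problem is the Luksan–Vlcek test problem commonly used in its documentation; exact keyword names should be checked against the current ExaModels.jl release):

```julia
using ExaModels

# Build a model whose objective and constraints each consist of ONE
# algebraic pattern repeated many times; ExaModels compiles one
# derivative-evaluation kernel per pattern.
N = 10_000
core = ExaCore()   # CPU by default; a GPU backend can be supplied instead

x = variable(core, N; start = (mod(i, 2) == 1 ? -1.2 : 1.0 for i = 1:N))

# One objective pattern, instantiated N-1 times via a Julia generator
objective(core, 100 * (x[i-1]^2 - x[i])^2 + (x[i-1] - 1)^2 for i = 2:N)

# One constraint pattern, instantiated N-2 times
constraint(
    core,
    3x[i+1]^3 + 2x[i+2] - 5 +
    sin(x[i+1] - x[i+2]) * sin(x[i+1] + x[i+2]) +
    4x[i+1] - x[i] * exp(x[i] - x[i+1]) - 3 for i = 1:N-2,
)

model = ExaModel(core)  # NLPModels-compatible; solvable with, e.g., MadNLP
```

To target an NVIDIA GPU, one would construct the core with a GPU backend (e.g., `ExaCore(; backend = CUDABackend())` with CUDA.jl loaded, if we recall the keyword correctly); the modeling code itself is unchanged, which is the hardware-agnostic interface described above.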

We present numerical results highlighting the performance and flexibility of ExaModels across a range of benchmark instances, including the pglib-opf AC optimal power flow cases and the COPS benchmark instances. The findings indicate that ExaModels significantly outperforms existing implementations, exhibiting superior speed relative to JuMP and the AMPL solver library when executed on the same CPU hardware. When executed on GPUs, ExaModels often achieves more than two orders of magnitude speedup compared to its performance on CPUs. Moreover, ExaModels operates efficiently on both single-threaded and multi-threaded CPUs, as well as all major GPU platforms (NVIDIA, AMD, and Intel). These results underscore ExaModels as a powerful tool for accelerating nonlinear optimization solution procedures. With ExaModels, the time required for derivative evaluations becomes negligible compared to the total solution time, leaving KKT system resolution as the primary bottleneck. We will also discuss future directions for ExaModels, including its potential use as a derivative evaluation engine for other optimization modeling platforms and the ongoing development of modeling libraries within the ExaModels ecosystem.

[1] Y. Cao, A. Seth, and C. D. Laird, An augmented Lagrangian interior-point approach for large-scale NLP problems on graphics processing units, Computers & Chemical Engineering 85, 76 (2016).
[2] M. Schubiger, G. Banjac, and J. Lygeros, GPU acceleration of ADMM for large-scale quadratic programming, Journal of Parallel and Distributed Computing 144, 55 (2020).
[3] H. Lu, J. Yang, H. Hu, Q. Huangfu, J. Liu, T. Liu, Y. Ye, C. Zhang, and D. Ge, cuPDLP-C: A Strengthened Implementation of cuPDLP for Linear Programming by C Language, https://doi.org/10.48550/arXiv.2312.14832.
[4] F. Pacaud, S. Shin, M. Schanen, D. A. Maldonado, and M. Anitescu, Accelerating Condensed Interior-Point Methods on SIMD/GPU Architectures, J. Optim. Theory Appl. (2023).
[5] E. Adabag, M. Atal, W. Gerard, and B. Plancher, MPCGPU: Real-Time Nonlinear Model Predictive Control Through Preconditioned Conjugate Gradient on the GPU, https://doi.org/10.48550/arXiv.2309.08079.
[6] S. Shin, M. Anitescu, and F. Pacaud, Accelerating optimal power flow with GPUs: SIMD abstraction of nonlinear programs and condensed-space interior-point methods, Electric Power Systems Research 236, 110651 (2024).
[7] F. Pacaud, M. Schanen, S. Shin, D. A. Maldonado, and M. Anitescu, Parallel interior-point solver for block-structured nonlinear programs on SIMD/GPU architectures, Optimization Methods and Software 39, 874 (2024).
[8] F. Pacaud, M. Schanen, S. Shin, D. A. Maldonado, and M. Anitescu, Parallel Interior-Point Solver for Block-Structured Nonlinear Programs on SIMD/GPU Architectures, https://doi.org/10.48550/arXiv.2301.04869.
[9] F. Pacaud and S. Shin, GPU-accelerated Nonlinear Model Predictive Control with ExaModels and MadNLP, https://doi.org/10.48550/arXiv.2403.15913.
[10] D. Cole, S. Shin, F. Pacaud, V. M. Zavala, and M. Anitescu, Exploiting GPU/SIMD Architectures for Solving Linear-Quadratic MPC Problems, in 2023 American Control Conference (ACC) (2023), pp. 3995–4000.
[11] R. Fourer, D. M. Gay, and B. W. Kernighan, A Modeling Language for Mathematical Programming, Management Science 36, 519 (1990).
[12] W. E. Hart, C. D. Laird, J.-P. Watson, D. L. Woodruff, G. A. Hackebeil, B. L. Nicholson, and J. D. Siirola, Pyomo — Optimization Modeling in Python, Vol. 67 (Springer International Publishing, Cham, 2017).
[13] W. E. Hart, J.-P. Watson, and D. L. Woodruff, Pyomo: Modeling and solving mathematical programs in Python, Math. Prog. Comp. 3, 219 (2011).
[14] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl, CasADi: A software framework for nonlinear optimization and optimal control, Math. Prog. Comp. 11, 1 (2019).
[15] M. Lubin, O. Dowson, J. D. Garcia, J. Huchette, B. Legat, and J. P. Vielma, JuMP 1.0: Recent improvements to a modeling language for mathematical optimization, Math. Prog. Comp. 15, 581 (2023).
[16] I. Dunning, J. Huchette, and M. Lubin, JuMP: A Modeling Language for Mathematical Optimization, SIAM Rev. 59, 295 (2017).
[17] B. Legat, O. Dowson, J. D. Garcia, and M. Lubin, MathOptInterface: A Data Structure for Mathematical Optimization Problems, INFORMS Journal on Computing 34, 672 (2022).
[19] Y. Yang, C. Lin, L. Xu, X. Yang, W. Wu, and B. Wang, Accelerating Optimal Power Flow With Structure-Aware Automatic Differentiation and Code Generation, IEEE Transactions on Power Systems 40, 1172 (2025).
[21] J. Bezanson, A. Edelman, S. Karpinski, and V. B. Shah, Julia: A Fresh Approach to Numerical Computing, SIAM Rev. 59, 65 (2017).
[22] V. Churavy, KernelAbstractions.jl (2025).