Recently, there has been a surge of interest in accelerating classical mathematical optimization solution procedures by leveraging GPUs
[1–7]. In particular, recent advancements in algorithms and software have enabled the creation of a fully GPU-resident second-order solver for nonlinear optimization
[4,6,8–10]. Key algorithmic developments include condensed-space interior point methods and lifted/hybrid KKT system strategies
[4,6]. cuDSS, a sparse direct linear solver tailored for GPUs, has enabled the efficient solution of the sparse positive definite condensed KKT systems. Finally, ExaModels, an algebraic modeling and automatic differentiation platform, together with MadNLP, a GPU-accelerated nonlinear optimization solver, have been developed to provide complete GPU-resident optimization capabilities
[6]. These combined capabilities have yielded significant speed-ups for various large-scale nonlinear optimization problems; for example, the largest AC optimal power flow instance has been solved more than ten times faster
[4,6].
The solution procedure for nonlinear programming can be divided into several sub-procedures: derivative evaluation, solution of the KKT systems, and the internal computations of the nonlinear optimization solver. The first two components account for the majority of the computational time in large-scale nonlinear programming instances. Derivative evaluation functionality is typically provided by algebraic modeling systems such as the AMPL solver library [11] (the AD backend of Pyomo [12,13]), CasADi [14], JuMP [15–17], Gravity [18], and PyOptInterface [19,20].
ExaModels is an algebraic modeling system that enables efficient derivative evaluations across various hardware platforms, including CPUs (both single-threaded and multi-threaded) and GPUs (from NVIDIA, AMD, and Intel). A novel design feature of ExaModels is its syntax and data structure, which are designed to capture the repeated patterns within the objective and constraint functions that are common to many large-scale nonlinear optimization problems. This design allows the derivative evaluation procedures to be compiled into a small number of GPU-executable kernels. Through just-in-time compilation via the Julia Language [21] and KernelAbstractions.jl [22], ExaModels generates these derivative evaluation kernels on the fly, tailored to the target hardware (CPU or accelerator), while providing users with a seamless interface that requires no GPU-specific coding.
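To illustrate, the pattern-capturing syntax can be sketched as follows. This is a minimal example modeled on the ExaModels.jl quickstart (the Luksan–Vlcek problem); exact keyword arguments and backend names may differ across versions, and the GPU backend line is an assumption shown only in a comment.

```julia
using ExaModels

N = 1000
c = ExaCore()  # for GPU execution one would pass a device backend,
               # e.g. ExaCore(; backend = CUDABackend())  (assumed syntax)

# Variables with a repeated initialization pattern, expressed as a generator.
x = variable(c, N; start = (mod(i, 2) == 1 ? -1.2 : 1.0 for i = 1:N))

# Objective and constraints are given as generators over an index set;
# each generator encodes one repeated pattern, which ExaModels compiles
# into a single derivative-evaluation kernel for the target device.
objective(c, 100 * (x[i-1]^2 - x[i])^2 + (x[i-1] - 1)^2 for i = 2:N)
constraint(c, 3x[i+1]^3 + 2x[i+2] - 5 for i = 1:N-2)

m = ExaModel(c)  # an NLPModels-compatible model ready for a solver such as MadNLP
```

The key design point is that the model is not stored as a general expression graph: each generator contributes one parameterized expression template plus its index data, so the number of compiled kernels scales with the number of distinct patterns rather than with the problem size.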
We present numerical results highlighting the performance and flexibility of ExaModels across a range of benchmark instances, including the pglib-opf AC optimal power flow cases and the COPS benchmark instances. The findings indicate that ExaModels significantly outperforms existing implementations, exhibiting superior speed relative to JuMP and the AMPL solver library when executed on the same CPU hardware. When executed on GPUs, ExaModels often achieves more than two orders of magnitude speed-up compared to its performance on CPUs. Moreover, ExaModels operates efficiently on both single-threaded and multi-threaded CPUs, as well as on all major GPU platforms (NVIDIA, AMD, and Intel). These results underscore ExaModels as a potent tool for enhancing nonlinear optimization solution procedures. With ExaModels, the time required for derivative evaluations becomes negligible compared to the total solution time, leaving the solution of the KKT system as the primary bottleneck. We will also discuss future directions for ExaModels, including its potential use as a derivative evaluation engine for other optimization modeling platforms and the development of modeling libraries within the ExaModels ecosystem.
[3] H. Lu, J. Yang, H. Hu, Q. Huangfu, J. Liu, T. Liu, Y. Ye, C. Zhang, and D. Ge,
cuPDLP-C: A Strengthened Implementation of cuPDLP for Linear Programming by C Language,
https://doi.org/10.48550/arXiv.2312.14832.
[8] F. Pacaud, M. Schanen, S. Shin, D. A. Maldonado, and M. Anitescu,
Parallel Interior-Point Solver for Block-Structured Nonlinear Programs on SIMD/GPU Architectures,
https://doi.org/10.48550/arXiv.2301.04869.
[12] W. E. Hart, C. D. Laird, J.-P. Watson, D. L. Woodruff, G. A. Hackebeil, B. L. Nicholson, and J. D. Siirola,
Pyomo — Optimization Modeling in Python, Vol. 67 (Springer International Publishing, Cham, 2017).