(74d) Parallel Solution of Large-Scale Nonlinear Programming Problems on Modern Computing Architectures
2009 Annual Meeting, Computing and Systems Technology Division, Advances in Optimization I
In previous work, we presented an internal decomposition algorithm for the parallel solution of multi-scenario problems. We extend this approach and provide a stable decomposition of time-discretized formulations with pass-on variables. We demonstrate the performance of these parallel decomposition strategies on multiple parallel architectures. Distributed clusters use a multiple-instruction-multiple-data (MIMD) architecture, and the decomposition approach is implemented with independent processes that communicate through a Message Passing Interface (MPI) implementation (e.g. MPICH). Modern scientific computing architectures such as GPUs provide significantly more processing cores per machine (e.g. 128 cores); however, these systems are typically single-instruction-multiple-data (SIMD) and have specialized kernel requirements and memory layouts. We present a fixed pivoting factorization technique that allows efficient parallel solution of structured nonlinear programming problems on GPU architectures. Several case studies in optimal design and operation show that the distributed architecture is appropriate for coarse-grained parallelization with tens of processors, while the GPU architecture is most appropriate for fine-grained parallelization in real-time applications.
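The abstract does not spell out the decomposition itself. One common choice for multi-scenario problems, assumed here purely for illustration, is a Schur-complement decomposition of the block-bordered linear system solved at each iteration: every scenario block is factorized independently, and only a small coupling system in the shared variables is solved centrally. The sketch below is in plain Python, and the names `lu_fixed`, `lu_solve`, and `solve_block_bordered` are hypothetical. The factorization uses a static pivot order with no row exchanges, which only hints at why a fixed pivot sequence suits SIMD hardware; it is not the authors' factorization and is safe only when all pivots stay nonzero (for example, diagonally dominant blocks).

```python
# Illustrative sketch only (assumed Schur-complement scheme, hypothetical names).

def lu_fixed(a):
    """In-place-style LU with a fixed pivot order (no row exchanges).
    Assumes every pivot is nonzero, e.g. a diagonally dominant matrix."""
    n = len(a)
    lu = [[float(v) for v in row] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]                 # multiplier, stored in L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]  # update trailing block
    return lu

def lu_solve(lu, b):
    """Solve (L U) x = b using the packed factors from lu_fixed."""
    n = len(lu)
    y = [float(v) for v in b]
    for i in range(n):                           # forward substitution (unit L)
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):                 # backward substitution (U)
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y

def solve_block_bordered(K, A, Kd, r, rd):
    """Solve  K[i] x[i] + A[i] d = r[i]            for each scenario i,
              sum_i A[i]^T x[i] + Kd d = rd        (coupling equations),
    where A[i] has one column per shared variable in d."""
    m = len(Kd)
    S = [[float(v) for v in row] for row in Kd]  # Schur complement accumulator
    rhs = [float(v) for v in rd]
    factors = []
    # Each pass of this loop touches one scenario only, so the passes
    # could run in separate processes (coarse-grained parallelism).
    for Ki, Ai, ri in zip(K, A, r):
        n = len(Ki)
        lu = lu_fixed(Ki)
        factors.append(lu)
        z = lu_solve(lu, ri)                                   # K_i^{-1} r_i
        cols = [lu_solve(lu, [Ai[q][c] for q in range(n)])     # K_i^{-1} A_i
                for c in range(m)]
        for p in range(m):
            rhs[p] -= sum(Ai[q][p] * z[q] for q in range(n))
            for c in range(m):
                S[p][c] -= sum(Ai[q][p] * cols[c][q] for q in range(n))
    d = lu_solve(lu_fixed(S), rhs)               # small coupling solve
    x = []
    for lu, Ai, ri in zip(factors, A, r):        # independent back-solves
        b = [ri[q] - sum(Ai[q][c] * d[c] for c in range(m))
             for q in range(len(ri))]
        x.append(lu_solve(lu, b))
    return x, d
```

In a distributed setting, each scenario's contribution to `S` and `rhs` would be computed by its own process and summed with a reduction; that communication step is omitted here. The systems arising in nonlinear programming are generally indefinite, so a production code would need a pivoting strategy that keeps the factorization stable, which this sketch does not attempt.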