2025 AIChE Annual Meeting

(10g) Decoding Control: Scalable and Interpretable Approximation of Control Laws with Oblique Decision Trees

Authors

Yankai Cao, The University of British Columbia
Modern control systems often rely on complex strategies such as Model Predictive Control (MPC), Reinforcement Learning (RL), or other optimal control methods to achieve high performance. However, these advanced control laws frequently suffer from drawbacks that hinder their practical deployment. MPC often demands substantial online computation, making it difficult to deploy in real-time applications with fast dynamics [1]. RL policies, particularly those based on deep neural networks, can act as "black boxes," lacking interpretability and making verification difficult, which is a major concern in safety-critical systems [2]. Furthermore, implementing these complex controllers on resource-constrained hardware platforms can be infeasible [3]. There is a need for methods that capture the performance benefits of these advanced controllers while offering computational efficiency, interpretability, and ease of implementation.


Studies on explicit MPC have established that the optimal control law of linear MPC is a piecewise affine function of the system state x [4]. Building on this result, theoretical work has shown that such linear MPC control laws can be exactly represented by a finite-depth oblique decision tree with linear predictions (ODT-LP), as shown in Picture (a). Here, the oblique decision tree partitions the state space into distinct polyhedral regions (polytopes) using linear combinations of the state features at branch nodes. Each leaf node corresponds to a specific polytope, where a simple linear control law (u = Fx + g) is applied.
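To make this structure concrete, the following is a minimal sketch of how such a tree evaluates a control action. The class and function names (BranchNode, LeafNode, evaluate) and the toy tree are illustrative assumptions, not part of the framework described here.

```python
# Minimal sketch of ODT-LP evaluation; names and the toy tree are
# illustrative placeholders, not the implementation described above.
import numpy as np

class BranchNode:
    """Oblique split: route left if w @ x + b <= 0, otherwise right."""
    def __init__(self, w, b, left, right):
        self.w, self.b = np.asarray(w, dtype=float), float(b)
        self.left, self.right = left, right

class LeafNode:
    """Leaf polytope with its own linear control law u = F x + g."""
    def __init__(self, F, g):
        self.F, self.g = np.asarray(F, dtype=float), np.asarray(g, dtype=float)

def evaluate(tree, x):
    """Follow one root-to-leaf path, then apply that leaf's control law."""
    x = np.asarray(x, dtype=float)
    node = tree
    while isinstance(node, BranchNode):
        node = node.left if node.w @ x + node.b <= 0.0 else node.right
    return node.F @ x + node.g

# Toy depth-1 tree: one oblique split, two polytopes, two leaf laws.
toy_tree = BranchNode(
    w=[1.0, 0.5], b=-0.2,
    left=LeafNode(F=[[-0.8, -0.3]], g=[0.0]),
    right=LeafNode(F=[[-1.2, -0.6]], g=[0.1]),
)
print(evaluate(toy_tree, [0.4, 0.1]))  # control action for this state
```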


Inspired by this equivalence, we propose a data-driven framework that uses ODT-LP to learn approximations of control laws directly from sampled data, as shown in Picture (b). This framework is applicable not just to linear MPC but also to complex controllers such as nonlinear MPC, RL policies, or other optimal control strategies. The core idea is to generate a dataset of state-action pairs by repeatedly solving the target high-performance controller offline across a representative set of initial states. This dataset is then used to train the ODT-LP model in a supervised manner: the tree learns to partition the state space using oblique splits (linear combinations of features) at branch nodes and predicts the control action using simple linear models at leaf nodes. This process effectively "decodes" the complex control logic into the explicit, rule-based structure of the ODT-LP. The resulting ODT-LP controller acts as a computationally lightweight surrogate: for linear MPC it can recover the exact control law, while for nonlinear MPC, RL, or other complex strategies it learns a near-optimal approximation.
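A hedged sketch of the offline data-generation step is shown below. Here solve_expert_controller and fit_odt_lp are hypothetical placeholders for the target high-performance controller (e.g., an MPC solve or an RL policy query) and the tree-fitting routine, and the uniform sampling scheme is an assumption made only for illustration.

```python
# Sketch of offline dataset generation for ODT-LP training; the expert solver,
# the fitting routine, and the uniform sampling scheme are all assumptions
# used for illustration, not the exact procedure described above.
import numpy as np

def generate_dataset(solve_expert_controller, state_lo, state_hi, n_samples, seed=0):
    """Sample representative states and record the expert controller's actions."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(state_lo, state_hi, size=(n_samples, len(state_lo)))
    U = np.vstack([solve_expert_controller(x) for x in X])
    return X, U  # state-action pairs for supervised training

# Assumed usage:
#   X, U = generate_dataset(mpc_solve, state_lo=[-1, -1], state_hi=[1, 1], n_samples=5000)
#   tree = fit_odt_lp(X, U, max_depth=4)   # learn oblique splits + leaf linear laws
#   u = evaluate(tree, x_current)          # lightweight online surrogate of the expert
```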


One key advantage of this ODT-LP approach is scalability. The method is designed to handle potentially high-dimensional state spaces during offline training and enables extremely fast online execution, significantly reducing the computational burden compared to online optimization. Furthermore, this scalability allows a tunable trade-off: one can adjust the complexity of the ODT-LP (e.g., tree depth or number of nodes) to balance the fidelity of the control law approximation against the computational cost, which includes both offline training effort and online execution time. Simpler trees offer lower fidelity but faster training and execution, while more complex trees can achieve higher fidelity at the expense of increased computational resources.
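As a back-of-envelope illustration of this trade-off (the operation counts below are rough assumptions for illustration, not measured results), the online cost of a depth-d ODT-LP grows only linearly with depth, while the number of representable polytopes grows exponentially:

```python
# Rough operation counts for one online ODT-LP evaluation; these are
# back-of-envelope assumptions for illustration, not benchmark results.
def odt_lp_complexity(depth, n_states, n_inputs):
    max_regions = 2 ** depth                  # leaf polytopes a depth-d tree can represent
    split_flops = depth * (2 * n_states)      # at most d oblique splits: w @ x + b
    leaf_flops = n_inputs * (2 * n_states)    # one leaf law: u = F x + g
    return max_regions, split_flops + leaf_flops

for d in (2, 4, 8):
    regions, flops = odt_lp_complexity(d, n_states=6, n_inputs=2)
    print(f"depth {d}: up to {regions} polytopes, ~{flops} flops per online evaluation")
```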


Another significant advantage is the inherent interpretability offered by the decision tree structure [5]. For any given state, the tree evaluation follows a single, traceable path to a specific leaf node, where a unique and simple linear control law is applied, as shown in Picture (a). This rule-based structure provides transparency, allowing human operators to follow the step-by-step decision logic, which enhances trustworthiness and simplifies verification. By combining these benefits, ODT-LP offers a practical and scalable pathway to deploying near-optimal control performance with significantly reduced computational burden and enhanced interpretability, bridging the gap between advanced control theory and real-world implementation.
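Continuing the toy sketch from earlier (and reusing its assumed BranchNode/LeafNode classes and toy_tree, which must be in scope), a short helper can print the exact sequence of split checks that led to a given control action, which is the kind of step-by-step trace an operator could inspect:

```python
# Illustrative decision-path trace; reuses the BranchNode/LeafNode/toy_tree
# sketch from earlier, so those assumed definitions must already be in scope.
import numpy as np

def explain(tree, x):
    """Return a human-readable trace of the root-to-leaf path for state x."""
    x = np.asarray(x, dtype=float)
    node, path = tree, []
    while isinstance(node, BranchNode):
        score = node.w @ x + node.b
        went_left = score <= 0.0
        path.append(f"split w@x + b = {score:+.3f} -> go {'left' if went_left else 'right'}")
        node = node.left if went_left else node.right
    path.append(f"leaf law: u = F@x + g = {node.F @ x + node.g}")
    return path

for step in explain(toy_tree, [0.4, 0.1]):
    print(step)
```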