2025 AIChE Annual Meeting

(126a) Disentangling Common and Uncommon Variables from Multi-Sensor Observations

Authors

George Kevrekidis, University of Massachusetts at Amherst
In many scientific domains, a system of interest is observed through multiple heterogeneous sensors, each capturing different, often overlapping, aspects of its behavior. These observations are typically high-dimensional, entangled, and contain both information that is common across the sensors and information that is unique to each. We introduce Conformal Disentanglement, a neural autoencoder framework designed to address this challenge by performing Perspective Synthesis (identifying common variables) and Perspective Differentiation (disentangling sensor-specific variables) in a unified, geometry-aware setting.
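As a minimal formalization of this setting (the notation below is chosen here for illustration and is not taken from the abstract), one can posit a shared latent variable and sensor-specific latent variables generating the two observation streams:

```latex
% Illustrative problem setup: a common latent variable z and
% sensor-specific variables u_1, u_2 drive the two observations
% through unknown maps f_1, f_2.
\[
  x^{(1)} = f_1(z,\, u_1), \qquad x^{(2)} = f_2(z,\, u_2).
\]
% The unsupervised goal is to recover z (common) and u_1, u_2
% (uncommon) from samples of the pair (x^{(1)}, x^{(2)}) alone.
```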

Our architecture consists of paired encoder–decoder networks, one for each sensor, structured to separate latent representations into common and uncommon components. The disentanglement is guided by a two-phase training process. In the first phase, the model aligns the common latent spaces across sensors by minimizing a reconstruction loss and an inter-sensor consistency loss. In the second phase, we introduce a geometric orthogonality constraint to enforce local independence between common and uncommon components, effectively block-diagonalizing the local metric tensor in the latent space.
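The sketch below illustrates one way this two-phase scheme could be set up in PyTorch. All names (`SensorAutoencoder`, the loss functions, the latent split) are illustrative assumptions, not the authors' implementation; it is a sketch of the idea, not the method itself.

```python
# Minimal sketch of the two-phase training losses, assuming a latent
# code whose first `common_dim` coordinates form the common block.
import torch
import torch.nn as nn

class SensorAutoencoder(nn.Module):
    """Per-sensor autoencoder; the latent code splits into a common
    block (first common_dim coordinates) and an uncommon block."""
    def __init__(self, obs_dim, common_dim, uncommon_dim, hidden=64):
        super().__init__()
        self.common_dim = common_dim
        latent_dim = common_dim + uncommon_dim
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, obs_dim))

    def forward(self, x):
        z = self.encoder(x)
        return z[:, :self.common_dim], z[:, self.common_dim:], self.decoder(z)

def phase1_loss(ae1, ae2, x1, x2):
    # Phase 1: per-sensor reconstruction plus inter-sensor
    # consistency of the common latent blocks.
    c1, _, r1 = ae1(x1)
    c2, _, r2 = ae2(x2)
    recon = ((r1 - x1) ** 2).mean() + ((r2 - x2) ** 2).mean()
    consistency = ((c1 - c2) ** 2).mean()
    return recon + consistency

def metric_offdiag_penalty(ae, z_point):
    # Phase 2: pull back the ambient metric through the decoder at one
    # latent point and penalize the common/uncommon cross block,
    # pushing the local metric tensor toward block-diagonal form.
    J = torch.autograd.functional.jacobian(
        ae.decoder, z_point, create_graph=True)  # (obs_dim, latent_dim)
    G = J.T @ J                                  # local metric on latent space
    k = ae.common_dim
    return (G[:k, k:] ** 2).sum()
```

In a full training loop, the first phase would minimize `phase1_loss` alone, and the second phase would add `metric_offdiag_penalty`, evaluated at encoded training points for each autoencoder, to the objective.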

This approach is unsupervised and non-statistical, relying on differential-geometric concepts rather than conditional independence or mutual information. It enables interpretable latent representations, robust cross-sensor inference, and the synthesis of observation-consistent level sets, i.e., families of outputs from one sensor that share a fixed interpretation by another. The method also handles asynchronous observations, allowing it to identify causal relationships in temporally offset sensor data streams.
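One way to write the geometric condition underlying this (again with notation assumed here, not taken from the abstract): if a sensor's decoder f has Jacobian J_f = [J_z | J_u], split by the common/uncommon latent blocks, then the local pullback metric and the orthogonality constraint read

```latex
\[
  G = J_f^\top J_f =
  \begin{pmatrix}
    J_z^\top J_z & J_z^\top J_u \\
    J_u^\top J_z & J_u^\top J_u
  \end{pmatrix},
  \qquad
  J_z^\top J_u = 0,
\]
% i.e., G is block-diagonal. An observation-consistent level set of
% sensor 1 at a fixed common value z_0 is then the family
% \{\, f_1(z_0, u_1) : u_1 \text{ varying} \,\}.
```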

We demonstrate the efficacy of our framework on both synthetic and real-world data, including systems of coupled oscillators, chaotic attractors, and a well-known bobblehead dataset involving multiple rotating objects observed from two camera perspectives. Our results show successful identification and separation of latent structures, even when sensor inputs are entangled or corrupted by irrelevant features. Moreover, we show how the learned latent representations can be used for generative modeling and predictive tasks across modalities.

The proposed method extends naturally to more than two sensors and holds potential for broad applications in system identification, sensor fusion, manifold learning, and unsupervised representation learning in high-dimensional, multi-view data.