Data scarcity remains a significant challenge in applying machine learning to chemical process systems, where collecting labeled data in real time under diverse operating conditions is impractical and expensive [1]. As a result, it is difficult to train robust and generalizable models with conventional supervised learning. To address this limitation, few-shot learning has emerged as a promising strategy that enables a model to adapt to new tasks or conditions using only a small number of labeled samples [2], improving data efficiency, flexibility, and generalization in low-data scenarios. Despite its potential, comparative studies that systematically evaluate different few-shot learning strategies in chemical processes are still lacking.
This study investigates and compares three representative few-shot learning approaches applicable to chemical process systems. The first approach is meta-learning, which includes Siamese networks [3] and Model-Agnostic Meta-Learning (MAML) [4]. Siamese networks learn to assess the similarity between paired inputs, allowing relational information to be extracted efficiently from limited data, while MAML enables rapid adaptation to novel tasks by learning a generalizable parameter initialization through multi-task training. The second approach is fine-tuning [5], in which a model is pre-trained on related datasets or tasks and subsequently adapted to new settings by updating only a subset of its parameters; this reduces the risk of overfitting in low-data regimes while leveraging previously acquired knowledge. The third approach, retrieval-augmented learning [6], improves inference by retrieving relevant samples from historical data, enabling the model to incorporate contextual information without retraining. By referencing historical examples, retrieval-based methods are expected to enhance prediction accuracy and generalization, particularly when previously unseen conditions are encountered.
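To make the contrast between these strategies concrete, the following sketches illustrate each of them in simplified form. The first is a minimal Siamese-network sketch for similarity-based few-shot classification; the encoder architecture, embedding width, contrastive margin, and the assumption of 52 process variables are illustrative choices rather than settings used in this study.

```python
# Minimal Siamese similarity sketch for few-shot classification (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    """Shared encoder applied to both inputs of a pair."""
    def __init__(self, in_dim: int, emb_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, emb_dim),
        )

    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, same_label, margin: float = 1.0):
    """Pull same-class pairs together, push different-class pairs apart."""
    dist = F.pairwise_distance(z1, z2)
    loss_same = same_label * dist.pow(2)
    loss_diff = (1 - same_label) * F.relu(margin - dist).pow(2)
    return (loss_same + loss_diff).mean()

encoder = SiameseEncoder(in_dim=52)          # e.g., 52 process measurements (assumed)
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def train_step(x1, x2, same_label):
    """One update on a batch of labeled pairs; same_label is 1 for same class, 0 otherwise."""
    optimizer.zero_grad()
    loss = contrastive_loss(encoder(x1), encoder(x2), same_label)
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def predict(query, support_x, support_y):
    """Classify each query by the label of its nearest support embedding."""
    d = torch.cdist(encoder(query), encoder(support_x))   # (n_query, n_support)
    return support_y[d.argmin(dim=1)]
```

Classification of a new sample thus reduces to comparing its embedding against the embeddings of the few labeled support samples.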
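A MAML-style inner/outer loop can be sketched as follows; the base network, task batch format, learning rates, and the use of `torch.func.functional_call` (PyTorch 2.x) are assumptions for illustration, not the exact implementation evaluated here.

```python
# MAML-style sketch (cf. Finn et al. [4]): learn an initialization that adapts in a few steps.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

def make_model(in_dim: int, out_dim: int) -> nn.Module:
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

model = make_model(in_dim=52, out_dim=5)          # dimensions are placeholders
meta_params = dict(model.named_parameters())      # the shared initialization
meta_opt = torch.optim.Adam(meta_params.values(), lr=1e-3)
inner_lr = 0.01

def task_loss(params, x, y):
    logits = functional_call(model, params, (x,))
    return F.cross_entropy(logits, y)

def adapt(params, x_support, y_support, steps: int = 1):
    """Inner loop: a few gradient steps on the support set of a single task."""
    for _ in range(steps):
        grads = torch.autograd.grad(task_loss(params, x_support, y_support),
                                    list(params.values()), create_graph=True)
        params = {name: p - inner_lr * g
                  for (name, p), g in zip(params.items(), grads)}
    return params

def meta_step(tasks):
    """Outer loop: update the shared initialization across a batch of tasks."""
    meta_opt.zero_grad()
    meta_loss = 0.0
    for x_s, y_s, x_q, y_q in tasks:               # per-task support / query split
        adapted = adapt(meta_params, x_s, y_s)     # adapt from the shared initialization
        meta_loss = meta_loss + task_loss(adapted, x_q, y_q)
    meta_loss = meta_loss / len(tasks)
    meta_loss.backward()                           # second-order gradients flow through adapt()
    meta_opt.step()
    return meta_loss.item()
```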
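Fine-tuning can be sketched as freezing a backbone pre-trained on a related source dataset and updating only a small task-specific head on the few target samples; the layer sizes and the `pretrained_backbone` placeholder below are hypothetical.

```python
# Parameter-efficient fine-tuning sketch: frozen backbone, trainable head (illustrative only).
import torch
import torch.nn as nn

pretrained_backbone = nn.Sequential(               # stands in for a model trained on source data
    nn.Linear(52, 64), nn.ReLU(), nn.Linear(64, 32), nn.ReLU()
)
head = nn.Linear(32, 5)                            # new head for the target task

for p in pretrained_backbone.parameters():         # freeze previously acquired knowledge
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def finetune_step(x_few, y_few):
    """One update on the small labeled target set; only the head changes."""
    optimizer.zero_grad()
    with torch.no_grad():                          # backbone acts as a fixed feature extractor
        feats = pretrained_backbone(x_few)
    loss = loss_fn(head(feats), y_few)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Restricting updates to the head is one simple way to limit overfitting when only a handful of target samples are available.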
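Retrieval-augmented inference can be sketched as distance-weighted voting over the k nearest historical samples, so that new operating conditions are handled without retraining; the similarity metric, `k`, and array names are illustrative, and in practice retrieval is often performed in a learned embedding space rather than on raw measurements.

```python
# Retrieval-augmented inference sketch: predict from nearest historical examples (illustrative only).
import numpy as np

def retrieve(query, bank_x, k: int = 5):
    """Return indices and distances of the k nearest historical samples."""
    d = np.linalg.norm(bank_x - query, axis=1)
    idx = np.argsort(d)[:k]
    return idx, d[idx]

def retrieval_predict(query, bank_x, bank_y, k: int = 5, n_classes: int = 5):
    """Distance-weighted vote over the retrieved labels; no model parameters are updated."""
    idx, dist = retrieve(query, bank_x, k)
    weights = 1.0 / (dist + 1e-8)                  # closer examples count more
    scores = np.zeros(n_classes)
    for label, w in zip(bank_y[idx], weights):
        scores[label] += w
    return int(np.argmax(scores))
```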
The objective of this study is to identify flexible and data-efficient learning strategies that facilitate adaptation to new conditions and tasks in data-limited chemical process systems. To demonstrate the applicability of these approaches, we apply them to two representative cases. First, we simulate a real-world scenario of limited fault information in the Tennessee Eastman process and show that the few-shot strategies can extract key features and insights about abnormal conditions, even with minimal labeled data. Second, we address the challenge of sparse experimental data for the hole transport layer (HTL) in perovskite-based solar cells, a component that plays a critical role in both device stability and efficiency. By leveraging HTL data from other perovskite systems, the proposed strategies efficiently identify and transfer meaningful design factors despite the small number of available samples.
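As an illustration of how the limited-fault-information scenario can be framed, the following hypothetical sketch samples N-way K-shot episodes from a labeled fault dataset; the array shapes, class labels, and (n_way, k_shot, n_query) settings are placeholders rather than the exact protocol used in this study.

```python
# Hypothetical N-way K-shot episode sampler for a labeled fault dataset (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(X, y, n_way: int = 3, k_shot: int = 5, n_query: int = 10):
    """Draw support/query sets for n_way randomly chosen fault classes."""
    classes = rng.choice(np.unique(y), size=n_way, replace=False)
    support_x, support_y, query_x, query_y = [], [], [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.where(y == c)[0])
        support_x.append(X[idx[:k_shot]])                    # few labeled samples per class
        query_x.append(X[idx[k_shot:k_shot + n_query]])      # held-out samples for evaluation
        support_y += [new_label] * k_shot
        query_y += [new_label] * n_query
    return (np.concatenate(support_x), np.array(support_y),
            np.concatenate(query_x), np.array(query_y))
```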
[1] Bansal, Aayushi, Rewa Sharma, and Mamta Kathuria. "A systematic review on data scarcity problem in deep learning: solution and applications." ACM Computing Surveys 54.10s (2022): 1-29.
[2] Song, Yisheng, et al. "A comprehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities." ACM Computing Surveys 55.13s (2023): 1-40.
[3] Chicco, Davide. "Siamese neural networks: An overview." Artificial Neural Networks (2021): 73-94.
[4] Finn, Chelsea, Pieter Abbeel, and Sergey Levine. "Model-agnostic meta-learning for fast adaptation of deep networks." International Conference on Machine Learning. PMLR, 2017.
[5] Xu, Lingling, et al. "Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment." arXiv preprint arXiv:2312.12148 (2023).
[6] Izacard, Gautier, et al. "Atlas: Few-shot learning with retrieval augmented language models." Journal of Machine Learning Research 24.251 (2023): 1-43.