2025 AIChE Annual Meeting

(394c) Multi-Dataset Feature Selection Enhances Lithium-Ion Battery End-of-Life Prediction

Authors

Seongmin Heo, Korea Advanced Institute of Science and Technology (KAIST)
With the increasing global adoption of electric vehicles and the corresponding surge in lithium-ion battery usage, the accurate prediction of battery end-of-life has become essential for ensuring operational safety and the early detection of potential faults. Conventional prediction methods have largely relied on extensive, homogeneous datasets [1]; however, the diversification of battery materials and cell chemistries has made it increasingly difficult to obtain comprehensive data across all scenarios. This limitation undermines the reliability of battery management systems, particularly when data is scarce.

To address these challenges, the current research employs multiple datasets that represent a broad spectrum of cell chemistries and charge–discharge protocols. Previous studies focused on extracting invariant features common to multiple datasets to predict the lifespan of lithium-ion batteries [2]. However, these approaches have an inherent limitation: when a dataset with a novel composition is introduced, the previously identified common features may no longer be significant, or the features may need to be recomputed across all datasets.

In this work, we propose a systematic feature selection method that uses a source dataset to identify key predictive features for forecasting end-of-life in a target dataset. In addition, this strategy enables the extraction of generalizable features that, when integrated into transfer learning frameworks for battery management system applications, can significantly enhance model robustness. The proposed method leverages the abundant data available in the source dataset to improve predictive performance for target datasets characterized by extremely limited data, and it is readily applicable to new target datasets.

Elastic net linear regression is used to identify and extract, from a source dataset compiled from open-source data, the key predictive features that most strongly influence battery degradation and lifespan. These selected features are then used in predictive models to improve the accuracy of end-of-life prediction when only limited data are available for the target dataset. The target dataset is intentionally limited to mimic the constrained data availability encountered in practical battery evaluations. This methodology is incorporated into a transfer learning framework designed to improve the robustness and reliability of battery management systems while maintaining the scalability needed to adapt to diverse real-world charge–discharge conditions. The approach is expected to support the development of more resilient battery management systems, ultimately contributing to improved safety and operational efficiency in electric vehicle applications.
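
The source-to-target workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation. The data are synthetic, the elastic net is a hand-written coordinate descent rather than a library solver, and every name and setting (`elastic_net_cd`, `alpha`, `l1_ratio`, the feature count, the eight-cell target set) is an assumption chosen for illustration only.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator arising from the L1 penalty."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net_cd(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for the elastic net objective
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1
                       + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    Assumes standardized columns in X and a centered y."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ w
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j
            rho = X[:, j] @ resid / n + col_sq[j] * w[j]
            w_new = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio))
            resid += X[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w

# --- Synthetic stand-in data (hypothetical, not battery measurements) ---
rng = np.random.default_rng(0)
n_features = 10
true_w = np.zeros(n_features)
true_w[[0, 3]] = [3.0, -2.0]          # only features 0 and 3 carry signal

# Source dataset: many cells
X_src = rng.normal(size=(200, n_features))
y_src = X_src @ true_w + 0.1 * rng.normal(size=200)

# Step 1: elastic net on the standardized source data selects features
Xs = (X_src - X_src.mean(axis=0)) / X_src.std(axis=0)
ys = (y_src - y_src.mean()) / y_src.std()
w_src = elastic_net_cd(Xs, ys)
selected = np.flatnonzero(w_src).tolist()

# Target dataset: same relevant features, shifted and rescaled response
def make_target(n):
    X = rng.normal(size=(n, n_features))
    y = 1.5 * (X @ true_w) + 5.0 + 0.1 * rng.normal(size=n)
    return X, y

X_tr, y_tr = make_target(8)           # deliberately tiny training set
X_te, y_te = make_target(50)          # held-out target cells

# Step 2: least squares on the target, restricted to the selected features
def fit_predict(cols):
    A_tr = np.column_stack([np.ones(len(X_tr)), X_tr[:, cols]])
    A_te = np.column_stack([np.ones(len(X_te)), X_te[:, cols]])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return A_te @ coef

rmse_sel = float(np.sqrt(np.mean((fit_predict(selected) - y_te) ** 2)))
print("selected features:", selected)
print("target RMSE with selected features:", round(rmse_sel, 3))
```

In this sketch the target model is refit only on the columns selected from the source data, so a new target dataset needs no recomputation on the source side. In practice the regularization settings would be chosen by cross-validation on the source dataset rather than fixed by hand.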

References

[1] Severson, Kristen A., et al. "Data-driven prediction of battery cycle life before capacity degradation." Nature Energy 4.5 (2019): 383-391.

[2] Zhang, Han, et al. "Battery lifetime prediction across diverse ageing conditions with inter-cell deep learning." Nature Machine Intelligence (2025): 1-8.