Battery capacity, defined as the maximum available charge or power in a fully charged battery, is a key indicator of its health [1]. Over time, battery capacity declines due to energy loss and aging. Accurately predicting battery capacity is crucial for improving system reliability; however, this task is challenging due to the complex and nonlinear nature of battery systems. Capacity prediction has therefore been an active area of research. The proposed methods can be broadly classified into physics-based and data-driven approaches. Physics-based approaches model the electrochemical characteristics and reactions in the battery and are stable, accurate, and interpretable. However, these models are complex, computationally expensive to solve, and require frequent parameter updates to account for battery aging [2]. In contrast, data-driven methods offer high computational efficiency, but their accuracy depends heavily on the availability of large, high-quality run-to-failure training datasets. Such data is not always accessible due to the high cost and time required for experiments, especially when data representing significant aging or run-to-failure is needed [3].
Generative models, such as generative adversarial networks (GANs), can generate high-quality synthetic data by learning the distribution of real data through adversarial training between a generator and a discriminator [4]. This approach serves as an effective data augmentation technique for enhancing existing battery datasets [3, 5-6]. However, challenges such as training instability (e.g., one network overpowering the other), mode collapse (i.e., limited diversity in generated data), and poor generalization to unseen scenarios have limited the wider application of GANs to engineering problems [7]. To address these issues, we present a novel physics-guided GAN (PgGAN) for generating high-fidelity synthetic time-series data for batteries.
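For reference, the adversarial training described above is commonly written as the minimax objective of [4]. The symbols below (generator G, discriminator D, data distribution p_data, noise prior p_z) follow the usual convention and are not notation taken from this abstract.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator maximizes V by scoring real samples high and generated samples low, while the generator minimizes it by producing samples that the discriminator scores as real.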
In the proposed PgGAN, the generator is a long short-term memory (LSTM) network that maps noise to realistic time-series battery signals (temperature, voltage, and current). The discriminator, also implemented as an LSTM, receives both synthetic and real signals as inputs and attempts to distinguish between them. PgGAN is trained adversarially: the generator aims to fool the discriminator while simultaneously minimizing a physics-based loss. This physics-based loss is formulated from a combined equivalent-circuit voltage and thermal model [8-9]. With this multi-objective loss function, the generated synthetic data not only remain similar to and consistent with the ground truth, but also satisfy the governing laws expressed by equivalent-circuit battery models. As validated by the case studies, PgGAN improves training stability, allowing training to progress even when one network overpowers the other, and mitigates mode collapse by encouraging diversity in the generated outputs. Further, the generator accepts conditioning features, so that it can generate synthetic battery data for different operating conditions.
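One way to picture a physics-based loss of this kind is as the mean squared residual between the generated signals and a simple battery model. The sketch below is illustrative only and is not the formulation used in this work: it assumes a first-order Thevenin equivalent circuit with a constant open-circuit voltage, a lumped thermal model, forward-Euler discretization, and a positive-discharge current convention. The names `EcmParams`, `simulate`, and `physics_loss`, and all parameter values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class EcmParams:
    """Hypothetical parameters; values are placeholders, not identified from data."""
    ocv: float = 4.0        # open-circuit voltage [V], assumed constant
    r0: float = 0.05        # series resistance [ohm]
    r1: float = 0.02        # polarization resistance [ohm]
    c1: float = 1000.0      # polarization capacitance [F]
    heat_cap: float = 40.0  # lumped heat capacity m*c_p [J/K]
    h_conv: float = 0.5     # convective conductance h*A [W/K]
    t_amb: float = 25.0     # ambient temperature [degC]
    dt: float = 1.0         # sample period [s]


def _temp_rate(i: float, temp: float, p: EcmParams) -> float:
    # Lumped thermal model: C*dT/dt = I^2*R0 - hA*(T - T_amb)
    return (i ** 2 * p.r0 - p.h_conv * (temp - p.t_amb)) / p.heat_cap


def _next_v1(v1: float, i: float, p: EcmParams) -> float:
    # Polarization state: dV1/dt = -V1/(R1*C1) + I/C1 (forward Euler)
    return v1 + p.dt * (-v1 / (p.r1 * p.c1) + i / p.c1)


def simulate(current: Sequence[float], p: EcmParams) -> Tuple[List[float], List[float]]:
    """Produce voltage and temperature traces that obey the model exactly."""
    voltage: List[float] = []
    temperature: List[float] = [p.t_amb]
    v1 = 0.0  # assume a rested cell at the start
    for k, i in enumerate(current):
        voltage.append(p.ocv - i * p.r0 - v1)
        temperature.append(temperature[k] + p.dt * _temp_rate(i, temperature[k], p))
        v1 = _next_v1(v1, i, p)
    return voltage, temperature[: len(current)]


def physics_loss(voltage: Sequence[float], current: Sequence[float],
                 temperature: Sequence[float], p: EcmParams) -> float:
    """Mean squared voltage and thermal residuals over one generated cycle."""
    n = len(voltage)
    v1 = 0.0
    total = 0.0
    for k in range(n - 1):
        # Voltage residual: generated V_k versus OCV - I_k*R0 - V1_k
        res_v = voltage[k] - (p.ocv - current[k] * p.r0 - v1)
        # Thermal residual: generated T_{k+1} versus one Euler step from T_k
        predicted_t = temperature[k] + p.dt * _temp_rate(current[k], temperature[k], p)
        res_t = temperature[k + 1] - predicted_t
        total += res_v ** 2 + res_t ** 2
        v1 = _next_v1(v1, current[k], p)
    return total / (n - 1)
```

Signals produced by `simulate` give a loss of essentially zero, while signals that violate the model equations are penalized. In a GAN setting, a term like this would be added to the generator's adversarial loss with a weighting factor; both the weight and a differentiable implementation in the training framework are left out of this sketch.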
The PgGAN model is validated on the NASA prognostics data set [10], and its performance is compared with that of a conventional GAN. Each battery's data is divided into training and testing subsets, with the first 100 cycles used for training and the final 60 cycles for testing. PgGAN is trained on the initial cycles and conditioned on the corresponding capacity values. To evaluate the trained generator, we generate synthetic cycles using the same capacity values as those in the training set. The synthetic profiles are assessed with dimension-reduction methods such as t-SNE and PCA. Further, to test generalizability, the trained model also generates synthetic cycles for unseen capacity values from the test set. Both t-SNE and PCA results show that PgGAN outperforms the standard GAN in generating realistic data for both training and testing scenarios. Finally, the synthetic data from PgGAN and from the baseline GAN are each used to augment the original dataset for capacity prediction. The results show a significant reduction in the mean squared error of battery capacity prediction, from 0.023 (baseline GAN) to 0.0069 (PgGAN) for NASA Cell B5 and from 0.055 (baseline GAN) to 0.03 (PgGAN) for NASA Cell B6. These observations demonstrate the effectiveness of PgGAN in addressing data scarcity through high-quality augmentation.
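The data split and error metric described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in this work: the helper names are hypothetical, and the abstract does not say how any cycles between the first 100 and the final 60 are handled, so the sketch simply leaves them out.

```python
from typing import List, Sequence, Tuple


def split_cycles(cycles: Sequence, n_train: int = 100,
                 n_test: int = 60) -> Tuple[List, List]:
    """Return the first n_train cycles and the final n_test cycles."""
    if n_train <= 0 or n_test <= 0:
        raise ValueError("n_train and n_test must be positive")
    if n_train + n_test > len(cycles):
        raise ValueError("not enough cycles for the requested split")
    # Cycles between the two subsets (if any) are not used in this sketch.
    return list(cycles[:n_train]), list(cycles[-n_test:])


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between true and predicted capacities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before
```

Applied to the reported values, `relative_reduction(0.023, 0.0069)` gives about 0.70 for Cell B5 and `relative_reduction(0.055, 0.03)` gives about 0.45 for Cell B6, i.e. roughly 70% and 45% lower mean squared error.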
References
[1] Chen, Lin, et al. "Prediction of lithium-ion battery capacity with metabolic grey model." Energy 106 (2016): 662-672.
[2] Wang, Fujin, et al. "Physics-informed neural network for lithium-ion battery degradation stable modeling and prognosis." Nature Communications 15.1 (2024): 4332.
[3] Ye, Zhuang, Jiantao Chang, and Jianbo Yu. "Prognosability regularized generative adversarial network for battery state of health estimation with limited samples." Energy (2025): 135922.
[4] Goodfellow, Ian J., et al. "Generative adversarial nets." Advances in neural information processing systems 27 (2014).
[5] Naaz, Falak, et al. "A generative adversarial network‐based synthetic data augmentation technique for battery condition evaluation." International Journal of Energy Research 45.13 (2021): 19120-19135.
[6] Hu, Fengshuo, et al. "WGAN-GP with residual network model for lithium battery thermal image data expansion with quantitative metrics." 2023 IEEE 6th International Electrical and Energy Conference (CIEEC). IEEE, 2023.
[7] Huang, Nick, et al. "The GAN is dead; long live the GAN! A Modern GAN Baseline." Advances in Neural Information Processing Systems 37 (2024): 44177-44215.
[8] Tian, Ning, et al. "On parameter identification of an equivalent circuit model for lithium-ion batteries." 2017 IEEE Conference on Control Technology and Applications (CCTA). IEEE, 2017.
[9] Kumar, Pradeep, et al. "Battery thermal model identification and surface temperature prediction." IECON 2021–47th Annual Conference of the IEEE Industrial Electronics Society. IEEE, 2021.
[10] Saha, Bhaskar, and Kai Goebel. "Battery data set." NASA AMES prognostics data repository (2007).