The demand for high computational power continues to surge as AI and ML products become
mainstream [3]. To meet the rising demand for high-performance chips, semiconductor design
has become increasingly complex, with 3D transistor structures such as Gate-All-Around and FinFET
designs becoming popular [1, 2]. However, these complex processes are very sensitive to process
disturbances and unoptimized process recipes; thus, data-driven generative models can be used to
generate optimal process recipes and improve semiconductor manufacturing quality. Generative
models are powerful, but they face their own challenges: most notably, they require an extremely large
training dataset to perform well. Although industrial manufacturing settings have a large volume of
process data, it can still be insufficient for generative modeling purposes. In large language models
(LLMs), data aggregation is typically used to resolve data deficiency; similarly, we propose using
open-source models fine-tuned with industrial data to build a general-purpose model. This model may not
necessarily follow the LLM format, but the goal is to develop an all-in-one model through large-scale
aggregation that can be applied across all modules and tools.
While manufacturing companies are beginning to integrate data-driven models into their manufacturing
processes in the form of soft sensors or fault predictors [5], most of these models are restricted to using
data from the same toolset and/or product line [4]. This restriction both limits the amount of data
available for training and requires a separate model for each unique toolset/product line
combination. Instead, process data from multiple unique datasets can be aggregated to
create a general dataset that represents both the overall and individual trends for each unique dataset. This
solves both problems: the available process data is vastly increased, and only one general model is
created. The general model will be evaluated on its performance across a variety of toolset/product line
datasets, with an emphasis on achieving a high minimum score, i.e., strong performance on even its
worst-performing dataset. If successful, this project would demonstrate
the potential of another data-driven tool in industrial manufacturing settings.
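
The aggregation and minimum-score evaluation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the dataset keys, the single-feature toy data, the least-squares line standing in for the general model, and R^2 as the score. None of it is the proposed model or real process data.

```python
from statistics import mean

# Hypothetical toy data: {toolset/product line key: [(sensor reading, target), ...]}.
datasets = {
    "toolA_productX": [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "toolB_productY": [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
}

def aggregate(datasets):
    """Pool rows from every dataset, tagging each row with its source key."""
    return [(key, x, y) for key, rows in datasets.items() for x, y in rows]

def fit_line(pooled):
    """Fit y = a*x + b by ordinary least squares on the pooled data.

    This is only a placeholder for the single general model.
    """
    xs = [x for _, x, _ in pooled]
    ys = [y for _, _, y in pooled]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def r2(rows, model):
    """Coefficient of determination of the model on one dataset."""
    a, b = model
    my = mean(y for _, y in rows)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in rows)
    ss_tot = sum((y - my) ** 2 for _, y in rows)
    return 1.0 - ss_res / ss_tot

def score_per_dataset(datasets, model):
    """Score the one general model separately on each source dataset."""
    return {key: r2(rows, model) for key, rows in datasets.items()}

pooled = aggregate(datasets)
model = fit_line(pooled)
scores = score_per_dataset(datasets, model)

# The headline metric is the worst per-dataset score, not the average.
print(scores, min(scores.values()))
```

Reporting the minimum rather than the mean penalizes a general model that fits most datasets well but fails on one toolset/product line combination.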
[1] Das, R. R., Rajalekshmi, T. R., & James, A. (2024). FinFET to GAA MBCFET: a review and
insights. IEEE Access.
[2] Jurczak, M., Collaert, N., Veloso, A., Hoffmann, T., & Biesemans, S. (2009). Review of FINFET
technology. In 2009 IEEE International SOI Conference (pp. 1–4). Foster City, CA, USA.
[3] Nazir, M., Rasheed, M. Q., Yu, X. H., & Ahmed, Z. (2025). Can Computer Technology,
Semiconductors, and Artificial Intelligence Shape a Sustainable Future? Evidence From Leading
Semiconductor‐Producing Countries. Sustainable Development.
[4] Ou, F., Wang, H., Zhang, C., Tom, M., Bom, S., Davis, J. F., & Christofides, P. D. (2024).
Industrial Data-Driven Machine Learning Soft Sensing for Optimal Operation of Etching Tools.
Dig. Chem. Eng., 13, 100195.
[5] Zhang, C., Yella, J., Huang, Y., Qian, X., Petrov, S., Rzhetsky, A., & Bom, S. (2021). Soft sensing
transformer: Hundreds of sensors are worth a single word. In 2021 IEEE International
Conference on Big Data (pp. 1999–2008). Orlando, FL, USA.