Modeling Multivariate Relations in Multiblock Semiconductor Manufacturing Data Using Process PLS To Enhance Process Understanding

December 2023

DOI:10.1109/WSC60868.2023.10408180

Conference: 2023 Winter Simulation Conference (WSC)

Authors:

Geert van Kollenburg

Eindhoven University of Technology

Proceedings of the 2023 Winter Simulation Conference

C. G. Corlu, S. R. Hunter, H. Lam, B. S. Onggo, J. Shortle, and B. Biller, eds.

MODELING MULTIVARIATE RELATIONS IN MULTIBLOCK SEMICONDUCTOR

MANUFACTURING DATA USING PROCESS PLS TO ENHANCE PROCESS

UNDERSTANDING

Geert van Kollenburg

Richard Verhoeven

Mike Holenderski

Nirvana Meratnia

Eindhoven University of Technology

Groene Loper 3

Eindhoven, 5612AE, NETHERLANDS

Daniele Pagano

STMicroelectronics s.r.l.

Stradale Primo Sole, 50

Catania, 95121, ITALY

ABSTRACT

The complexity of manufacturing process data has made it more challenging to extract useful insights. Data-

analytic solutions have therefore become essential for analyzing and optimizing manufacturing processes.

Path modeling, also known as structural equation modeling, is a statistical approach that can provide

new insights into complex multivariate relationships between process variables from different stages of

the manufacturing process. The incorporation of expert process knowledge and subsequent interpretation

of model results can facilitate communication between stakeholders, promoting lean manufacturing and

achieving the sustainability goals of Industry 5.0. This paper describes the use of a path modeling algorithm

called Process Partial Least Squares (Process PLS) to gain new insights into the relationships between

equipment data from several machines within the semiconductor manufacturing process. The methods

used in this study can assist manufacturers in understanding the relations between different machines and in identifying the most influential variables that may be used to develop soft-sensors.

1 INTRODUCTION

The semiconductor industry plays a vital role in modern society, with increasing demand for smaller and

more powerful computer chips and electronic devices. Computer chips are made in hundreds at a time on

silicon wafers. The manufacturing process involves multiple stages, including wafer fabrication, deposition,

lithography, and etching (Timings 2021). Each stage involves multiple process steps and each step may be

dependent on the performance of the previous steps. Variations in any of the hundreds of steps can affect

the final product performance, yield, and reliability (Melhem et al. 2015). The complexity and dependency of the manufacturing processes, together with the abundance of process data, have resulted in an increasing demand for data-analytic solutions to analyze and optimize these processes.

Statistical models and machine learning algorithms can provide valuable insights into the manufacturing

processes. Yet most data-analytic applications do not model the relationships between the many process

steps, but instead focus on predicting speciﬁc outcomes like product quality or yield (Biegel et al. 2022;

Sanchez-Marquez and Vivas 2020; Dupret et al. 2005) in order to control the manufacturing processes.

Path modeling, also known as structural equation modeling (SEM), is a valuable approach that can identify

multivariate relationships between process variables from various steps of the manufacturing process (van

Kollenburg et al. 2020; Hair Jr et al. 2021).

Path modeling enables the identiﬁcation of key factors that contribute to the overall variability of the

manufacturing processes. By incorporating all relevant variables and dependencies between manufacturing

van Kollenburg, Verhoeven, Pagano, Holenderski, and Meratnia

steps, path models can reveal new insights into the complex relationships between process variables and

their impact on the manufacturing process (Vinzi et al. 2010). Path modeling can assist manufacturers in

identifying the root causes of process variations, implementing corrective actions, and identifying areas

for improvement. Real-time process optimization will become possible when models identify strong

relationships between speciﬁc process variables and a defect-causing situation later in the process (Arteaga

and Ferrer 2002).

Path models are particularly useful in process analytics as they allow for the incorporation of expert

process knowledge in an intuitive manner. This facilitates communication between data-analysts and

manufacturing operators and such explainability is critical if models are to be integrated into personnel’s

routine work (Meindl et al. 2021; Cagliano et al. 2019). The inclusion of knowledge of process experts and

subsequent interpretation of model results provides a unique opportunity to promote lean manufacturing

(Tortorella et al. 2019). Collaboration between humans and data-analytic solutions is a core aspect of the

sustainability goals specified for Industry 5.0 (Breque et al. 2021). Investing in a working environment where workers can interact with data-analytics in an intuitive way can lead to increased value creation in

manufacturing processes (Cifone et al. 2021; Senoner et al. 2022).

The aim of the work presented in this paper was to learn more about the complex connections between

distinct steps in the semiconductor manufacturing process. This paper explains the process of obtaining

features from the equipment data, analyzing the data with a path modeling method called Process PLS

(van Kollenburg et al. 2021), and evaluating different model conﬁgurations to understand the relationships

between machines. The research is presented in a way that allows the methods to be applied to other

manufacturing processes, which is why the term ’machine’ will be used throughout to describe parts of

the processes otherwise called ’tools’ or ’chambers’ and so forth.

The remainder of this paper is organized as follows. The next section discusses related work. In

Section 3, the data that was used is explained. That section also includes an overview of the Process

PLS algorithm and illustrates multiple model speciﬁcations. Section 4 presents the results of the analyses

using the model speciﬁcations discussed. The paper ends with a discussion and future outlook in Section 5.

2 RELATED WORK

Data-analytic applications are common in the manufacturing industry (Moldovan et al. 2017). While many applications focus on quality predictions (see Köksal et al. 2011 for an overview), statistical process control has also been standard practice for a long time (Spanos 1992). Path models, on the other hand,

have mostly been used to analyze company-level indicators. Examples include analysis of environmental

practice and manufacturing performance (Tseng et al. 2008), sustainable manufacturing practices (Vinodh

and Joy 2012), front-end product design (Withanage et al. 2012), corporate governance mechanisms (Fei

et al. 2015) and marketing strategies (Sarstedt et al. 2022). To the best of our knowledge, path models

have not yet been applied to evaluate the interrelations between various parts of the manufacturing process itself.

Path models have recently been used to analyze industrial chemical production processes (van Kollenburg

et al. 2020; Offermans et al. 2021), leading to new insights into relations between process variables that

led to reduced production costs. Sensors in industrial processes provide time series. In the process industry, sensors may measure different bits of material at each point in time as the material flows through the

machines. If properly aligned, each data point of a sensor can be related to all other data points related to

the same bit of material (Offermans et al. 2021). As such, the data is well-suited for use in correlational

methods like path models.

In semiconductor manufacturing processes, products reside at each machine for some time. This means

that many sensors each produce a time series variable per wafer. Multivariate time series can be analyzed

with statistical models like ARIMA (Hamilton 2020), upcoming data points can be predicted with deep

learning models like LSTM (Hochreiter and Schmidhuber 1997) and anomaly detection can be done with

ensemble methods (Trardi et al. 2022). These methods, however, neither consider the multiple-machine

nature of the manufacturing process to model the interrelations between the various sub-processes, nor

predict speciﬁc features in other time series.

To the best of our knowledge, no path modelling extensions for multi-block time series have yet been

developed to accommodate multiple target blocks (Gu and Van Deun 2019). To make use of existing path

models, the time series variables must be transformed in such a way that correlations between sets of data

become meaningful.

3 MATERIALS AND METHODS

3.1 Data

Historical equipment data from seven machines within a semiconductor manufacturing process was provided

by STMicroelectronics s.r.l. The seven machines will be referred to as Machine 1 through 7, in the order

that wafers pass through them. The data consisted of readings from 151 sensors which are distributed over

the seven machines, providing information on the production of over 2000 wafers, which all followed the

same production recipe. While details about the manufacturing steps cannot be disclosed, the goal of the

research was to relate specific observations at one machine to observations at other machines, for instance, identifying whether an extreme value in Variable A of Machine 1 is related to high variability of Variable B on Machine 3.

As a pre-processing step, features of the time series variables were extracted before further analysis.

From each variable, the average value (avg), minimum value (min), maximum value (max) and the standard

deviation (std) were extracted. This means that each of the 151 time series was transformed into 4 features,

with each feature having a single value per wafer. This transformation is illustrated in Tables 1 and 2. The

data was labeled in the format M_V_feature, where M indicates the machine at which the measurement was done, ranging from M1 to M7, V indicates the sensor number, ranging from V1 to V151, and feature is the label for either the avg, min, max, or std.
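
The feature extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `extract_features`, the dictionary layout, and the toy readings are hypothetical, and the sample standard deviation is assumed because the text does not say which formula was used.

```python
from statistics import mean, stdev


def extract_features(wafer_series):
    """Summarise each sensor time series of one wafer into four features.

    wafer_series maps a label such as "M1_V1" to the list of readings
    recorded while the wafer was in that machine (lengths may differ).
    """
    features = {}
    for label, readings in wafer_series.items():
        features[f"{label}_avg"] = mean(readings)
        features[f"{label}_min"] = min(readings)
        features[f"{label}_max"] = max(readings)
        # Assumption: sample standard deviation; the paper does not say
        # whether the sample or the population formula was used.
        features[f"{label}_std"] = stdev(readings)
    return features


# Toy readings (hypothetical, not the data analysed in the paper).
wafer_1 = {"M1_V1": [0.2, 0.4, 0.9], "M7_V151": [0.5, 0.1, 0.3, 0.3]}
row = extract_features(wafer_1)
```

Each wafer then contributes one row of such features, so that time series of unequal length end up as columns of equal length.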

Table 1: Illustration of the original time series that was used to create the data set used in further analyses.

Labels are given in format Machine_Variable. Each of the 151 sensors produced a variable and each wafer

thus had observations on these 151 variables. Data is shown for the ﬁrst and last wafer.

Wafer 1                           ...    Wafer 2186
M1_V1    ...    M7_V151    ...    M1_V1    ...    M7_V151
.774     ...    .499       ...    .746     ...    .166
.519     ...    .753       ...    .64      ...    .818
.283     ...    .041       ...    .069     ...    .31
.786     ...    .19        ...    .038     ...    .767
.294     ...    .196       ...    .092     ...    .05
...      ...    ...        ...    ...      ...    ...
.315     ...    .973       ...    .821     ...    .538
.619     ...    .239       ...    .981     ...    .193
.137     ...    .583       ...    .859     ...    .376
.324     ...               ...    .327     ...
.149     ...               ...    .733
.522     ...               ...

The columns in Table 1 represent sensor readings observed for each wafer. Each row in that table indicates a time point during which the wafer was in the respective machine. Also illustrated is the fact that not all time series were of equal length. In Table 2 each column represents a feature from the time series and each row represents the observed value of that feature for a given wafer. The boxed number (.041) is the minimum value observed in Variable 151 for Wafer 1. The encircled number (.981) indicates the maximum value observed in Variable 1 for Wafer 2186. Each feature (min, max, avg, std) was named according to the machine and the variable it originated from, in the format 'Machine_variable_feature'. For example,

the feature M1_V1_max lists for each wafer the maximum value observed in the time series M1_V1 that

was collected at Machine 1. In the remainder of this paper, the term ’features’ will refer to the data used

in the statistical analyses, as represented in Table 2.

Table 2: Features extracted from the time series that were used in statistical analyses.

              M1_V1_max    M1_V1_min    ...    M7_V151_min    M7_V151_avg    M7_V151_std
Wafer 1       .786         .137         ...    .041           .434           .322
...           ...          ...          ...    ...            ...            ...
Wafer 2186    .981         .038         ...    .05            .402           .282

We excluded features that had more than 15% missing values or that had (near-)zero variance. Near-zero

variance may be related to both rare events and to uninformative data, but for the analyses presented below,

such features cannot be used. Then, we removed 26 wafers that had missing data for any of the remaining

features. To prepare the data for analysis, we standardized all features to have a mean of zero and a variance

of one. The resulting data set consisted of 562 features describing the 151 time series during the production

of 2186 wafers. This data set was used for subsequent analyses. The analyzed data and code to reproduce

the results presented below may be provided to the interested reader upon reasonable request.
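
The filtering and standardization steps can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `preprocess`, the use of `None` for missing values, the `min_variance` cut-off for "(near-)zero variance", and the choice of the population standard deviation are all assumptions, since the text gives only the 15% threshold.

```python
from statistics import mean, pstdev, pvariance


def preprocess(rows, max_missing=0.15, min_variance=1e-8):
    """Filter and standardise a wafer-by-feature table.

    rows: list of dicts (one per wafer) mapping feature name -> float,
    with None marking a missing value. min_variance is an assumed
    cut-off for "(near-)zero variance"; the paper does not state one.
    """
    kept = []
    for name in rows[0]:
        present = [r[name] for r in rows if r[name] is not None]
        missing_share = 1 - len(present) / len(rows)
        if missing_share > max_missing:
            continue  # too many missing values
        if len(present) < 2 or pvariance(present) <= min_variance:
            continue  # (near-)zero variance
        kept.append(name)

    # Drop wafers that still miss a value on any remaining feature.
    complete = [r for r in rows if all(r[n] is not None for n in kept)]

    standardized = {}
    for name in kept:
        column = [r[name] for r in complete]
        mu, sd = mean(column), pstdev(column)
        if sd > 0:  # guard: a column may become constant after dropping wafers
            standardized[name] = [(v - mu) / sd for v in column]
    return standardized, len(complete)


# Toy table (hypothetical): "b" is constant, "c" has 30% missing values.
toy = [{"a": float(i), "b": 5.0, "c": None if i < 3 else float(i)}
       for i in range(10)]
toy[4]["a"] = None  # one wafer with a missing value on a kept feature
table, n_wafers = preprocess(toy)
```

In this toy table only feature `a` survives, and the one wafer that misses a value on `a` is dropped before standardization.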

3.2 Process PLS

Partial least squares (PLS), also known as Projection to Latent Structures (Wold 1982), is a family of

multivariate regression techniques used to evaluate the relationships between two or more blocks of data. PLS

models can handle a large number of correlated predictor variables. A high correlation between predictors

is called collinearity and is problematic for most traditional regression techniques because it leads to

rank-deﬁciencies in covariance matrices.

Standard PLS regression only models the relation between two blocks of data. This could be used to

predict the data of, say, Machine 4 from the data of Machine 3 (see Figure 1a). Merging multiple blocks

into one larger block makes it possible to predict, for example, data of Machine 5 from all the data observed

in the machines preceding it (i.e. Machines 1, 2, 3, and 4 as shown in Figure 1b), but this approach does

not model the contributions of individual machines. There are also multi-block PLS extensions that can

handle multiple predictor blocks (Figure 1c), where the relation between each pair of blocks is evaluated

separately. The reader is referred to other literature for a discussion on the standard approach for the

multiple-predictor models (Biancolillo and Næs 2019).

To study the relationships between the machines, information from one machine should be utilized to

predict data from another machine, while itself being predicted by other machines. In essence, any block

of data can simultaneously act as both a predictor and a target, with several targets existing within the

model. We employed a versatile path model algorithm called Process PLS (van Kollenburg et al. 2021), which provided information on the relations between the data from the various machines in the manufacturing

process.

Like other SEM models, a Process PLS model consists of an outer and inner model. The outer model

partitions the data into blocks. In the current context, this means that features are grouped according to

their respective machines. In Figure 2, the outer model speciﬁcation is represented by the blue components

of the model and the machines as red rectangles. The inner model, represented as the green arrows in

Figure 2, is used to specify which connections between machines to include. For the remainder of this paper, all following figures will only present the inner model.

Figure 1: Examples of PLS model specifications with a single target block. Model a) uses data from one machine to predict the data from another machine. Sub-figure b) illustrates forming a single predictor block. Sub-figure c) shows a multi-block specification. $J_m$ indicates the number of columns in the data of block $m$. Standard PLS regression can only be used to analyse models a) and b).

The Process PLS algorithm has two main steps to find optimal predictions of all target blocks. First, the outer model is estimated by constructing lower-dimensional latent variables from the features in each block, based on their relationships with features of target blocks. The proportion of variance in the features of block $m$ that is contained in the lower-dimensional representation of that block is indicated by $R^2_m$. In single-target models, these $R^2$ values are equivalent to the $R^2$ values obtained in standard PLS regression (see Figure 1a). While often called 'explained variance', $R^2$ in PLS is better interpreted as the proportion of variance extracted from the data to ensure optimal prediction. In the current application, the data consists of manually extracted features. It can be expected that several features will not be informative. If much data is redundant, interpreting absolute $R^2$ values, being a proportion of all variance in a block, should be avoided. To ensure interpretability, Process PLS has a second step.

In the second step of Process PLS, one PLS regression model is estimated for each target block. All blocks that predict a particular target block are used as predictors in the model. The primary result of this step is the explained variance $P^2$ (Rho-squared). For a given block $m$, the explained variance is calculated as the sum of all (partial) explained variances of every predictor block $n$, where $n$ represents the predictors of $m$. In other words,

$$\sum_n P^2_{m.n}$$

represents how much of the total variance in the lower-dimensional representation of the target block $m$ can be predicted by (latent variables of) other blocks. Conceptually, $P^2_{m.n}$, as a partial explained variance, can be compared to the square of a regression coefficient. For instance, a $P^2_{m.n}$ value of .5 implies a regression effect of block $n$ on $m$ equivalent to $\sqrt{.5} \approx .71$. While this analogy is strictly conceptual in case of multiple latent variables per block, it may offer readers an intuitive grasp of the meaning of the model results. Please refer to the foundational paper of Process PLS for details (van Kollenburg et al. 2021).
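
The arithmetic behind these quantities is simple enough to write out. The sketch below is illustrative only: the function names and the two partial values are hypothetical, and it covers just the summation and the square-root analogy, not the estimation of the values themselves.

```python
import math


def total_explained_variance(partials):
    """Sum the partial explained variances P^2_{m.n} over all predictor blocks n."""
    return sum(partials.values())


def regression_analogue(p2):
    """Square root of a P^2 value, the conceptual regression-effect analogue."""
    return math.sqrt(p2)


# Hypothetical partial explained variances for one target block m.
partials_m = {"n1": 0.30, "n2": 0.20}
p2_m = total_explained_variance(partials_m)  # 0.5 up to floating-point error
effect = regression_analogue(0.5)            # about 0.71
```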

Figure 2: Example of a Process PLS model. Data from each machine is used to predict the following machine and each machine is used to predict data from the last machine. The outer model is represented in blue, the machine representations are in red, and the green arrows represent the inner model.

Explaining a small yet interesting portion of the data could be more valuable than explaining a large portion of redundant data. As a result, $P^2$ values serve as more robust indicators of the strength of relationships between the blocks compared to $R^2$ values. Suppose the $R^2$ value of block $m$ is $R^2_m = .2$. Next to a limited relation with other data, factors such as collinearity or the presence of uninformative features might have caused the $R^2$ value to be low. By calculating the $P^2_m$ and $P^2_{m.n}$ values, which are a proportion of the $R^2_m$ value, we can obtain a more robust measure to determine if there is predictable data in block $m$.

3.3 Model Speciﬁcation

The inner model of a Process PLS model can be speciﬁed in multiple ways. One strategy is to ﬁrst model

the ﬂow of the process and then, with a different model, include relationships between blocks that one

wants to explore. Modelling the ﬂow of a wafer through the manufacturing process can be done with the

inner model speciﬁcation as shown in Figure 3a. Here, the model assumes that data from each machine

is only dependent on the machine before it and that relations between machines can be explained by this

ﬂow. From a substantive point of view, it may be unlikely that such a Markovian process is true, yet it

remains valuable for illustration purposes.

The Process PLS speciﬁcation just discussed is not identical to having multiple two-block models. The

optimization procedure of Process PLS optimizes block representations to be as predictive as possible. For

example, Machine 2 (M2) functions as a target in the relation between M1 and M2. In a two-block model

with only M1 and M2, M2 would be optimally predicted. But since M2 functions as a predictor of M3,

the lower-dimensional representation of M2 is optimized to be as predictive as possible for M3. We stress

that if the goal of one's research is to find optimal predictions of each block of data, path modelling is not the best option. Multiple dedicated prediction models will be better suited for that purpose.

Next to modelling the flow of the wafers, it is also of interest to model the relationship between machines that are further apart in the process. For the current application, the main interest was in predicting the data from the last machine from each of the machines preceding it. The model specification related to this is shown in Figure 3b. The multiple-predictor, multiple-target structure is illustrative of the added value of path modelling. That is, the many relations are estimated simultaneously and detailed information can be gathered about which parts of the process are most strongly related to other parts, conditional on the other relations. We also note that this method of model specification is especially interesting if the final block of data contains quality and/or yield measurements (which it does not in the current case). In this way, the relations between multiple machines and product quality can be studied simultaneously, which can benefit process control optimization.

Figure 3: The left inner model specification (a) represents the flow of a wafer through the manufacturing process. The right specification (b) includes relations of all machines with the final machine. $R^2_m$ values are shown within each rectangle. The $P^2_{m.n}$ values shown near the arrows indicate how much of the extracted information in the target block can be predicted from the predictor block.
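
The two inner model specifications of Figure 3 can be written down as lists of arrows. The sketch below is an illustration only: representing an arrow as a (predictor, target) tuple and the function names are assumptions and do not reflect the interface of any Process PLS software.

```python
MACHINES = [f"M{i}" for i in range(1, 8)]  # M1 .. M7 in process order


def flow_model(machines):
    """Each machine predicts only the next one (as in Figure 3a)."""
    return [(machines[i], machines[i + 1]) for i in range(len(machines) - 1)]


def flow_plus_final_model(machines):
    """Flow model plus an arrow from every earlier machine to the last (Figure 3b)."""
    edges = flow_model(machines)
    last = machines[-1]
    for machine in machines[:-2]:  # the arrow M6 -> M7 is already part of the flow
        edges.append((machine, last))
    return edges


flow_edges = flow_model(MACHINES)
final_edges = flow_plus_final_model(MACHINES)
```

Under this representation, a block acts as a target whenever it appears as the second element of a tuple and as a predictor whenever it appears as the first, so Machine 2, for example, plays both roles in either specification.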

Various other model topologies beyond those presented in this paper are possible, each of which may offer unique insights into the process. We instead present a data-driven approach to identify a simplified model that highlights only the strongest relationships. This can be done by specifying an inner model that estimates all effects between machines (shown in Figure 4), and removing the inner model connection with the lowest $P^2$ value. This results in a model with one fewer connection. We repeated this process iteratively until all remaining $P^2$ values exceeded .10. This value was subjectively chosen and is comparable to a regression coefficient of about .32 ($\sqrt{.10} \approx .32$). If, in any iteration, a machine had no outgoing or incoming arrows, we removed the data of that block from the data set.
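
The elimination loop itself can be sketched as below. This is a hedged illustration, not the authors' implementation: `simplify`, `fully_connected` and the `estimate_p2` callable are hypothetical names, and `frozen_estimator` is a stand-in that looks up the rounded values of Table 3 instead of re-estimating a Process PLS model after each removal, which the actual procedure does.

```python
def fully_connected(machines):
    """Every machine predicts all machines that follow it (cf. Figure 4)."""
    return [(a, b) for i, a in enumerate(machines) for b in machines[i + 1:]]


def simplify(edges, estimate_p2, threshold=0.10):
    """Backward elimination of inner-model connections.

    estimate_p2 is a hypothetical callable: given the current list of
    (predictor, target) edges it returns a dict edge -> P^2 value,
    standing in for a full Process PLS fit.
    """
    edges = list(edges)
    removed = 0
    while edges:
        p2 = estimate_p2(edges)
        weakest = min(edges, key=lambda edge: p2[edge])
        if p2[weakest] >= threshold:
            break  # all remaining effects reach the threshold
        edges.remove(weakest)
        removed += 1
    blocks = {m for edge in edges for m in edge}  # blocks without arrows drop out
    return edges, blocks, removed


# Rounded effects from Table 3, keyed as (predictor, target).
TABLE_3 = {
    ("M1", "M2"): .84,
    ("M1", "M3"): .03, ("M2", "M3"): .04,
    ("M1", "M4"): .03, ("M2", "M4"): .03, ("M3", "M4"): .00,
    ("M1", "M5"): .04, ("M2", "M5"): .04, ("M3", "M5"): .00,
    ("M4", "M5"): .01,
    ("M1", "M6"): .08, ("M2", "M6"): .08, ("M3", "M6"): .00,
    ("M4", "M6"): .00, ("M5", "M6"): .36,
    ("M1", "M7"): .09, ("M2", "M7"): .12, ("M3", "M7"): .01,
    ("M4", "M7"): .01, ("M5", "M7"): .24, ("M6", "M7"): .18,
}


def frozen_estimator(edges):
    """Stand-in: returns fixed Table 3 values instead of refitting the model."""
    return {edge: TABLE_3[edge] for edge in edges}


machines = [f"M{i}" for i in range(1, 8)]
kept_edges, kept_blocks, n_removed = simplify(fully_connected(machines),
                                              frozen_estimator)
```

Because the stand-in never re-estimates the effects, this toy run only demonstrates the control flow and should not be read as reproducing the simplified model reported in Section 4.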

4 RESULTS

The first model (Figure 3a) indicated a strong relation between data from Machine 1 and Machine 2. Of the 40% variance extracted from the data related to Machine 2 (i.e., $R^2_2 = .40$), 95% could be predicted from features from Machine 1 (i.e., $P^2_{2.1} = .95$). From a substantive point of view, this supports prior expectations expressed by process experts and was a first confirmation that the algorithm performed as intended. The model also showed a strong relation between (features from) Machines 5 and 6 ($P^2_{6.5} = .49$) and between Machines 6 and 7 ($P^2_{7.6} = .53$). There was virtually no relation between the features from the other machines. Since the absolute values of $R^2$ are not meaningful in the current context (see last paragraph of Section 3.2), these values are only shown in the figures, but will not be discussed. The discussions will only focus on the regression effects $P^2$.

In the second model (Figure 3b), most effects representing the flow were highly similar to those found in the first model. The notable exception is that the effect of Machine 6 on Machine 7, $P^2_{7.6}$, dropped from .53 in the first model to .18 in the second model. This decrease in $P^2_{7.6}$ means that part of the data that could be explained from Machine 6 can also be explained by data from other machines. The total proportion of variance of Machine 7 that could be explained in the second model, calculated as $P^2_7 = \sum_{n=1}^{6} P^2_{7.n} = .64$ (i.e., the sum of all effects on Machine 7), is higher than in the first model, where $P^2_7$ was .53. In the second model, more data is being used to predict Machine 7, leading not only to a higher explained variance, but also showing details on the contributions of each machine to the predictions of Machine 7.

Figure 4: Inner model with each machine predicting all following machines. This model was used to iteratively find a simplified model in which all $P^2$ values are at least .10. All inner model effects that exceeded .10 in this model specification are shown in the graph. All effects are provided in Table 3.

The data-driven approach to ﬁnd a simpliﬁed model started with the fully connected model presented

in Figure 4. The inner model effects are provided in Table 3. Due to rounding, some effects are presented

in the table as being equal (e.g., multiple entries being .00). The procedure of estimating the model and

removing the connection with smallest effect and then re-estimating the new model took 16 iterations before all effects, $P^2_{m.n}$, exceeded .10. The final simplified model is shown in Figure 5.

Table 3: Inner model coefficients ($P^2_{m.n}$) from the fully connected model.

              Predictors (n)
Target (m)    M1     M2     M3     M4     M5     M6
M2            .84
M3            .03    .04
M4            .03    .03    .00
M5            .04    .04    .00    .01
M6            .08    .08    .00    .00    .36
M7            .09    .12    .01    .01    .24    .18

The effects in the simplified model provide interesting details on the interrelations between the machines. The strong relation between Machines 1 and 2 is still represented well in this model, as is the relation between Machines 5 and 6. Model simplification hardly affected the ability to predict the data of Machine 7 ($P^2_7 = .63$). The effects of Machines 1, 2, 5 and 6 on Machine 7 were quite similar, indicating that each machine has a distinct relation with different parts of the data in Machine 7. This is crucial for studying possible solutions for process control.

Detailed investigation of the results showed that M1_V7_max, M2_V76_avg, M5_V125_min, M6_V139_avg and M7_V151_min were the most important features in their respective machines in terms of their contribution to the variance of the lower-dimensional representations. Details are provided in Table 4, with some description of what these features represented. To give an example, M2_V76_avg is a feature (the average) extracted from the time series Variable 76. Variable 76 reports the readings of a temperature sensor during the time a wafer is in Machine 2. The average value of this sensor contributed 21.1 times more to the Process PLS model than the other features from Machine 2 contributed on average.

Figure 5: Simplified inner model found through iteratively removing inner model connections from a fully specified model (Figure 4) until all $P^2$ values were at least .10. As no relations with Machines 3 and 4 reached the threshold, the respective data blocks were removed from the model.

Table 4: Anonymized names of the most important features of each block in terms of contribution to the (simplified) Process PLS model. Feature contributions are an indication of how much variance a feature contributes to the variance of the lower-dimensional representations. The relative contribution is calculated as the contribution of this feature divided by the average contribution of all features in its block.

Feature Name    Relative Contribution    Description
M1_V7_max       8.2                      The maximum value of a voltage
M2_V76_avg      21.1                     Average temperature
M5_V125_min     7.4                      Minimum of a certain flow
M6_V139_avg     6.1                      Average pressure
M7_V151_min     20.6                     Minimum of a certain speed
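
The relative contribution defined in the caption of Table 4 is a simple ratio. The sketch below illustrates it with hypothetical raw contributions; the function name and the numbers are assumptions, and how the raw contributions are obtained from the fitted model is not covered here.

```python
from statistics import mean


def relative_contributions(contributions):
    """Divide each feature's contribution by the block's average contribution."""
    block_average = mean(list(contributions.values()))
    return {name: value / block_average
            for name, value in contributions.items()}


# Hypothetical raw contributions for a block with three features.
block = {"f1": 4.0, "f2": 1.0, "f3": 1.0}
relative = relative_contributions(block)  # f1 contributes twice the block average
```

By construction the relative contributions within a block average to one, so a value such as 21.1 marks a feature that dominates its block.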

It is possible to evaluate which features are most important for predicting certain blocks of data.

For example, the single strongest predictor for the data in Machine 2 was feature M1_V36_std, which

indicates the variability in temperature on Machine 1 during manufacturing. The minimum of a certain

ﬂow, M5_V125_min, was the single most important predictor of the data observed in Machine 6. For

Machine 7, the variability of a different flow, represented by feature M5_V126_std, was the strongest

predictor. Without providing too many details, it may be clear that this type of information is extremely

valuable for predictive process control and obtaining valuable insights for root-cause analysis.

5 DISCUSSION

This paper showcases how path modeling can serve as a potent instrument to study relationships between

process data across production lines. This approach helped us reveal correlations between feature sets,

leading to a deeper understanding of the intricate interactions in the production process. Path models

are interpretable, and their simple topology and the ease with which models can be specified create an

intuitive representation of the process. By simplifying the models, we were able to identify the essential

relationships.

The ﬁndings obtained from our analyses can be applied towards creating soft-sensors or virtual metrology.

It is imperative to identify the crucial connections between features in order to determine which input

variables should be used. Subsequently, these features can be utilized to forecast the quality of end-products

(van Kollenburg et al. 2022), leading to production optimization, cost reduction, and improvement in overall

product quality.

Our research showed the importance of addressing collinearity in process data in complex systems like

production lines. We illustrated the methodology using a basic feature selection method, which likely missed

key relations between the time series. Directly integrating time series into a path modeling framework

would better capture complex relations and result in more accurate models. Until this becomes possible,

a priori ﬁltering is needed.

6 CONCLUSION

The multitude of possible relations between time series features makes multivariate cross-correlation

infeasible for our purposes. Path modelling allowed us to use domain-speciﬁc insights to enhance the

effectiveness of our models and better understand the relations in the data. Although our current model

has shown some success in addressing our research questions, there is a need for a better-ﬁtting model that

can account for complex interactions between variables not anticipated by domain experts. A promising

direction for future research is the extension of path modeling to tensor-based regression models (Liu

et al. 2021), which may overcome the limitations of manual feature selection by capturing higher-order

relationships in the data.

For future research, we suggest developing neural-network-based path modeling combined with tensor

data representations from manufacturing processes. Such methods could improve model performance by

learning complex relationships between different types of data. Incorporating expert knowledge into

the model may become even more crucial to ensure interpretability. Neural network models, capable of

adapting to changes and discovering hidden data patterns, could offer more robust solutions to manufacturing

challenges.

ACKNOWLEDGMENTS

This work was supported in part by the ECSEL Joint Undertaking under grant agreement No. 826589. The authors express their gratitude to Paola Giuffre, Caterina Genua, and Daniele Li Rosi for their consultation on the manufacturing process and data presented in this paper. OpenAI’s ChatGPT (GPT-4) was used to proofread the presented text, checking for spelling and grammar. In no way were algorithms used to create original content or to generate ideas.

REFERENCES

Arteaga, F., and A. Ferrer. 2002. “Dealing With Missing Data in MSPC: Several Methods, Different Interpretations, Some Examples”. Journal of Chemometrics: A Journal of the Chemometrics Society 16(8-10):408–418.

Biancolillo, A., and T. Næs. 2019. “The Sequential and Orthogonalized PLS Regression for Multiblock Regression: Theory, Examples, and Extensions”. In Data Handling in Science and Technology, Volume 31, 157–177. Elsevier.

Biegel, T., N. Jourdan, C. Hernandez, A. Cviko, and J. Metternich. 2022. “Deep Learning for Multivariate Statistical In-Process

Control in Discrete Manufacturing: A Case Study in a Sheet Metal Forming Process”. Procedia CIRP 107:422–427.

Breque, M., L. De Nul, and A. Petridis. 2021. “Industry 5.0: Towards a Sustainable, Human-Centric and Resilient European

Industry”. Luxembourg, LU: European Commission, Directorate-General for Research and Innovation.

Cagliano, R., F. Canterino, A. Longoni, and E. Bartezzaghi. 2019. “The Interplay between Smart Manufacturing Technologies and Work Organization: The Role of Technological Complexity”. International Journal of Operations & Production Management 39(6/7/8):913–934.

Cifone, F. D., K. Hoberg, M. Holweg, and A. P. Staudacher. 2021. “‘Lean 4.0’: How Can Digital Technologies Support Lean

Practices?”. International Journal of Production Economics 241:108258.

Dupret, Y., E. Perrin, J. Grolier, and R. Kielbasa. 2005. “Comparison of Three Different Methods to Model the Semiconductor Manufacturing Yield”. In IEEE/SEMI Conference and Workshop on Advanced Semiconductor Manufacturing 2005, 118–123. IEEE.

Fei, G. C., A. M. Rasli, and S. S. Xuan. 2015. “Modeling the Heterogeneity of Corporate Governance Mechanisms Across

Industries: A Multi-Group Analysis Using PLS Path Modeling”. Journal of Contemporary Issues and Thought 5:50–65.

Gu, Z., and K. Van Deun. 2019. “RegularizedSCA: Regularized Simultaneous Component Analysis of Multiblock Data in R”. Behavior Research Methods 51:2268–2289.

Hair Jr, J. F., G. T. M. Hult, C. M. Ringle, and M. Sarstedt. 2021. A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM). Sage Publications.

Hamilton, J. D. 2020. Time Series Analysis. Princeton, New Jersey: Princeton University Press.

Hochreiter, S., and J. Schmidhuber. 1997. “Long Short-term Memory”. Neural Computation 9(8):1735–1780.

Köksal, G., I. Batmaz, and M. C. Testik. 2011. “A Review of Data Mining Applications for Quality Improvement in Manufacturing

Industry”. Expert Systems with Applications 38(10):13448–13467.

Liu, J., C. Zhu, Z. Long, Y. Liu et al. 2021. “Tensor Regression”. Foundations and Trends® in Machine Learning 14(4):379–565.

Meindl, B., N. F. Ayala, J. Mendonça, and A. G. Frank. 2021. “The Four Smarts of Industry 4.0: Evolution of Ten Years of

Research and Future Perspectives”. Technological Forecasting and Social Change 168:120784.

Melhem, M., B. Ananou, M. Djeziri, M. Ouladsine, and J. Pinaton. 2015. “Prediction of the Wafer Quality with Respect to

the Production Equipments Data”. IFAC-PapersOnLine 48(21):78–84.

Moldovan, D., T. Cioara, I. Anghel, and I. Salomie. 2017. “Machine Learning for Sensor-Based Manufacturing Processes”. In 2017 13th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), 147–154. IEEE.

Offermans, T., L. Hendriks, G. H. van Kollenburg, E. Szymańska, L. M. Buydens, and J. J. Jansen. 2021. “Improved Understanding of Industrial Process Relationships through Conditional Path Modelling with Process PLS”. Frontiers in Analytical Science 1:721657.

Offermans, T., E. Szymańska, G. H. van Kollenburg, L. M. Buydens, and J. J. Jansen. 2021. “Automatically Optimizing Dynamic Synchronization of Individual Industrial Process Variables for Statistical Modelling”. Computers & Chemical Engineering 152:107402.

Sanchez-Marquez, R., and J. M. J. Vivas. 2020. “Multivariate SPC Methods for Controlling Manufacturing Processes Using Predictive Models – A Case Study in the Automotive Sector”. Computers in Industry 123:103307.

Sarstedt, M., J. F. Hair, M. Pick, B. D. Liengaard, L. Radomir, and C. M. Ringle. 2022. “Progress in Partial Least Squares

Structural Equation Modeling Use in Marketing Research in the Last Decade”. Psychology & Marketing 39(5):1035–1064.

Senoner, J., T. Netland, and S. Feuerriegel. 2022. “Using Explainable Artificial Intelligence to Improve Process Quality: Evidence from Semiconductor Manufacturing”. Management Science 68(8):5704–5723.

Spanos, C. 1992. “Statistical Process Control in Semiconductor Manufacturing”. Proceedings of the IEEE 80(6):819–830.

Timings, J. 2021. “6 Crucial Steps in Semiconductor Manufacturing”. https://www.asml.com/en/news/stories/2021/semiconductor-manufacturing-process-steps, accessed 24 March 2023.

Tortorella, G. L., R. Giglio, and D. H. Van Dun. 2019. “Industry 4.0 Adoption as a Moderator of the Impact of Lean Production Practices on Operational Performance Improvement”. International Journal of Operations & Production Management 39(6/7/8):860–886.

Trardi, Y., B. Ananou, P. Tchatchoua, and M. Ouladsine. 2022. “Ensemble Machine Learning Algorithms for Anomaly Detection in Multivariate Time-Series”. In 2022 International Conference on Control, Automation and Diagnosis (ICCAD), 1–6. IEEE.

Tseng, M.-L., W.-W. Wu, Y.-H. Lin, and C.-H. Liao. 2008. “An Exploration of Relationships between Environmental Practice and Manufacturing Performance Using the PLS Path Modeling”. WSEAS Transactions on Environment and Development 4(6):487–502.

van Kollenburg, G., R. Bouman, T. Offermans, J. Gerretzen, L. Buydens, H.-J. van Manen, and J. Jansen. 2021. “Process PLS: Incorporating Substantive Knowledge into the Predictive Modelling of Multiblock, Multistep, Multidimensional and Multicollinear Process Data”. Computers & Chemical Engineering 154:107466.

van Kollenburg, G., M. Holenderski, and N. Meratnia. 2022. “Value Proposition of Predictive Discarding in Semiconductor

Manufacturing”. Production Planning & Control:1–10.

van Kollenburg, G. H., J. van Es, J. Gerretzen, H. Lanters, R. Bouman, W. Koelewijn, A. N. Davies, L. M. Buydens, H.-J. van Manen, and J. J. Jansen. 2020. “Understanding Chemical Production Processes by Using PLS Path Model Parameters as Soft Sensors”. Computers & Chemical Engineering 139:106841.

Vinodh, S., and D. Joy. 2012. “Structural Equation Modeling of Sustainable Manufacturing Practices”. Clean Technologies and

Environmental Policy 14(1):79–84.

Vinzi, V. E., L. Trinchera, and S. Amato. 2010. “PLS Path Modeling: From Foundations to Recent Developments and Open Issues

for Model Assessment and Improvement”. In Handbook of Partial Least Squares: Concepts, Methods and Applications,

edited by V. Esposito Vinzi, W. W. Chin, J. Henseler, and H. Wang, 47–82. Berlin, Heidelberg: Springer Berlin Heidelberg.

Withanage, C., T. Park, T. T. H. Duc, and H.-J. Choi. 2012. “Dynamic Partial Least Square Path Modeling for the Front-end Product Design and Development”. Journal of Mechanical Design 134(10):100907.

Wold, H. O. 1982. “Soft Modelling: The Basic Design and Some Extensions”. In Systems under Indirect Observation, Part II, 36–37. North Holland.

AUTHOR BIOGRAPHIES

GEERT VAN KOLLENBURG is Assistant Professor at Eindhoven University of Technology in the Netherlands. He has worked with national and international market leaders to develop data-driven solutions for sustainable chemical and semiconductor industries. His expertise spans statistical process control, structural equation modeling, chemometrics, and more. His dedication to sustainability and expertise in statistics and machine learning have led to innovative approaches for predictive and prescriptive analytics. His email address is g.h.v.kollenburg@tue.nl and his website is https://research.tue.nl/nl/persons/geert-van-kollenburg.

RICHARD VERHOEVEN is a University Researcher and IT Developer at the Department of Mathematics and Computer Science at the Eindhoven University of Technology in the Netherlands. He has worked together with national and international partners on the topics of component-based systems, wireless sensor networks, internet of things, and hardware-in-the-loop simulations. His email address is p.h.f.m.verhoeven@tue.nl and his website is https://research.tue.nl/nl/persons/richard-verhoeven.

DANIELE PAGANO is Funding Project Manager at STMicroelectronics s.r.l. He has held various positions and responsibilities in Catania Wafer Fab Operations (Lithography, Dry Etching, APC & SPC, Epitaxy, Process Control). He has experience in collaborative projects such as IMPROVE (2012), INTEGRATE (2015), and MADEin4 (2022), and currently in HiCONNECTS, SATURN, and IPCEI. He is author and co-author of several publications in journals and at international conferences. His email address is daniele.pagano@st.com.

MIKE HOLENDERSKI is an assistant professor at the Department of Mathematics and Computer Science at the Eindhoven University of Technology in the Netherlands. He obtained his PhD in Computer Science on the topic of multi-resource management in embedded real-time systems. His current research focuses on reliable and trustworthy machine learning for process optimization and control, and on hybrid data/knowledge-driven modeling of high-dimensional data. His email address is m.holenderski@tue.nl and his homepage is https://www.tue.nl/en/research/researchers/mike-holenderski.

NIRVANA MERATNIA is Full Professor of Pervasive Computing at Eindhoven University of Technology in the Netherlands.

Her research covers various aspects of computing including distributed machine learning and AI, embedded/edge AI, data-driven

networking and smart sensor systems. She has been involved in several national and international projects addressing (distributed)

computation and intelligence in the context of Internet of Things and Cyber Physical Systems creating societal and economic

impacts. Her email address is n.meratnia@tue.nl.
