Sensor signals acquired during the manufacturing process contain rich information that can be used to facilitate effective monitoring of operational quality, early detection of system anomalies, and quick diagnosis of fault root causes. This paper develops a method for effective monitoring and diagnosis of multisensor heterogeneous profile data based on multilinear discriminant analysis. The proposed method operates directly on the multistream profiles and extracts uncorrelated discriminative features through tensor-to-vector projection, thus preserving the interrelationships among different sensors. The extracted features are then fed into classifiers to detect faulty operations and recognize fault types. The developed method is demonstrated with both simulated and real data from ultrasonic metal welding.

## Introduction

The wide applications of low-cost and smart sensing devices along with fast and advanced computer systems have resulted in a data-rich environment, which makes a large amount of data available in many applications. Sensor signals acquired during the process contain rich information that can be used to facilitate effective monitoring of operational quality, early detection of system anomalies, and quick diagnosis of fault root causes. In discrete manufacturing and many other applications, the sensor measurements provided by online sensing and data capturing technology are time- or spatial-dependent functional data, also called profile data [1,2]. In this paper, we are particularly interested in cycle-based profile data in ultrasonic metal welding [3], which are collected from repetitive operational cycles of the discrete manufacturing process.

Ultrasonic welding is a solid-state bonding process that uses high frequency ultrasonic energy to generate oscillating shears between metal sheets clamped under pressure [4,5], as illustrated in Fig. 1. The advantages of ultrasonic welding in joining dissimilar and conductive materials have been well recognized [6]. As electric car sales accelerate and production scales up in recent years, ultrasonic welding has been increasingly adopted for joining lithium-ion batteries for electric vehicles [4]. Tensile tests are conducted to study the joint's fracture, load-extension relationship, and tensile strength [3,4,6,7]. However, since tensile test is destructive and can only be performed offline, it is important to develop in situ monitoring and evaluation to provide opportunities for a faster implementation of corrective actions.

Fig. 1

To facilitate in-process quality monitoring and fault diagnosis in ultrasonic welding, sensors were installed in the welding machine to collect in situ signals such as the ultrasonic power and the displacement between horn and anvil. Figure 2 shows two signals for four samples. Lee et al. found that there is correlation between online sensor signals and weld attributes [3]. Guo et al. developed an online monitoring algorithm to ensure weld quality and detect bad welds [8]. However, these studies only analyzed certain features of the sensor signals, such as the maximum power and maximum displacement, while the rich information hidden in the real-time signals was not extracted or explored. Moreover, existing studies are limited to either characterizing weld attributes or detecting bad welds; fault diagnosis for bad welds has not been investigated. Furthermore, multiple signals need to be modeled together since a single signal may not be informative enough for fault identification. Therefore, this paper aims to develop a new sensor fusion and fault diagnosis method to enable in situ nondestructive evaluation of ultrasonic metal welding.

Fig. 2

There is extensive research in the literature on the modeling and monitoring of cycle-based profile data, including both linear and nonlinear profiles. An overview of parametric and nonparametric approaches for profile data as well as application domains can be found in Kuljanic et al. [9]. In recent years, there has been strong industrial interest in multisignal applications, especially in cases where a single signal does not provide enough information for decision making. This leads to an increasing demand for multisensor fusion methods that analyze the multiple signals captured from different sensors for process monitoring and system diagnostics.

There have been many research efforts on multisensor data fusion in manufacturing operations, for example, chatter detection in milling [10], tool condition monitoring [11,12], engine fault diagnosis [13], etc. A large portion of the multisensor data fusion methods is based on extracting a single synthetic index from the monitoring signals, e.g., a peak value, a weighted summation of signals, etc. The main limitations of this approach include the loss of information in the feature extraction process, the loss of sensor-to-sensor correlations, and the problem-dependent nature of the synthesizing scheme. Although profile monitoring techniques have been demonstrated to be more effective than synthetic index-based methods in monitoring processes characterized by repeating patterns [9], only a few authors have studied profile monitoring approaches in the field of sensor fusion [14–16]. Recently, with the fast development of multilinear methods for face recognition, Paynabar et al. [17] proposed a multichannel profile monitoring and fault diagnosis method based on uncorrelated multilinear principal component analysis (UMPCA) [18], whereas Grasso et al. [19] investigated the problem of multistream profile monitoring using multilinear PCA (MPCA) [20]. Multichannel profiles are homogeneous, in which all sensors measure the same variable, whereas multistream signals are heterogeneous, in which various sensors measure different variables.

In this study, we investigate the use of multilinear extensions of linear discriminant analysis (LDA) to deal with multistream signals for the purpose of process monitoring and fault diagnosis. LDA has been widely used as an effective tool for dimension reduction and discriminant analysis of complex data. Regular LDA is a linear algorithm that can only operate on vectors, thus cannot be directly applied to multistream profiles. To apply LDA to multistream profiles, these profiles need to be combined and reshaped (vectorized) into vectors first. Therefore, this method is referred to as vectorized LDA (VLDA). Applying LDA to this high-dimensional vector creates high computational complexity due to the dimension of scatter matrices. Moreover, vectorization breaks the natural structure and correlation in the original data, e.g., sensor-to-sensor correlation, and potentially loses more useful representations that can be obtained in the original form. Lu et al. [21] introduced an uncorrelated multilinear LDA (UMLDA) framework as an alternative to VLDA. UMLDA is a multilinear dimensionality reduction and feature extraction method that operates directly on the multidimensional objects, known as tensor objects, rather than their vectorized versions. The UMLDA extracts uncorrelated discriminative features directly from tensorial data through solving a tensor-to-vector projection (TVP). Although MPCA and UMPCA are also multilinear subspace feature extraction algorithms operating directly on the tensorial representations, similar to PCA, they are both unsupervised methods that do not make use of the class information. In manufacturing and many other applications, training samples from various classes can be easily collected in an efficient manner. In these applications, supervised multilinear methods like UMLDA take class information into consideration and thus may be more suitable for fault recognition. 
Although there is some exploratory research on the applications of UMLDA to image processing on face and gait recognition tasks [21], very little research could be found in the literature on using the UMLDA technique for analyzing multistream nonlinear profiles for the purpose of fault detection and diagnosis.

Therefore, the main objective of this paper is to develop a UMLDA-based approach for analyzing multistream in situ profiles in ultrasonic welding that considers the interrelationship of sensors. The features extracted by the UMLDA-based method can effectively discriminate different classes and provide fault diagnosis results. The effectiveness of the proposed method is tested on both simulations and a real-world case study in the ultrasonic metal welding process.

The remainder of this paper is organized as follows. Section 2 presents the method for analysis and dimension reduction of multistream profiles using UMLDA. VLDA is also reviewed in this section. Section 3 compares the proposed UMLDA-based method with VLDA and its variants, and other competitor methods including UMPCA-based and MPCA-based methods in the performance of extracting discriminative features and recognizing the type of faults. A case study of an ultrasonic metal welding process is given in Sec. 4. Finally, Sec. 5 concludes the paper with the discussion of broader impacts.

## Dimension Reduction of Multistream Signals: UMLDA and VLDA

Multiway data analysis is the extension of two-way methods to higher-order datasets. This section first reviews the basic notations and concepts in multilinear algebra and then introduces the implementation of UMLDA and VLDA for the purpose of dimensionality reduction in handling multistream signals. More details on the theoretical foundations of the mathematical development of UMLDA can be found in Refs. [22–24]. The algorithm we use in this paper for extracting uncorrelated features from tensor data is based on the theories presented in those articles.

### Basic Multilinear Algebra Concepts and Tensor-to-Vector Projection.

An $L$-way array $\mathcal{A}$ is an $L$th-order tensor object $\mathcal{A}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_L}$ such that $I_l$ represents the dimension of the $l$-mode, $l = 1, \ldots, L$, where the term mode refers to a generic set of entities [25]. The $l$-mode vectors of $\mathcal{A}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_L}$ are defined as the $I_l$-dimensional vectors obtained from $\mathcal{A}$ by varying the index $i_l$ ($i_l = 1, \ldots, I_l$) while keeping all the other indices fixed. In multilinear algebra, a matrix $\mathbf{A}$ can be considered to be a second-order tensor, with its column vectors and row vectors as the 1-mode and 2-mode vectors, respectively. The $l$-mode product of a tensor $\mathcal{A}$ by a matrix $\mathbf{U}\in\mathbb{R}^{J_l\times I_l}$, denoted by $\mathcal{A}\times_l\mathbf{U}$, is a tensor with entries $(\mathcal{A}\times_l\mathbf{U})(i_1,\ldots,i_{l-1},j_l,i_{l+1},\ldots,i_L)=\sum_{i_l}\mathcal{A}(i_1,\ldots,i_L)\cdot\mathbf{U}(j_l,i_l)$.
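To make the $l$-mode product concrete, here is a minimal numpy sketch; the function name `l_mode_product` is ours, and modes are 0-indexed in code while the text uses 1-indexed modes.

```python
import numpy as np

def l_mode_product(A, U, l):
    """Multiply tensor A by matrix U along mode l (0-indexed):
    contracts the I_l-dimensional mode of A with the rows of U."""
    # Move mode l to the front, flatten the remaining modes,
    # multiply, then restore the original mode order.
    A_unfolded = np.moveaxis(A, l, 0).reshape(A.shape[l], -1)
    result = U @ A_unfolded  # shape (J_l, product of remaining dims)
    new_shape = (U.shape[0],) + tuple(np.delete(A.shape, l))
    return np.moveaxis(result.reshape(new_shape), 0, l)

# A second-order tensor (matrix): the 0-mode product by a row of ones
# sums the columns.
A = np.arange(6.0).reshape(2, 3)
U = np.ones((1, 2))
print(l_mode_product(A, U, 0))   # -> [[3. 5. 7.]]
```

Multiplying by an identity matrix in any mode leaves the tensor unchanged, which is a quick sanity check of the contraction.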

To project tensorial data into a subspace for better discrimination, there are two general forms of multilinear projection: the tensor-to-tensor projection (TTP) and the tensor-to-vector projection (TVP). The TVP projects a tensor to a vector and can be viewed as multiple projections from a tensor to a scalar. A tensor $\mathcal{A}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_L}$ can be projected to a point $y$ through $L$ unit projection vectors $\{\mathbf{u}^{(1)T},\mathbf{u}^{(2)T},\ldots,\mathbf{u}^{(L)T}\}$ as $y=\mathcal{A}\times_1\mathbf{u}^{(1)T}\times_2\mathbf{u}^{(2)T}\times\cdots\times_L\mathbf{u}^{(L)T}$, $\mathbf{u}^{(l)}\in\mathbb{R}^{I_l\times 1}$, $\|\mathbf{u}^{(l)}\|=1$ for $l = 1,\ldots,L$, where $\|\cdot\|$ is the Euclidean norm for vectors. This projection $\{\mathbf{u}^{(1)T},\mathbf{u}^{(2)T},\ldots,\mathbf{u}^{(L)T}\}$ is called an elementary multilinear projection (EMP), which is the projection of a tensor on a single line (resulting in a scalar) and consists of one projection vector in each mode. The TVP of a tensor object $\mathcal{A}$ to a vector $\mathbf{y}\in\mathbb{R}^{P}$ in a $P$-dimensional vector space consists of $P$ EMPs, which can be written as $\{\mathbf{u}_p^{(1)T},\mathbf{u}_p^{(2)T},\ldots,\mathbf{u}_p^{(L)T}\}_{p=1,\ldots,P}=\{\mathbf{u}_p^{(l)T}, l=1,\ldots,L\}_{p=1}^{P}$. The TVP from $\mathcal{A}$ to $\mathbf{y}$ is then written as $\mathbf{y}=\mathcal{A}\times_{l=1}^{L}\{\mathbf{u}_p^{(l)T}, l=1,\ldots,L\}_{p=1}^{P}$, where the $p$th component of $\mathbf{y}$ is obtained from the $p$th EMP as $y(p)=\mathcal{A}\times_1\mathbf{u}_p^{(1)T}\times_2\mathbf{u}_p^{(2)T}\times\cdots\times_L\mathbf{u}_p^{(L)T}$.
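For the second-order (matrix) samples used in this paper, an EMP reduces to a bilinear form and a TVP simply stacks several EMP outputs. The following sketch illustrates this; the names and the random unit vectors are ours for illustration only, since the actual UMLDA projection vectors come from the optimization, not random draws.

```python
import numpy as np

def emp_project(A, u1, u2):
    """Project a second-order tensor sample A to a scalar via one EMP:
    y = A x_1 u1^T x_2 u2^T, i.e., a bilinear form u1^T A u2."""
    return u1 @ A @ u2

def tvp_project(A, emps):
    """Tensor-to-vector projection: stack the P EMP outputs."""
    return np.array([emp_project(A, u1, u2) for (u1, u2) in emps])

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 128))       # one sample: 4 streams x 128 points

# One unit projection vector per mode for each of P = 3 EMPs
emps = []
for _ in range(3):
    u1 = rng.standard_normal(4);   u1 /= np.linalg.norm(u1)
    u2 = rng.standard_normal(128); u2 /= np.linalg.norm(u2)
    emps.append((u1, u2))

y = tvp_project(A, emps)   # feature vector of length P = 3
```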

In the frame of multistream profile data, the simplest L-way array representing the signals is a third-order tensor object $A∈RI1×I2×M$ such that I1 is the number of sensors, I2 is the number of data points collected on each profile, and M is the number of multistream profiles or samples. Note that more articulated datasets may be generated by introducing additional modes, e.g., by adding a further mode to group together different families of sensors.
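As a quick illustration of this representation, multistream samples can be stacked into such a third-order tensor with numpy; the sizes below are illustrative, not those of the case study.

```python
import numpy as np

# Arrange M multistream samples into the third-order tensor of the text:
# I1 sensors x I2 data points per profile x M samples.
I1, I2, M = 2, 128, 50
rng = np.random.default_rng(0)
profiles = [rng.standard_normal((I1, I2)) for _ in range(M)]

# A[i1, i2, m] = reading of sensor i1 at point i2 in sample m
A = np.stack(profiles, axis=-1)
```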

### The UMLDA Approach.

Multilinear subspace feature extraction algorithms that operate directly on tensor objects without changing their tensorial structure are emerging. Since LDA is a classical algorithm that has been very successful and widely applied, several multilinear extensions of it, collectively named multilinear discriminant analysis (MLDA), have been proposed. Contrary to classical LDA, however, the projected tensors obtained from MLDA are correlated. To overcome this issue, Lu et al. [21] proposed UMLDA, in which a TVP is used for projection. In this subsection, we review the UMLDA method.

The derivation of the UMLDA algorithm follows the classic LDA derivation of minimizing the within-class distance and maximizing the between-class distance simultaneously, thus achieving maximum discrimination. A number of EMPs are solved one by one to maximize the discriminant criterion with an enforced zero-correlation constraint. To formulate the UMLDA problem, let $\{y_{m_p}, m=1,\ldots,M\}$ denote the $p$th projected scalar features, where $M$ is the number of training samples and $y_{m_p}$ is the projection of the $m$th sample $\mathcal{A}_m$ by the $p$th EMP $\{\mathbf{u}_p^{(1)T},\mathbf{u}_p^{(2)T}\}$:

$$y_{m_p}=\mathcal{A}_m\times_1\mathbf{u}_p^{(1)T}\times_2\mathbf{u}_p^{(2)T}$$

Adapting the classical Fisher discriminant criterion to a scalar sample, the between-class scatter $S_{B_p}^y$ and the within-class scatter $S_{W_p}^y$ are

$$S_{B_p}^y=\sum_{c=1}^{C}N_c\left(\bar y_{c_p}-\bar y_p\right)^2 \quad\text{and}\quad S_{W_p}^y=\sum_{m=1}^{M}\left(y_{m_p}-\bar y_{c_m,p}\right)^2 \tag{1}$$

where $C$ is the number of classes, $N_c$ is the number of samples for class $c$, $c_m$ is the class label for the $m$th training sample, $\bar y_p=(1/M)\sum_m y_{m_p}=0$ assuming the training samples are zero-mean, $\bar y_{c_p}=(1/N_c)\sum_{m,c_m=c} y_{m_p}$, and $\bar y_{c_m,p}$ is the mean of the class that $y_{m_p}$ belongs to. Let $\mathbf{g}_p$ denote the $p$th coordinate vector, with $g_p(m)=y_{m_p}$. The objective of UMLDA is to determine a set of $P$ EMPs that maximize the scatter ratio while producing uncorrelated features. The mathematical formulation of UMLDA can be written as
$$\{\mathbf{u}_p^{(1)T},\mathbf{u}_p^{(2)T}\}=\arg\max F_p^y=\arg\max S_{B_p}^y/S_{W_p}^y \tag{2}$$

subject to

$$\|\mathbf{u}_p^{(1)}\|=1,\quad \|\mathbf{u}_p^{(2)}\|=1,\quad \frac{\mathbf{g}_p^T\mathbf{g}_q}{\|\mathbf{g}_p\|\cdot\|\mathbf{g}_q\|}=\delta_{pq},\quad p,q=1,\ldots,P$$

where $\delta_{pq}=1$ for $p=q$ and $\delta_{pq}=0$ otherwise.
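The scalar scatters of Eq. (1) and the ratio being maximized in Eq. (2) can be sketched in a few lines of numpy; the helper name `scalar_scatters` and the toy feature values are ours.

```python
import numpy as np

def scalar_scatters(y, labels):
    """Between-class and within-class scatter of one scalar feature,
    following Eq. (1); the paper assumes y is centered (grand mean 0)."""
    classes = np.unique(labels)
    y_bar = y.mean()
    S_B = sum(np.sum(labels == c) * (y[labels == c].mean() - y_bar) ** 2
              for c in classes)
    S_W = sum(np.sum((y[labels == c] - y[labels == c].mean()) ** 2)
              for c in classes)
    return S_B, S_W

# Toy projected feature: two well-separated classes of three samples each
labels = np.array([0, 0, 0, 1, 1, 1])
y = np.array([-2.0, -1.0, -3.0, 2.0, 1.0, 3.0])

S_B, S_W = scalar_scatters(y, labels)
fisher_ratio = S_B / S_W   # the criterion F_p^y of Eq. (2)
```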
The solution to this problem is provided using the successive determination approach [21,26]. The $P$ EMPs $\{\mathbf{u}_p^{(1)T},\mathbf{u}_p^{(2)T}\}_{p=1}^{P}$ are determined sequentially in $P$ steps, with the $p$th step obtaining the $p$th EMP. The implementation of UMLDA given by Lu et al. [21] for the purpose of face recognition introduces a regularization parameter $\gamma$ (regularized UMLDA, R-UMLDA). To solve for $\mathbf{u}_p^{(l^*)}$ in the $l^*$-mode, assuming that $\{\mathbf{u}_p^{(l)}, l\neq l^*\}$ is given, the tensor samples are projected in these $(L-1)$ modes $\{l\neq l^*\}$ to obtain the vectors $\tilde{\mathbf{y}}_{m_p}^{(l^*)}=\mathcal{A}_m\times_{l=1,l\neq l^*}^{L}\{\mathbf{u}_p^{(l)T}, l=1,\ldots,l^*-1,l^*+1,\ldots,L\}$. The regularized within-class scatter matrix $\tilde{\mathbf{S}}_{W_p}^{(l^*)}$ is defined as

$$\tilde{\mathbf{S}}_{W_p}^{(l^*)}=\sum_{m=1}^{M}\left(\tilde{\mathbf{y}}_{m_p}^{(l^*)}-\bar{\tilde{\mathbf{y}}}_{c_m,p}^{(l^*)}\right)\left(\tilde{\mathbf{y}}_{m_p}^{(l^*)}-\bar{\tilde{\mathbf{y}}}_{c_m,p}^{(l^*)}\right)^T+\gamma\cdot\lambda_{\max}\left(\check{\mathbf{S}}_W^{(l^*)}\right)\cdot\mathbf{I}_{I_{l^*}} \tag{3}$$

where $\gamma\geq 0$ is a regularization parameter, $\mathbf{I}_{I_{l^*}}$ is an identity matrix of size $I_{l^*}\times I_{l^*}$, and $\lambda_{\max}(\check{\mathbf{S}}_W^{(l^*)})$ is the maximum eigenvalue of $\check{\mathbf{S}}_W^{(l^*)}$, the within-class scatter matrix for the $l^*$-mode vectors of the training samples.

The purpose of introducing the regularization parameter is to improve the UMLDA algorithm under small sample size scenarios, where the dimensionality of the input data is high, but the number of training samples for some classes is too small to represent the true characteristics of those classes. This is a common case in small-scale production like prototyping or personalized production. This scenario may also occur when a certain type of fault exists but is rare, so the data from that fault case are limited. If the number of training samples is too small, the iterations tend to shrink the within-class scatter toward zero in order to maximize the scatter ratio. Having a regularization term in the within-class scatter ensures that during the iteration, less emphasis is put on shrinking the within-class scatter. The basic UMLDA is obtained by setting γ = 0.
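The regularized within-class scatter of Eq. (3) can be sketched as follows; `regularized_within_scatter` is our name, and `lam_max` is passed in as a stand-in for the maximum-eigenvalue term $\lambda_{\max}(\check{\mathbf{S}}_W^{(l^*)})$, which would be computed from the mode's scatter matrix in a full implementation.

```python
import numpy as np

def regularized_within_scatter(Y, labels, gamma, lam_max):
    """Regularized within-class scatter matrix of Eq. (3) for the
    partially projected vectors Y (rows = samples). Setting gamma = 0
    recovers the plain within-class scatter (basic UMLDA)."""
    d = Y.shape[1]
    S_W = np.zeros((d, d))
    for c in np.unique(labels):
        Yc = Y[labels == c]
        D = Yc - Yc.mean(axis=0)     # deviations from the class mean
        S_W += D.T @ D
    return S_W + gamma * lam_max * np.eye(d)

rng = np.random.default_rng(1)
Y = rng.standard_normal((10, 4))
labels = np.repeat([0, 1], 5)

S_plain = regularized_within_scatter(Y, labels, 0.0, 1.0)     # gamma = 0
S_reg   = regularized_within_scatter(Y, labels, 0.001, 5.0)   # regularized
```

The regularization only inflates the diagonal, keeping the matrix well conditioned when few samples per class are available.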

Based on the observations in Ref. [21], the sensitivity of R-UMLDA to initialization and regularization suggests that it is not a very stable feature extractor, which makes it a good candidate for ensemble-based learning. Regularized UMLDA with aggregation (R-UMLDA-A) is hence introduced to aggregate several differently initialized and regularized UMLDA feature extractors to achieve better classification results. To focus on feature extraction performance, simple aggregation at the matching-score level using the nearest-neighbor distance is implemented in R-UMLDA-A. Let $A$ denote the number of R-UMLDA feature extractors to be aggregated. To classify a test sample $\mathcal{A}$, it is first projected to $A$ feature vectors $\{\mathbf{y}(a)\}_{a=1,\ldots,A}$ using the $A$ TVPs. Next, for the $a$th R-UMLDA feature extractor, the nearest-neighbor distance of the test sample $\mathcal{A}$ to each candidate class $c$ is

$$d(\mathcal{A},c,a)=\min_{m,c_m=c}\|\mathbf{y}(a)-\mathbf{y}_m(a)\| \tag{4}$$

$d(\mathcal{A},c,a)$ is then scaled to the interval $[0,1]$ as $\tilde d(\mathcal{A},c,a)=\left(d(\mathcal{A},c,a)-\min_c d(\mathcal{A},c,a)\right)/\left(\max_c d(\mathcal{A},c,a)-\min_c d(\mathcal{A},c,a)\right)$. The aggregated nearest-neighbor distance is obtained using the simple sum rule

$$d(\mathcal{A},c)=\sum_{a=1}^{A}\tilde d(\mathcal{A},c,a) \tag{5}$$

Therefore, the test sample $\mathcal{A}$ is assigned the label $c^*=\arg\min_c d(\mathcal{A},c)$.
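Equations (4) and (5) amount to the following aggregation rule; the function name and the toy one-dimensional features are illustrative only.

```python
import numpy as np

def aggregated_nn_classify(test_feats, train_feats, train_labels):
    """R-UMLDA-A style aggregation at the matching-score level:
    test_feats[a] / train_feats[a] hold the features produced by the
    a-th extractor (rows of train_feats[a] are training samples)."""
    classes = np.unique(train_labels)
    total = np.zeros(len(classes))
    for y_test, Y_train in zip(test_feats, train_feats):
        # Nearest-neighbor distance to each candidate class, Eq. (4)
        d = np.array([np.min(np.linalg.norm(
                Y_train[train_labels == c] - y_test, axis=1))
                for c in classes])
        # Scale to [0, 1] and accumulate with the sum rule, Eq. (5)
        total += (d - d.min()) / (d.max() - d.min())
    return classes[np.argmin(total)]

train_labels = np.array([0, 0, 1, 1])
Y = np.array([[0.0], [0.2], [5.0], [5.2]])   # one-dimensional features

# Aggregate two (here identical) extractors for a test sample near class 0
pred = aggregated_nn_classify([np.array([0.1]), np.array([0.3])],
                              [Y, Y], train_labels)
```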

### The VLDA Approach.

VLDA is a generalization of LDA to tensor data, which applies the regular LDA to a tensor object reshaped into a vector. In the frame of multistream profile data, the third-order tensor object $\mathcal{A}\in\mathbb{R}^{I_1\times I_2\times M}$ representing the signals is unfolded slice by slice; the slices are then rearranged into a large two-dimensional matrix $\mathbf{A}\in\mathbb{R}^{I_1 I_2\times M}$, where $I_1$ is the number of sensors, $I_2$ is the number of data points collected on each profile, and $M$ is the number of samples. The classical LDA is then performed on matrix $\mathbf{A}$. What we seek is a transformation matrix $\mathbf{W}$ that maximizes the ratio of the between-class scatter to the within-class scatter

$$\mathbf{W}=\arg\max J(\mathbf{W})=\arg\max\frac{|\mathbf{W}^T\mathbf{S}_B\mathbf{W}|}{|\mathbf{W}^T\mathbf{S}_W\mathbf{W}|} \tag{6}$$

$$\text{subject to } \|\mathbf{w}_i\|=1,\quad i=1,\ldots,c-1$$

where $\mathbf{S}_B$ and $\mathbf{S}_W$ are the between-class scatter and the within-class scatter, respectively, and $c$ is the number of classes. The transformed signal samples can be obtained by $\mathbf{y}=\mathbf{W}^T\mathbf{A}$. More details on the calculation of $\mathbf{S}_B$ and $\mathbf{S}_W$ using the Fisher linear discriminant can be found in Ref. [27].
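For the two-class case, the VLDA projection reduces to the classical Fisher direction on the vectorized samples. The numpy sketch below illustrates the idea; the small ridge term added for invertibility is our choice, not part of the original formulation.

```python
import numpy as np

def vlda_two_class(X0, X1):
    """Two-class Fisher direction on vectorized profiles: each row of
    X0 / X1 is one multistream sample unfolded into a long vector."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter of the vectorized samples
    S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Fisher direction w ∝ S_W^{-1}(m1 - m0); tiny ridge for stability
    w = np.linalg.solve(S_W + 1e-6 * np.eye(S_W.shape[0]), m1 - m0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(2)
# Two streams x 20 points per profile, vectorized to length 40
X0 = rng.standard_normal((30, 40))          # class 0
X1 = rng.standard_normal((30, 40)) + 1.0    # class 1, shifted mean
w = vlda_two_class(X0, X1)                  # unit projection direction
```

Projecting both classes onto `w` separates their means, which is what the scatter-ratio criterion of Eq. (6) optimizes.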

## Performance Comparison in Simulation

In this section, the performances of the UMLDA and VLDA methodologies are evaluated and compared by numerical studies via simulation. The purpose of using simulation is to generate profile data from statistical models to mimic the profiles under different out-of-control (OOC) scenarios. We do not intend to replace real data with simulated data, but rather we would like to test the proposed method's performance in a larger and more complex dataset before applying it to real data. Real data may be limited in the types of patterns they can show and in the sample size. Simulation study is common when one wishes to test the effectiveness of a method, to explore how the performance is affected by certain parameters, to compare different methods, or to generate data that are otherwise difficult to obtain.

The multistream signals in the simulation are generated in a similar manner as in Ref. [19]: a four-stream profile dataset is generated based on three benchmark signals proposed by Donoho and Johnstone [28]. The complex patterns in the benchmark signals make profile modeling with a parametric approach difficult. Figure 3 illustrates the three benchmark signals, "blocks," "heavisine," and "bumps," denoted as $x_1$, $x_2$, and $x_3$, respectively.

Fig. 3
Let $\mathcal{X}\in\mathbb{R}^{N\times K\times M}$ denote the third-order tensor object that represents the four-stream profile dataset, where $N = 4$ is the number of streams or sensors, $K = 128$ is the number of data points for all the signals, and $M$ is the number of samples. $\mathcal{X}$ is generated to contain different types of correlation structures: linear correlation (e.g., $\chi_{1,\cdot,m}$ and $x_1$, $\chi_{2,\cdot,m}$ and $x_3$, etc.), curvilinear correlation (e.g., $\chi_{2,\cdot,m}$ and $x_1$, $\chi_{3,\cdot,m}$ and $x_2$, etc.), and no correlation (e.g., $\chi_{3,\cdot,m}$ and $x_1$, $\chi_{4,\cdot,m}$ and $x_3$, etc.). $\mathcal{X}$ is defined as follows:

$$\begin{aligned}
\chi_{1,\cdot,m}&=b_{1,m}x_1+b_{2,m}x_2+\varepsilon_{1,m}\\
\chi_{2,\cdot,m}&=b_{3,m}x_1^2+b_{4,m}x_3+\varepsilon_{2,m}\\
\chi_{3,\cdot,m}&=b_{5,m}x_2^2+b_{6,m}x_3^2+\varepsilon_{3,m}\\
\chi_{4,\cdot,m}&=b_{7,m}x_1 x_2+\varepsilon_{4,m}
\end{aligned}\qquad(m=1,\ldots,M) \tag{7}$$

where $\varepsilon_{n,m}\sim N(0,0.5^2)$ is the random noise and $\mathbf{b}_m=[b_{1,m},\ldots,b_{7,m}]^T\sim MVN(\boldsymbol{\mu}_b,\boldsymbol{\Sigma}_b)$ is the model parameter vector, $n = 1, \ldots, 4$, $m = 1, \ldots, M$. Similar to the dataset used in Ref. [19], the following settings are used to generate the dataset: $\boldsymbol{\mu}_b=[0.2, 1, 1.5, 0.5, 1, 0.7, 0.8]^T$ and $\boldsymbol{\Sigma}_b=\mathrm{diag}(\sigma_{b_1}^2,\ldots,\sigma_{b_7}^2)=\mathrm{diag}(0.08, 0.015, 0.05, 0.01, 0.09, 0.03, 0.06)$. Figure 4 shows 100 in-control profile samples generated in this setting. As can be seen in Eq. (7), the four streams of signals are not independent, and the correlation structure is complex for profile modeling.
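A sketch of this data-generation model in numpy follows. Note that the three benchmark signals are replaced here by simple closed-form stand-ins rather than the exact Donoho-Johnstone definitions, and all function names are ours.

```python
import numpy as np

def benchmark_signals(K=128):
    """Stand-ins for the 'blocks', 'heavisine', and 'bumps' benchmarks;
    simple closed-form shapes, not the exact published signals."""
    t = np.linspace(0, 1, K)
    x1 = np.sign(np.sin(2 * np.pi * 3 * t))            # piecewise-constant
    x2 = 4 * np.sin(4 * np.pi * t) - np.sign(t - 0.3)  # heavisine-like
    x3 = np.exp(-100 * (t - 0.5) ** 2)                 # a single bump
    return x1, x2, x3

def generate_samples(M, K=128, seed=0):
    """Draw M in-control four-stream profiles following Eq. (7)."""
    rng = np.random.default_rng(seed)
    x1, x2, x3 = benchmark_signals(K)
    mu_b = np.array([0.2, 1, 1.5, 0.5, 1, 0.7, 0.8])
    sd_b = np.sqrt([0.08, 0.015, 0.05, 0.01, 0.09, 0.03, 0.06])
    X = np.empty((4, K, M))
    for m in range(M):
        b = rng.normal(mu_b, sd_b)            # diagonal Sigma_b
        eps = rng.normal(0, 0.5, size=(4, K)) # epsilon_{n,m}
        X[0, :, m] = b[0] * x1 + b[1] * x2 + eps[0]
        X[1, :, m] = b[2] * x1**2 + b[3] * x3 + eps[1]
        X[2, :, m] = b[4] * x2**2 + b[5] * x3**2 + eps[2]
        X[3, :, m] = b[6] * x1 * x2 + eps[3]
    return X
```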
Fig. 4

Out-of-control (OOC) scenarios are generated to simulate different kinds of deviations from the natural multistream pattern. Each OOC scenario is associated with an assignable cause. In the context of ultrasonic metal welding (and many other manufacturing processes as well), these assignable causes represent different faults, e.g., mislocated weld, sheet metal distortion, surface contamination, etc. In this paper, we assume that multiple faults do not occur simultaneously on one part, i.e., a single part has no more than one fault. The following OOC scenarios are considered:

Scenario (a): Mean shift of the reference signal

$$x_u \to x_u + \delta_a \mathbf{1}_{K\times 1}\quad(u=1,2,3) \tag{8}$$

where $\delta_a\in\{0.01,0.025,0.05,0.075,0.1\}\sigma_{x_u}$ is the magnitude of the shift, $\sigma_{x_u}$ is the standard deviation of the $x_u$ reference signal, $u = 1, 2, 3$, and $\mathbf{1}_{K\times 1}$ is a column vector of ones.
Scenario (b): Superimposition of a sinusoid term on the reference signal

$$x_u \to x_u + \delta_b y_s\quad(u=1,2,3) \tag{9}$$

where $\delta_b\in\{0.025,0.05,0.075,0.1,0.125\}\sigma_{x_u}$, and $y_s$ is the sine function over the domain $[0, K]$, with period $K$ and peak-to-peak amplitude equal to 1, $u = 1, 2, 3$.
Scenario (c): Standard deviation increase of the error term

$$\sigma_{\varepsilon_{n,m}} \to \delta_c\,\sigma_{\varepsilon_{n,m}}\quad(n=1,2,3,4) \tag{10}$$

where $\delta_c\in\{1.1,1.5,2,2.5,3\}$ and $\sigma_{\varepsilon_{n,m}}$ is the standard deviation of the error term $\varepsilon_{n,m}$.
Scenario (d): Mean shift of the model parameter

$$\mu_{b_w} \to \mu_{b_w} + \delta_d\quad(w=1,\ldots,7) \tag{11}$$

where $\delta_d\in\{1,2,3,4,5\}\sigma_{b_w}$, and $\mu_{b_w}$ and $\sigma_{b_w}$ are the mean value and standard deviation of the $w$th model parameter $b_w$, $w = 1, \ldots, 7$.
Scenario (e): Standard deviation increase of the model parameter

$$\sigma_{b_w} \to \delta_e\,\sigma_{b_w}\quad(w=1,\ldots,7) \tag{12}$$

where $\delta_e\in\{1.5,2,2.5,3,4\}$.
Scenario (f): Gradual mean shift of the reference signal

$$x_u \to x_u + \delta_f \mathbf{1}_{K\times 1}\quad(u=1,2,3) \tag{13}$$

where $\delta_f$ is the magnitude of the shift and $\mathbf{1}_{K\times 1}$ is a column vector of ones. This scenario is introduced to represent the effects of tool wear on profile data. As tool wear develops, the reference signal of the $(m+1)$th sample has a larger mean shift than that of the $m$th sample. Considering the severity of tool wear, let $\delta_{f_1}\in[0.01,0.05]\sigma_{x_u}$ represent the deviations caused by a lightly worn tool, $\delta_{f_2}\in(0.05,0.1]\sigma_{x_u}$ the deviations caused by a moderately worn tool, and $\delta_{f_3}\in(0.1,0.15]\sigma_{x_u}$ those caused by a severely worn tool, $u = 1, 2, 3$.
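Scenarios (a) and (b) can be sketched as small transforms of a reference signal; the function names are ours, and the shift is parameterized as a fraction of the signal's standard deviation, as in the text.

```python
import numpy as np

def apply_mean_shift(x, delta_frac):
    """Scenarios (a)/(f): constant shift delta = delta_frac * std(x)."""
    return x + delta_frac * np.std(x) * np.ones_like(x)

def apply_sinusoid(x, delta_frac):
    """Scenario (b): superimpose one full period of a sine with
    peak-to-peak amplitude 1 over the domain [0, K]."""
    K = len(x)
    ys = 0.5 * np.sin(2 * np.pi * np.arange(K) / K)  # peak-to-peak = 1
    return x + delta_frac * np.std(x) * ys
```

The same pattern extends to scenarios (c)-(e), which instead perturb the noise standard deviation or the parameters of $\mathbf{b}_m$ before the profiles are generated.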

### Methods in Comparison.

The general framework of profile monitoring and fault diagnosis using multistream signals is illustrated in Fig. 5. For multilinear methods like UMLDA, the multistream signals can be directly represented in a tensor object, and the tensor is then normalized so that the training samples have the same dimension and zero mean. For linear methods like VLDA, the multistream signals need to be vectorized to a matrix, followed by normalization. A feature extraction method, e.g., UMLDA or VLDA, then produces feature vectors that can be fed into standard classifiers. The output is a class label representing "normal" or some fault type.

Fig. 5

Performance comparison is conducted at two levels: (1) feature extraction performance and (2) classification performance. To compare feature extraction performance, we use the following four multilinear and three linear methods to extract features: R-UMLDA, R-UMLDA with aggregation (R-UMLDA-A), UMPCA, MPCA, VLDA, uncorrelated LDA (ULDA), and regularized LDA (RLDA). The feature vectors obtained are then fed into the nearest-neighbor classifier (NNC) with the Euclidean distance measure for classification.

In R-UMLDA, the regularization parameter γ is empirically set to 0.001. If we let Q denote the number of training samples per class, then intuitively, stronger regularization is more desirable for a smaller Q, and weaker regularization is recommended for a larger Q. Since the tensor object is $\mathcal{X}\in\mathbb{R}^{4\times 128\times M}$, one R-UMLDA extractor will produce up to four features. In R-UMLDA-A, up to A = 20 differently initialized and regularized UMLDA extractors are combined, each producing up to 4 features, resulting in a total of up to 80 features. The γ parameter ranges from $10^{-7}$ to $10^{-2}$.

UMPCA and MPCA are unsupervised multilinear methods that seek a set of projections to maximize the variability captured by the projected tensor. UMPCA will produce up to 4 features which are uncorrelated, while MPCA will produce as many as approximately 80 features which are correlated in order to capture at least 99% of the variation in each mode. Details on the theoretical development of UMPCA and MPCA can be found in Refs. [18,20].

In addition to VLDA, two more linear methods are included in comparison, ULDA and RLDA. ULDA and RLDA improve LDA on undersampled problems and small sample size problems, respectively. Each method will project to up to C − 1 features with C being the number of classes. Details on the theoretical development of ULDA and RLDA can be found in Refs. [29,30].

To further improve classification performance, we feed the features extracted by multiple R-UMLDA extractors into the random subspace method and compare its performance with that of R-UMLDA-A, which adopts the simple nearest-neighbor aggregation. Since classification is not the main focus of this work, we will not discuss ensemble learning methods in detail. Readers interested in the random subspace method and ensemble learning are referred to Refs. [31,32].

### Simulation Results.

This subsection discusses simulation results in three main cases A, B, and C.

#### Case A.

Case A focuses on identifying the faults in out-of-control scenarios (a)–(e). A total of 1200 profile samples are generated, with 200 samples in each class, as plotted in Fig. 6. The five OOC scenarios are specified as follows: (a) mean shift of the "blocks" reference signal: $x_1\to x_1+0.1\sigma_{x_1}\mathbf{1}_{K\times 1}$, resulting in $\tilde\chi_{1,\cdot,m}=b_{1,m}(x_1+0.1\sigma_{x_1}\mathbf{1}_{K\times 1})+b_{2,m}x_2+\varepsilon_{1,m}$, $\tilde\chi_{2,\cdot,m}=b_{3,m}(x_1+0.1\sigma_{x_1}\mathbf{1}_{K\times 1})^2+b_{4,m}x_3+\varepsilon_{2,m}$, and $\tilde\chi_{4,\cdot,m}=b_{7,m}(x_1+0.1\sigma_{x_1}\mathbf{1}_{K\times 1})x_2+\varepsilon_{4,m}$; (b) superimposition of a sinusoid term on the "blocks" signal: $x_1\to x_1+0.1\sigma_{x_1}y_s$, where $y_s$ is a sine function, resulting in $\tilde\chi_{1,\cdot,m}=b_{1,m}(x_1+0.1\sigma_{x_1}y_s)+b_{2,m}x_2+\varepsilon_{1,m}$, $\tilde\chi_{2,\cdot,m}=b_{3,m}(x_1+0.1\sigma_{x_1}y_s)^2+b_{4,m}x_3+\varepsilon_{2,m}$, and $\tilde\chi_{4,\cdot,m}=b_{7,m}(x_1+0.1\sigma_{x_1}y_s)x_2+\varepsilon_{4,m}$; (c) increase in the standard deviation of the error term $\varepsilon_1$: $\sigma_{\varepsilon_{1,m}}\to 3\sigma_{\varepsilon_{1,m}}$, leading to $\tilde\chi_{1,\cdot,m}=b_{1,m}x_1+b_{2,m}x_2+\tilde\varepsilon_{1,m}$, where $\tilde\varepsilon_{1,m}\sim N(0,(3\times 0.5)^2)$; (d) mean shift of the model parameter $b_1$: $\mu_{b_1}\to\mu_{b_1}+5\sigma_{b_1}$, yielding $\tilde\chi_{1,\cdot,m}=\tilde b_{1,m}x_1+b_{2,m}x_2+\varepsilon_{1,m}$, where $\tilde b_{1,m}\sim N(\mu_{b_1}+5\sigma_{b_1},\sigma_{b_1}^2)$; and (e) increase in the standard deviation of the model parameter $b_1$: $\sigma_{b_1}\to 4\sigma_{b_1}$, giving $\tilde\chi_{1,\cdot,m}=\tilde b_{1,m}x_1+b_{2,m}x_2+\varepsilon_{1,m}$, where $\tilde b_{1,m}\sim N(\mu_{b_1},(4\sigma_{b_1})^2)$.

Fig. 6

Of the five OOC scenarios above, all profiles in streams 1, 2, and 4 are affected in (a) and (b), while in (c), (d), and (e), only the profiles in stream 1 have out-of-control patterns. Since a large portion of the $\tilde\varepsilon_{1,m}$'s generated in fault (c) would overlap with the in-control $\varepsilon_{1,m}$'s, and the $\tilde b_{1,m}$'s generated by $\tilde b_{1,m}\sim N(\mu_{b_1},(4\sigma_{b_1})^2)$ in fault (e) would greatly overlap with the in-control $b_{1,m}$'s, faults (c) and (e) are very difficult to separate from the in-control class.

Half of these 1200 samples are used for training. Before UMLDA modeling, the generated data are normalized by subtracting the grand mean of all training samples from the original data. Using the procedures described in Secs. 2 and 3.1, regularized UMLDA is applied to the normalized data. In UMLDA, the eigentensors corresponding to the $p$th EMP, $\mathbf{U}_p\in\mathbb{R}^{4\times 128}$, $p = 1, 2, 3, 4$, are obtained by $\mathbf{u}_p^{(1)}\circ\mathbf{u}_p^{(2)}$, where $\mathbf{u}_p^{(1)}\in\mathbb{R}^{4\times 1}$ and $\mathbf{u}_p^{(2)}\in\mathbb{R}^{128\times 1}$. Figure 7 shows the $\mathbf{U}_p$ obtained from the training dataset in a single simulation run of case A. As can be seen from Fig. 7, the eigenvectors corresponding to the first EMP provide efficient discrimination based on streams 1 and 4, whereas those corresponding to the second EMP provide strong discrimination based on stream 2. The eigenvectors corresponding to the third and fourth EMPs provide weak discrimination based on stream 4, whereas limited useful information is extracted from stream 3 for discriminant analysis. These results are fully consistent with the data generation model, implying that R-UMLDA can effectively extract discriminative information from multistream profiles.

Fig. 7

Using the first $p$ EMPs ($p$ = 1, 2, 3, 4), multistream profiles can be projected to $p$ uncorrelated features, which are then fed into the NNC. The classification performance on the test dataset is shown in Fig. 8 and Table 1. Figure 8 plots the following detailed results against the number of features used: the correct classification rate, $\sum_{m=1}^{M_{test}}I(\hat c_m=c_m)/M_{test}$, where $\hat c_m$ is the predicted class for sample $m$, $c_m$ is the true class, and $M_{test}$ is the number of test samples; the correct passing rate, $\sum_{m=1}^{M_{test}}I(\hat c_m=0\,|\,c_m=0)/M_{test}$, where "0" indicates the "normal" class; the correct detection rate, $\sum_{m=1}^{M_{test}}I(\hat c_m>0\,|\,c_m>0)/M_{test}$, where $c>0$ indicates a fault class; the true fault classification rate, $\sum_{m=1}^{M_{test}}I(\hat c_m=c_m\,|\,c_m>0)/M_{test}$; and the rate of true detection but wrong fault classification, $\sum_{m=1}^{M_{test}}I(\hat c_m\neq c_m\,|\,\hat c_m>0, c_m>0)/M_{test}$. As can be seen in Fig. 8, the first two features extracted by R-UMLDA are the most powerful features for classification. Adding the third and fourth features improves the correct classification rate slightly.
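The five rates can be computed directly from the predicted and true labels; this numpy sketch (names ours) follows the definitions above, with label 0 for "normal" and labels greater than 0 for fault classes.

```python
import numpy as np

def monitoring_rates(c_hat, c_true):
    """Rates plotted in Fig. 8; all normalized by the number of
    test samples, matching the definitions in the text."""
    M = len(c_true)
    return {
        "correct_classification": np.mean(c_hat == c_true),
        "correct_passing":   np.sum((c_hat == 0) & (c_true == 0)) / M,
        "correct_detection": np.sum((c_hat > 0) & (c_true > 0)) / M,
        "true_fault_classification":
            np.sum((c_hat == c_true) & (c_true > 0)) / M,
        "detected_but_misclassified":
            np.sum((c_hat != c_true) & (c_hat > 0) & (c_true > 0)) / M,
    }

# Toy labels: 0 = normal, 1 and 2 = two fault types
c_true = np.array([0, 0, 1, 1, 2, 2])
c_hat  = np.array([0, 1, 1, 2, 2, 0])
rates = monitoring_rates(c_hat, c_true)
```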

Fig. 8
Table 1

Confusion matrix of NNC for R-UMLDA features in case A test dataset

One feature (accuracy = 35%):

| Actual \ Classified as | Normal | Fault (a) | Fault (b) | Fault (c) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 23 | 23 | 23 | 25 | 1 | 5 |
| Fault (a) | 18 | 25 | 25 | 26 | 0 | 6 |
| Fault (b) | 18 | 27 | 26 | 24 | 0 | 5 |
| Fault (c) | 17 | 29 | 21 | 23 | 1 | 9 |
| Fault (d) | 1 | 0 | 0 | 1 | 77 | 21 |
| Fault (e) | 8 | 12 | 12 | 19 | 15 | 34 |

Two features (accuracy = 63%):

| Actual \ Classified as | Normal | Fault (a) | Fault (b) | Fault (c) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 42 | 0 | 5 | 41 | 1 | 11 |
| Fault (a) | 0 | 98 | 1 | 0 | 0 | 1 |
| Fault (b) | 4 | 1 | 74 | 17 | 0 | 4 |
| Fault (c) | 38 | 0 | 12 | 41 | 0 | 9 |
| Fault (d) | 0 | 0 | 0 | 1 | 81 | 18 |
| Fault (e) | 17 | 0 | 2 | 28 | 13 | 40 |

Three features (accuracy = 66%):

| Actual \ Classified as | Normal | Fault (a) | Fault (b) | Fault (c) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 38 | 0 | 1 | 46 | 1 | 14 |
| Fault (a) | 0 | 98 | 1 | 0 | 0 | 1 |
| Fault (b) | 0 | 1 | 98 | 0 | 0 | 1 |
| Fault (c) | 38 | 0 | 2 | 44 | 1 | 15 |
| Fault (d) | 0 | 0 | 0 | 2 | 79 | 19 |
| Fault (e) | 16 | 0 | 0 | 29 | 17 | 38 |

Four features (accuracy = 67%):

| Actual \ Classified as | Normal | Fault (a) | Fault (b) | Fault (c) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 45 | 0 | 0 | 40 | 0 | 15 |
| Fault (a) | 0 | 98 | 1 | 0 | 0 | 1 |
| Fault (b) | 0 | 0 | 99 | 1 | 0 | 0 |
| Fault (c) | 49 | 0 | 1 | 33 | 1 | 16 |
| Fault (d) | 0 | 0 | 0 | 2 | 80 | 18 |
| Fault (e) | 16 | 0 | 0 | 25 | 15 | 44 |

More detailed classification results with respect to the number of features fed into the classifier are shown in the confusion matrices in Table 1. From Table 1, we observe a clear improvement in classification accuracy when two or more EMPs are used instead of only the first one. We also notice that when two or more features are used, most of the classification errors come from separating the in-control class, fault (c), and fault (e) from one another. This observation is consistent with the data generation model, from which we expected faults (c) and (e) to be the most difficult classes to separate from the in-control class.
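The confusion matrices in Table 1 can be tabulated directly from the same label vectors. A minimal sketch (our own helper, not the paper's code):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Build a confusion matrix: rows are actual classes, columns are
    predicted ("classified as") classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

Dividing the trace of the matrix by the total sample count recovers the overall correct classification rate reported above each matrix in Table 1.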

Applying the competitor methods described in Sec. 3.1, Fig. 9(a) shows the classification performance of NNC for various feature extraction methods on the case A test dataset. The plotted results are the average correct classification rates over 100 simulation runs. In Fig. 9(a), the curves with triangle markers correspond to the classification performance of UMPCA and MPCA features. These results are significantly worse than those of the LDA-based methods, regardless of the number of features used. This agrees with our understanding of PCA-based feature extractors, which do not use class information and seek projections that maximize captured variability rather than class discrimination.

Fig. 9

The curves with cross, star, and asterisk markers in Fig. 9(a) correspond to the vectorized LDA methods (LDA, ULDA, and RLDA), whereas the curves with square and circle markers correspond to the UMLDA methods. It can be seen from Fig. 9(a) that the first two features extracted by R-UMLDA are the most powerful features for classification. Beyond the first two features, the performance of R-UMLDA improves only slowly as more features are used. The first three features extracted by the vectorized LDA methods are also powerful, but their improvement over the first two R-UMLDA features is not significant.

Figure 9(a) shows that the best correct classification rate is achieved by R-UMLDA-A, which outperforms all other algorithms. This demonstrates that aggregation is an effective procedure and that there is indeed complementary discriminative information in the differently regularized R-UMLDA feature extractors.
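Such score-level aggregation can be illustrated as follows. This is a simplified sketch under our own assumptions (per-class nearest-neighbor distances, normalized per extractor and summed; the exact normalization used in R-UMLDA-A may differ), with hypothetical function names:

```python
import numpy as np

def aggregate_nn_scores(dist_list, train_labels, classes):
    """Aggregate several feature extractors at the matching-score level.

    dist_list: one (n_test, n_train) distance matrix per extractor,
    computed in that extractor's feature space. For each test sample and
    class, the nearest-neighbor distance is normalized within the
    extractor, summed across extractors, and the smallest aggregate
    score determines the predicted class.
    """
    n_test = dist_list[0].shape[0]
    agg = np.zeros((n_test, len(classes)))
    for d in dist_list:
        for j, c in enumerate(classes):
            nn = d[:, train_labels == c].min(axis=1)  # NN distance to class c
            agg[:, j] += nn / (nn.max() + 1e-12)      # normalize, then sum
    return np.array(classes)[agg.argmin(axis=1)]
```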

#### Case B.

Case B focuses on identifying the faults in OOC scenario (f), which mimics the deviations caused by tool wear. We generate a total of 800 profile samples with 200 samples in each of the following four classes: in-control and three OOC (f) scenarios, in which three magnitudes of gradual mean shift are added to the "block" signal to reflect machine tools with light, medium, and severe wear. Half of these samples are used for training. Table 2 presents the confusion matrix of the nearest-neighbor classifier for R-UMLDA (with γ = 0.001) features on the case B test dataset. As more features are fed into the classifier, the classification accuracy improves significantly. We also observe that classification errors occur in only three situations: distinguishing between the normal class and (f − 1) light tool wear, between (f − 1) light wear and (f − 2) medium wear, and between (f − 2) medium wear and (f − 3) severe wear.
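A gradual, tool-wear-like mean shift of this kind can be sketched as follows. This is an illustrative stand-in, not the paper's actual data generation model: we assume a simplified "block" signal, a linear drift over the cycle, and arbitrary shift magnitudes and noise level.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
t = np.linspace(0.0, 1.0, n)
block = np.where((t > 0.3) & (t < 0.7), 1.0, 0.0)  # simplified "block" signal

def worn_profile(delta):
    """Profile with a gradual mean shift reaching magnitude delta by the
    end of the cycle (an illustrative stand-in for OOC scenario (f))."""
    drift = delta * t  # shift grows linearly over the cycle
    return block + drift + 0.05 * rng.standard_normal(n)

# three wear severities via three (hypothetical) shift magnitudes
light, medium, severe = (worn_profile(d) for d in (0.2, 0.5, 1.0))
```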

Table 2

Confusion matrix of NNC for R-UMLDA features in case B test dataset (rows: actual class; columns: classified as)

**One feature (accuracy = 69%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) |
| --- | --- | --- | --- | --- |
| Normal | 67 | 33 | 0 | 0 |
| Fault (f − 1) | 1 | 64 | 35 | 0 |
| Fault (f − 2) | 0 | 0 | 44 | 56 |
| Fault (f − 3) | 0 | 0 | 0 | 100 |

**Two features (accuracy = 71%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) |
| --- | --- | --- | --- | --- |
| Normal | 68 | 32 | 0 | 0 |
| Fault (f − 1) | 3 | 64 | 33 | 0 |
| Fault (f − 2) | 0 | 0 | 52 | 48 |
| Fault (f − 3) | 0 | 0 | 0 | 100 |

**Three features (accuracy = 80%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) |
| --- | --- | --- | --- | --- |
| Normal | 68 | 32 | 0 | 0 |
| Fault (f − 1) | 3 | 74 | 23 | 0 |
| Fault (f − 2) | 0 | 0 | 76 | 24 |
| Fault (f − 3) | 0 | 0 | 0 | 100 |

**Four features (accuracy = 87%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) |
| --- | --- | --- | --- | --- |
| Normal | 73 | 27 | 0 | 0 |
| Fault (f − 1) | 0 | 91 | 9 | 0 |
| Fault (f − 2) | 0 | 0 | 84 | 16 |
| Fault (f − 3) | 0 | 0 | 0 | 100 |

Figure 9(b) shows the classification performance, in terms of average correct classification rate over 100 simulation runs, of NNC for various feature extraction methods on the case B test dataset. Similar to case A, the features extracted by UMPCA and MPCA are the weakest features in classification. Although the first few (1–2) features extracted by VLDA, ULDA, and RLDA are the most discriminative, using three or more R-UMLDA features leads to notably better results. Figure 9(b) also shows the significant improvement introduced by aggregation. Overall, R-UMLDA and R-UMLDA-A outperform all other algorithms.

#### Case C.

Case C investigates the in-control and five OOC scenarios: (d) mean shift of the model parameter b1, (e) standard deviation increase of b1, and the three (f) OOC scenarios described in case B. A total of 1200 profile samples with 200 samples in each class are generated, half of which are used for training. Table 3 presents the confusion matrix of NNC for R-UMLDA (with γ = 0.001) features on the case C test dataset. As more features are fed into the classifier, the classification accuracy improves significantly. From Table 3, we also observe that almost all classification errors occur in the following four situations: distinguishing between the normal class and fault (f − 1), distinguishing between (f − 1) and (f − 2), distinguishing between (f − 2) and (f − 3), and separating fault (e) from normal. Fault (e) is very difficult to separate from the in-control class because the $\tilde{b}_{1,m}$'s generated in fault (e) greatly overlap with the in-control $b_{1,m}$'s.

Table 3

Confusion matrix of NNC for R-UMLDA features in case C test dataset (rows: actual class; columns: classified as)

**One feature (accuracy = 42%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 29 | 30 | 13 | 5 | 4 | 19 |
| Fault (f − 1) | 18 | 30 | 27 | 8 | 1 | 16 |
| Fault (f − 2) | 7 | 13 | 43 | 22 | 0 | 15 |
| Fault (f − 3) | 3 | 6 | 16 | 49 | 0 | 26 |
| Fault (d) | 3 | 4 | 0 | 0 | 83 | 10 |
| Fault (e) | 11 | 12 | 15 | 21 | 22 | 19 |

**Two features (accuracy = 56%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 47 | 28 | 2 | 0 | 2 | 21 |
| Fault (f − 1) | 9 | 31 | 46 | 9 | 0 | 5 |
| Fault (f − 2) | 1 | 5 | 46 | 45 | 0 | 3 |
| Fault (f − 3) | 0 | 0 | 12 | 88 | 0 | 0 |
| Fault (d) | 1 | 1 | 0 | 0 | 81 | 17 |
| Fault (e) | 21 | 19 | 3 | 0 | 17 | 40 |

**Three features (accuracy = 65%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 51 | 21 | 0 | 0 | 1 | 27 |
| Fault (f − 1) | 3 | 61 | 35 | 0 | 0 | 1 |
| Fault (f − 2) | 0 | 0 | 66 | 34 | 0 | 0 |
| Fault (f − 3) | 0 | 0 | 0 | 100 | 0 | 0 |
| Fault (d) | 1 | 2 | 0 | 0 | 73 | 24 |
| Fault (e) | 26 | 16 | 1 | 0 | 18 | 39 |

**Four features (accuracy = 75%)**

| Actual | Normal | Fault (f − 1) | Fault (f − 2) | Fault (f − 3) | Fault (d) | Fault (e) |
| --- | --- | --- | --- | --- | --- | --- |
| Normal | 52 | 20 | 0 | 0 | 3 | 25 |
| Fault (f − 1) | 0 | 88 | 12 | 0 | 0 | 0 |
| Fault (f − 2) | 0 | 0 | 90 | 10 | 0 | 0 |
| Fault (f − 3) | 0 | 0 | 0 | 100 | 0 | 0 |
| Fault (d) | 3 | 0 | 0 | 0 | 78 | 19 |
| Fault (e) | 29 | 11 | 0 | 0 | 19 | 41 |

Figure 9(c) shows the classification performance, in terms of average correct classification rate over 100 simulation runs, of NNC for various feature extraction methods on the case C test dataset. Similar to cases A and B, the features extracted by UMPCA and MPCA are not as powerful as the other features in classification. Although the first few (1–2) features extracted by VLDA, ULDA, and RLDA are the most discriminative, using three or more R-UMLDA features leads to notably better results. Figure 9(c) also shows that aggregation effectively enhances the results and that R-UMLDA and R-UMLDA-A outperform all other algorithms.

Under the framework of case C, we further investigate how the number of training samples in each class affects the feature extraction results. We consider a variant of case C, denoted C′, in which 20 profile samples are generated in each of the 6 classes. Figure 9(d) shows the correct classification rate of NNC for various feature extraction methods on the case C′ test dataset. Comparing Fig. 9(c) with Fig. 9(d), we notice that although the correct classification rates in Fig. 9(d) are slightly worse than those in Fig. 9(c) due to the smaller sample size, the classification performance does not vary significantly with the number of samples. In both cases, the best result is always achieved by R-UMLDA-A. If the number of selected features is limited to 3 or 4, then the first 3–4 features extracted by R-UMLDA are always the most powerful in classification. The same conclusion holds when the sample size is further reduced to 10 per class. On the other hand, an analysis of variance comparing these four simulation experiments yields a p-value less than 0.01, confirming that the four cases are indeed different. Therefore, the simulation results demonstrate that R-UMLDA-A achieves the best overall performance in all the simulation experiments and is a robust and effective feature extraction and dimension reduction algorithm for multistream profiles.
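The one-way ANOVA behind that p-value reduces to an F statistic over the per-run correct classification rates of the four experiments. A minimal NumPy sketch of the F statistic (the paper's actual rate values are not reproduced here; the p-value follows from comparing F against the F distribution with k − 1 and n − k degrees of freedom):

```python
import numpy as np

def one_way_anova_F(groups):
    """F statistic for a one-way ANOVA over several groups of rates."""
    all_x = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    grand = all_x.mean()
    k, n = len(groups), len(all_x)
    # between-group and within-group sums of squares
    ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(((np.asarray(g, dtype=float) - np.mean(g)) ** 2).sum()
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```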

#### Improving Classification Via Ensemble Learning.

This subsection explores the possibility of further improving classification performance in fault diagnosis via ensemble learning. In R-UMLDA-A, 20 differently initialized and regularized UMLDA feature extractors are aggregated at the matching-score level using the nearest-neighbor distance. Although R-UMLDA-A achieves the best results in the previous simulation studies, more advanced ensemble learning algorithms such as boosting, bagging, and the random subspace method may achieve better results. Investigating alternative combination methods, however, is not the main focus of this paper. Therefore, we only show the classification performance using the random subspace method and leave in-depth studies in this direction to future work.

The random subspace method is an ensemble classifier that consists of several classifiers, each operating in a subspace of the original feature space, and outputs the class based on the outputs of these individual classifiers. The k-nearest-neighbor classifiers are used here as the individual classifiers. As an example, we consider the dataset from a single simulation run of case A as described in Sec. 3.2.1. Using the same 20 R-UMLDA feature extractors as in R-UMLDA-A, we plot the classification results in Fig. 10. The curves with circle or cross markers correspond to random subspace classification with different numbers of nearest neighbors (k). Comparing these results to R-UMLDA-A, plotted with square markers, we see that the random subspace ensemble significantly increases the classification accuracy, given a proper choice of k. With k = 20 to 25, the random subspace ensemble achieves a relatively high correct classification rate using only 15 features, whereas R-UMLDA-A needs at least 20 features to achieve similar performance. This also indicates promising opportunities for using UMLDA for feature extraction and dimension reduction in handling multistream signals.
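The random subspace idea can be sketched compactly. This is a minimal illustration with our own helper names, using 1-NN base classifiers for brevity (the paper uses k-NN) and arbitrary ensemble parameters:

```python
import numpy as np

def nn_predict(Xtr, ytr, Xte):
    # 1-nearest-neighbor prediction via Euclidean distance
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return ytr[d.argmin(axis=1)]

def random_subspace_predict(Xtr, ytr, Xte, n_estimators=10, subspace=0.5,
                            seed=0):
    """Random subspace ensemble: each base classifier sees only a random
    subset of the features; predictions are combined by majority vote."""
    rng = np.random.default_rng(seed)
    p = Xtr.shape[1]
    m = max(1, int(subspace * p))
    votes = []
    for _ in range(n_estimators):
        idx = rng.choice(p, size=m, replace=False)  # random feature subset
        votes.append(nn_predict(Xtr[:, idx], ytr, Xte[:, idx]))
    votes = np.stack(votes)                         # (n_estimators, n_test)
    # majority vote across the ensemble
    return np.array([np.bincount(col).argmax() for col in votes.T])
```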

Fig. 10

## Case Study in Multilayer Ultrasonic Metal Welding

In this case study, welding experiments joining three layers of copper with one layer of nickel-plated copper are investigated. The clamping pressure is 34 psi, and the vibration amplitude is 40 µm. Four sensors are used to collect in situ signals: the power meter records the controller power signal, the force sensor measures the clamping force, the linear variable differential transformer (LVDT) sensor measures the displacement between horn and anvil, and the microphone captures the sound of vibration. Table 4 summarizes the sensors and signals [33,34]. Note that all sensors use the same sampling rate in this case study; if sensors had different sampling rates, extra data preprocessing steps would be needed.

Table 4

Applied sensors, signal types, and purposes (adapted from Ref. [34])

| Sensor | Signal type | Purpose |
| --- | --- | --- |
| Watt (power) meter | Ultrasonic power output at a piezoceramic module | Monitor controller power input signal |
| Force sensor | Clamping force output at the piezoceramic module | Measure clamping force at the ultrasonic transducer |
| LVDT sensor | Displacement between horn and anvil | Measure indentation and sheet thickness variation during welding |
| Microphone | Sound wave form | Detect cracking and slipping during welding |

Figure 11(a) shows the welded tabs from the normal welding process and three faulty processes: (1) surface contamination, (2) abnormal thickness, and (3) mislocated/edge weld. Figure 11(b) shows the signals associated with these welds from the four sensors. In general, the normal welding process produces good welds with strong connections, while the faulty processes tend to create poor-quality connections that may adversely affect the performance of the battery pack. If samples are contaminated, for example with oil, there is less friction between the metal layers, causing insufficient vibration at the beginning of the weld; therefore, the power signal does not rise as fast as in a normal weld. Once the oil is removed by vibration, the power signal picks up. Abnormal welding thickness may be caused by material handling errors, sheet metal distortion, or operation errors. The displacement signal clearly shows how the displacement between horn and anvil is affected by thicker layers. A mislocated/edge weld may be caused by operation or alignment errors. With an edge weld, all clamping force is applied to a smaller weld region, resulting in more displacement between horn and anvil toward the end of the weld. It can be seen from Fig. 11 that, on the one hand, each signal contains richer information about product quality and process condition than any single point can provide, and on the other hand, a single stream of signals is not informative enough to recognize the type of fault.

Fig. 11

Sample data are organized in the tensor object $\mathcal{A} \in \mathbb{R}^{4 \times 700 \times 17}$, which includes 4 sensors, 700 data points in each profile, and 17 samples. Samples are divided into training and test sets. Both R-UMLDA and VLDA methods are trained using eight normal samples, two samples with fault 1 (oily surface), one sample with fault 2 (abnormal thickness), and one sample with fault 3 (edge weld).

Using one R-UMLDA feature extractor with γ = 0.001, the eigentensors corresponding to the four EMPs are shown in Fig. 12. Recall that the eigentensor corresponding to the pth EMP is obtained by $U_p = u_p^{(1)} \circ u_p^{(2)}$, where $U_p \in \mathbb{R}^{4 \times 700}$, $u_p^{(1)} \in \mathbb{R}^{4 \times 1}$, $u_p^{(2)} \in \mathbb{R}^{700 \times 1}$, and p = 1, 2, 3, 4. It can be seen from this figure that the eigentensor corresponding to the first EMP shows an efficient discrimination and strong negative correlation in streams 2 and 3. The eigentensor corresponding to the second EMP shows a strong discrimination in stream 1, whereas those corresponding to the third and fourth EMPs deliver similar information on discrimination in stream 4.
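Numerically, forming an eigentensor and projecting a sample with it is a pair of one-line operations. A minimal sketch with randomly generated stand-ins for the learned EMP vectors (the real $u_p^{(1)}, u_p^{(2)}$ come from the trained R-UMLDA extractor):

```python
import numpy as np

rng = np.random.default_rng(1)

# stand-ins for one EMP of a 4-sensor x 700-point profile sample
u1 = rng.standard_normal(4)    # mode-1 projection vector (sensor mode)
u2 = rng.standard_normal(700)  # mode-2 projection vector (time mode)

# eigentensor U_p = u^(1) outer u^(2), shape (4, 700)
U = np.outer(u1, u2)

A = rng.standard_normal((4, 700))     # one multistream profile sample
feature = np.tensordot(A, U, axes=2)  # scalar feature <A, U_p>
# equivalently, the tensor-to-vector projection u1^T A u2:
assert np.isclose(feature, u1 @ A @ u2)
```

Stacking this scalar over p = 1, …, P EMPs yields the P-dimensional uncorrelated feature vector fed to the classifier.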

Fig. 12

After training UMLDA and VLDA, the feature extractors and NNC are applied to five testing samples: two from the normal process, two from fault 1, and one from fault 2. Figure 13 plots the correct classification rate of NNC for UMLDA and VLDA on these testing samples. It can be seen that R-UMLDA-A easily achieves 100% correct classification using only four features, while R-UMLDA achieves 80%. The vectorized LDA methods, however, do not perform as well as UMLDA: the features extracted by RLDA achieve the same level of classification accuracy as R-UMLDA, whereas LDA and ULDA extract much weaker features. The results indicate that UMLDA-based methods, especially R-UMLDA-A, outperform the VLDA methods (LDA, ULDA, and RLDA) in detecting abnormal processes and diagnosing faults.

Fig. 13

## Conclusion

In this paper, we proposed a UMLDA-based method for effective analysis of multisensor heterogeneous profile data. With various sensors measuring different variables, information from each sensor, sensor-to-sensor correlation, and class-to-class correlation should all be considered. A simulation study was conducted to evaluate the performance of the proposed method and its superiority over VLDA and other competitor methods. The results showed that the features extracted by VLDA and the competitor methods are not as powerful as those extracted by UMLDA for profile discrimination and classification. The possibility of further improving classification performance in fault diagnosis using ensemble learning with UMLDA was also explored. We then applied both UMLDA and VLDA to a multilayer ultrasonic metal welding process for process characterization and fault diagnosis. The results indicate that UMLDA outperforms VLDA not only in detecting faulty operations but also in classifying the type of faults.

Since the proposed method employs tensor notation, all samples need to have the same number of data points so that the measurement data can be organized in a tensor format. In a real manufacturing environment, time variability of signals is common, and signal preprocessing is needed to handle it. For example, inconsistent sampling rates can be addressed by downsampling or interpolation, and inconsistent time durations of important patterns can be addressed by cropping to a specific segment of the profile or by taking the longest duration. Research on multimodal and multiresolution data fusion is a booming direction, but it is beyond the scope of this work.
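The interpolation step mentioned above can be sketched as follows. This is a minimal illustration (our own helper, assuming linear interpolation over a normalized time axis), which maps every profile to a common length so the samples fit a fixed-size tensor:

```python
import numpy as np

def resample_profile(x, n_target):
    """Resample a 1-D profile to a common length by linear interpolation,
    so all samples can be stacked into one tensor."""
    x = np.asarray(x, dtype=float)
    old_t = np.linspace(0.0, 1.0, len(x))
    new_t = np.linspace(0.0, 1.0, n_target)
    return np.interp(new_t, old_t, x)
```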

In the future, several remaining issues in this framework will be studied in more depth, such as integrating ensemble learning with R-UMLDA, developing self-updating mechanisms for when unseen faults are observed, and adding probability outputs to classification. More comprehensive case studies will be performed as more samples are collected from welding experiments. Developing tensor-based methods for monitoring manufacturing processes with vision technology is another interesting topic for future research, as is extending the developed method to online process monitoring and online learning.

## Funding Data

• General Motors Collaborative Research Lab in Advanced Vehicle Manufacturing at The University of Michigan.

## References

1. Woodall, W. H., Spitzner, D. J., Montgomery, D. C., and Gupta, S., 2004, "Using Control Charts to Monitor Process and Product Quality Profiles," J. Qual. Technol., 36(3), pp. 309–320.
2. Woodall, W. H., 2007, "Current Research on Profile Monitoring," Produção, 17(3), pp. 420–425.
3. Lee, S. S., Shao, C., Kim, T. H., Hu, S. J., Kannatey-Asibu, E., Cai, W. W., Spicer, J. P., Wang, H., and Abell, J. A., 2014, "Characterization of Ultrasonic Metal Welding by Correlating Online Sensor Signals With Weld Attributes," ASME J. Manuf. Sci. Eng., 136(5), p. 051019.
4. Lee, S. S., Kim, T. H., Hu, S. J., Cai, W. W., and Abell, J. A., 2010, "Joining Technologies for Automotive Lithium-Ion Battery Manufacturing—A Review," Proceedings of the ASME 2010 International Manufacturing Science and Engineering Conference, Erie, PA, Oct. 12–15, pp. 541–549.
5. Kalpakjian, S., and Schmid, S. R., 2008, Manufacturing Processes for Engineering Materials, Pearson Education.
6. Kim, T. H., Yum, J., Hu, S. J., Spicer, J. P., and Abell, J. A., 2011, "Process Robustness of Single Lap Ultrasonic Welding of Thin Dissimilar Materials," CIRP Ann. Manuf. Technol., 60(1), pp. 17–20.
7. Shao, C., Kim, T. H., Hu, S. J., Jin, J., Abell, J. A., and Spicer, J. P., 2015, "Tool Wear Monitoring for Ultrasonic Metal Welding of Lithium-Ion Batteries," ASME J. Manuf. Sci. Eng., 138(5), p. 051005.
8. Guo, W., Shao, C., Kim, T. H., Hu, S. J., Jin, J., Spicer, J. P., and Wang, H., 2016, "Online Process Monitoring With Near-Zero Misdetection for Ultrasonic Welding of Lithium-Ion Batteries: An Integration of Univariate and Multivariate Methods," J. Manuf. Syst., 38, pp. 141–150.
9. Kuljanic, E., Totis, G., and Sortino, M., 2009, "Development of an Intelligent Multisensor Chatter Detection System in Milling," Mech. Syst. Signal Process., 23(5), pp. 1704–1718.
10. Cho, S., Binsaeid, S., and Asfour, S., 2010, "Design of Multisensor Fusion-Based Tool Condition Monitoring System in End Milling," Int. J. Adv. Manuf. Technol., 46, pp. 681–694.
11. Noorossana, R., Saghaei, A., and Amiri, A., 2012, Statistical Analysis of Profile Monitoring, Wiley, New York.
12. Grasso, M., Albertelli, P., and Colosimo, B. M., 2013, "An Adaptive SPC Approach for Multi-Sensor Fusion and Monitoring of Time-Varying Processes," Proc. CIRP, 12, pp. 61–66.
13. Basir, O., and Yuan, X., 2007, "Engine Fault Diagnosis Based on Multisensory Information Fusion Using Dempster–Shafer Evidence Theory," Inf. Fusion, 8(4), pp. 379–386.
14. Kim, J., Huang, Q., Shi, J., and Chang, T.-S., 2006, "Online Multichannel Forging Tonnage Monitoring and Fault Pattern Discrimination Using Principal Curve," ASME J. Manuf. Sci. Eng., 128(4), pp. 944–950.
15. Amiri, A., Zou, C., and Doroudyan, M. H., 2013, "Monitoring Correlated Profile and Multivariate Quality Characteristics," Qual. Reliab. Eng. Int., 30(1), pp. 133–142.
16. Chou, S. H., Chang, S. I., and Tsai, T. R., 2014, "On Monitoring of Multiple Non-Linear Profiles," Int. J. Prod. Res., 52(11), pp. 3209–3224.
17. Paynabar, K., Jin, J., and Pacella, M., 2013, "Monitoring and Diagnosis of Multichannel Nonlinear Profile Variations Using Uncorrelated Multilinear Principal Component Analysis," IIE Trans., 45(11), pp. 1235–1247.
18. Lu, H., Plataniotis, K. N., and Venetsanopoulos, A. N., 2009, "Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning," IEEE Trans. Neural Netw., 20(11), pp. 1820–1836.
19. Grasso, M., Colosimo, B. M., and Pacella, M., 2014, "Profile Monitoring Via Sensor Fusion: The Use of PCA Methods for Multi-Channel Data," Int. J. Prod. Res., 52(20), pp. 6110–6135.
20. Lu, H., Plataniotis, K. N., and Venetsanopoulos, A. N., 2008, "MPCA: Multilinear Principal Component Analysis of Tensor Objects," IEEE Trans. Neural Netw., 19(1), pp. 18–39.
21. Lu, H., Plataniotis, K. N., and Venetsanopoulos, A. N., 2009, "Uncorrelated Multilinear Discriminant Analysis With Regularization and Aggregation for Tensor Object Recognition," IEEE Trans. Neural Netw., 20(1), pp. 103–123.
22. De Lathauwer, L., De Moor, B., and Vandewalle, J., 2000, "A Multilinear Singular Value Decomposition," SIAM J. Matrix Anal. Appl., 21(4), pp. 1253–1278.
23. Kolda, T. G., and Bader, B. W., 2009, "Tensor Decompositions and Applications," SIAM Rev., 51(3), pp. 455–500.
24. Acar, E., and Yener, B., 2009, "Unsupervised Multiway Data Analysis: A Literature Survey," IEEE Trans. Knowl. Data Eng., 21(1), pp. 6–20.
25. Kiers, H. A. L., 2000, "Towards a Standardized Notation and Terminology in Multiway Analysis," J. Chemom., 14(3), pp. 105–122.
26. Jin, Z., Yang, J. Y., Hu, Z. S., and Lou, Z., 2001, "Face Recognition Based on the Uncorrelated Discriminant Transformation," Pattern Recognit., 34(7), pp. 1405–1416.
27. Duda, R. O., Hart, P. E., and Stork, D. G., 2012, Pattern Classification, John Wiley & Sons, New York.
28. Donoho, D. L., and Johnstone, I. M., 1994, "Ideal Spatial Adaptation by Wavelet Shrinkage," Biometrika, 81(3), pp. 425–455.
29. Ye, J., 2005, "Characterization of a Family of Algorithms for Generalized Discriminant Analysis on Undersampled Problems," J. Mach. Learn. Res., 6, pp. 483–502.
30. Ye, J., Xiong, T., Li, Q., Janardan, R., Bi, J., Cherkassky, V., and Kambhamettu, C., 2006, "Efficient Model Selection for Regularized Linear Discriminant Analysis," Proceedings of the 15th ACM International Conference on Information and Knowledge Management, Arlington, VA, Nov. 6–11, pp. 532–539.
31. Ho, T. K., 1998, "The Random Subspace Method for Constructing Decision Forests," IEEE Trans. Pattern Anal. Mach. Intell., 20(8), pp. 832–844.
32. Hastie, T., Tibshirani, R., and Friedman, J., 2008, The Elements of Statistical Learning, 2nd ed., Springer, New York.
33. BRANSON Ultrasonics Corporation, 2007, "BRANSON Ultraweld® L20," BRANSON Ultrasonics Corporation.
34. Hu, S. J., 2011, Technical Report: On-Line Quality Monitoring System for Ultrasonic Battery Tab Welding, General Motors Collaborative Research Lab at the University of Michigan, Ann Arbor, MI.