arXiv:1606.02274v1 [stat.ME] 7 Jun 2016
The spatial sign covariance matrix and its
application for robust correlation estimation
A. Dürre¹, R. Fried
Fakultät Statistik, Technische Universität Dortmund
44221 Dortmund, Germany
D. Vogel
Institute for Complex Systems and Mathematical Biology, University of Aberdeen
Aberdeen AB24 3UE, United Kingdom
Abstract
We summarize properties of the spatial sign covariance matrix and especially
look at the relationship between its eigenvalues and those of the shape matrix
of an elliptical distribution. The explicit relationship known in the bivariate
case was used to construct the spatial sign correlation coefficient, which is a
non-parametric and robust estimator for the correlation coefficient within the
elliptical model. We consider a multivariate generalization, which we call the
multivariate spatial sign correlation matrix.
1 Introduction
Let X_1, …, X_n denote a sample of independent p-dimensional random variables from a distribution F, and let s: R^p → R^p with s(x) = x/|x| for x ≠ 0 and s(0) = 0 denote the spatial sign. Then
\[
S_n(t_n; X_1, \dots, X_n) = \frac{1}{n} \sum_{i=1}^{n} s(X_i - t_n)\, s(X_i - t_n)^T
\]
denotes the empirical spatial sign covariance matrix (SSCM) with location t_n. The canonical choice for the location estimator t_n is the spatial median
\[
\mu_n = \operatorname*{argmin}_{\mu \in \mathbb{R}^p} \sum_{i=1}^{n} \| X_i - \mu \|.
\]
Beside its nice robustness properties, like an asymptotic breakdown point of 1/2, it has (under regularity conditions, see [12]) the advantageous feature that it centres the spatial signs, i.e.,
\[
\frac{1}{n} \sum_{i=1}^{n} s(X_i - \mu_n) = 0,
\]
so that S_n(\mu_n; X_1, \dots, X_n) is indeed the empirical covariance matrix of the spatial signs of the data. If t_n is (strongly) consistent for a location t ∈ R^p, it was shown
¹Corresponding author, e-mail: alexander.duerre@udo.edu
in [5] that under mild conditions on F the empirical SSCM is a (strongly) consistent estimator for its population counterpart
\[
S(X) = \mathrm{E}\left( s(X - t)\, s(X - t)^T \right).
\]
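For concreteness, the empirical SSCM together with the spatial median can be computed along the following lines. The paper's own implementation is the R package sscor [9]; the Python sketch below, using a simple Weiszfeld-type iteration for the spatial median, is only an illustration of the definitions above, not the authors' code.

```python
import numpy as np

def spatial_median(X, n_iter=200, tol=1e-10):
    """Weiszfeld-type iteration for the spatial (L1) median."""
    mu = X.mean(axis=0)
    for _ in range(n_iter):
        d = np.linalg.norm(X - mu, axis=1)
        d = np.where(d < tol, tol, d)            # guard against division by zero
        w = 1.0 / d
        mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

def sscm(X, t=None):
    """Empirical spatial sign covariance matrix with location t."""
    if t is None:
        t = spatial_median(X)
    Z = X - t
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    signs = Z / np.where(norms == 0, 1.0, norms)  # spatial signs, s(0) = 0
    return signs.T @ signs / X.shape[0]

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))
S = sscm(X)  # symmetric, positive semi-definite, trace 1 by construction
```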
There are some nice results if F is within the class of continuous elliptical distributions, which means that F possesses a density of the form
\[
f(x) = \det(V)^{-1/2}\, g\!\left( (x - \mu)^T V^{-1} (x - \mu) \right)
\]
for a location \mu \in \mathbb{R}^p, a symmetric and positive definite shape matrix V \in \mathbb{R}^{p \times p} and a function g: \mathbb{R} \to \mathbb{R}, which is often called the elliptical generator. Prominent members of the elliptical family are the multivariate normal distribution and elliptical t-distributions (e.g. [2], p. 208). If second moments exist, then \mu is the expectation of X ∼ F, and V a multiple of the covariance matrix. The shape matrix V is unique only up to a multiplicative constant. In the following, we consider the trace-normalized shape matrix V_0 = V / \operatorname{tr}(V), which is convenient since S(X) also has trace 1. If F is elliptical, then S(X) and V share the same eigenvectors, and the respective eigenvalues have the same ordering. For this reason, the SSCM has been proposed for robust principal component analysis (e.g. [13, 15]). In the present article, we study the eigenvalues of the SSCM.
2 Eigenvalues of the SSCM
Let \lambda_1 \geq \dots \geq \lambda_p \geq 0 denote the eigenvalues of V_0 and \delta_1 \geq \dots \geq \delta_p \geq 0 those of S(X). Explicit formulae that relate the \delta_i to the \lambda_i are only known for p = 2 (see [19, 3]), namely
\[
\delta_i = \frac{\sqrt{\lambda_i}}{\sqrt{\lambda_1} + \sqrt{\lambda_2}}, \qquad i = 1, 2. \tag{1}
\]
Assuming \lambda_2 > 0, we have \delta_1/\delta_2 = \sqrt{\lambda_1/\lambda_2} \leq \lambda_1/\lambda_2, thus the eigenvalues of the SSCM are closer together than those of the corresponding shape matrix. It is shown in [8] that this holds true for arbitrary p > 2, so
\[
\frac{\lambda_i}{\lambda_j} \geq \frac{\delta_i}{\delta_j} \qquad \text{for } 1 \leq i < j \leq p \tag{2}
\]
as long as \lambda_j > 0. There is no explicit map between the eigenvalues known for p > 2.
Dürre et al. [8] give a representation of \delta_i as a one-dimensional integral, which permits fast and accurate numerical evaluation for arbitrary p,
\[
\delta_i = \frac{\lambda_i}{2} \int_0^{\infty} \frac{1}{(1 + \lambda_i x) \prod_{j=1}^{p} (1 + \lambda_j x)^{1/2}}\, dx, \qquad i = 1, \dots, p. \tag{3}
\]
We use this formula (implemented in R [17] in the package sscor [9]) to get an impression of how the eigenvalues of S(X) compare to those of V_0. We first look at equidistantly spaced eigenvalues
\[
\lambda_i = \frac{2i}{p(p+1)}, \qquad i = 1, \dots, p,
\]
Figure 1: Eigenvalues of the SSCM plotted against the corresponding eigenvalues of the shape matrix in the equidistant setting for p = 3 (left), p = 11 (centre) and p = 101 (right).
Figure 2: Eigenvalues of the SSCM plotted against the corresponding eigenvalues of the shape matrix in the setting of one large eigenvalue for p = 3 (left), p = 11 (centre) and p = 101 (right).
for different p = 3, 11, 101. The magnitude of the eigenvalues necessarily decreases as p increases, since \sum_{i=1}^{p} \lambda_i = \sum_{i=1}^{p} \delta_i = 1 by definition of V_0 and S(X). As one can see in Figure 1, the eigenvalues of S(X) and V_0 approach each other for increasing p. In fact, the maximal absolute difference for p = 101 is roughly 2 \cdot 10^{-4}. In the second scenario, we take p − 1 equidistantly spaced eigenvalues and one eigenvalue 5 times larger than the largest of the rest, i.e.,
\[
\lambda_i = \frac{i}{(p-1)(p/2+5)}, \quad i = 1, \dots, p-1, \qquad \lambda_p = \frac{5(p-1)}{(p-1)(p/2+5)},
\]
This models the case where the dependence is mainly driven by one principal component. As one can see in Figure 2, the distance between the two largest eigenvalues is smaller for S(X) than for V_0. This is not surprising in light of (2). Thus, in general, the eigenvalues of the SSCM are less separated than those of V_0, which is one reason why the use of the SSCM for robust principal component analysis has been questioned (e.g. [1, 14]). However, the differences appear to be generally small in higher dimensions.
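The mapping from shape eigenvalues to SSCM eigenvalues in formula (3) is easy to evaluate numerically. The paper uses the R package sscor for this; the following Python sketch with scipy.integrate.quad is merely an assumed stand-in for that implementation:

```python
import numpy as np
from scipy.integrate import quad

def delta_from_lambda(lam):
    """Map shape-matrix eigenvalues lambda_i to SSCM eigenvalues delta_i
    via the one-dimensional integral representation (3)."""
    lam = np.asarray(lam, dtype=float)
    def integrand(x, i):
        return 1.0 / ((1.0 + lam[i] * x) * np.prod(np.sqrt(1.0 + lam * x)))
    return np.array([0.5 * lam[i] * quad(integrand, 0.0, np.inf, args=(i,))[0]
                     for i in range(lam.size)])

# equidistantly spaced eigenvalues lambda_i = 2i / (p(p+1)), here for p = 3
p = 3
lam = 2.0 * np.arange(1, p + 1) / (p * (p + 1))
delta = delta_from_lambda(lam)
# the delta_i again sum to one and are less spread out than the lambda_i
```

For p = 2 the result can be checked against the closed form (1).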
3 Estimation of the correlation matrix
Equation (1) can be used to derive an estimator for the correlation coefficient based on the empirical SSCM: the spatial sign correlation coefficient \rho_n ([6]). Under mild regularity assumptions this estimator is consistent under elliptical distributions and asymptotically normal with variance
\[
\mathrm{ASV}(\rho_n) = (1 - \rho^2)^2 + \frac{1}{2}\left(a + a^{-1}\right)(1 - \rho^2)^{3/2}, \tag{4}
\]
where a = \sqrt{v_{11}/v_{22}} is the ratio of the marginal scales and \rho = v_{12}/\sqrt{v_{11} v_{22}} is the generalized correlation coefficient, which coincides with the usual moment correlation coefficient if second moments exist. Equation (4) indicates that the variance of \rho_n is minimal for a = 1, but can get arbitrarily large as a tends to infinity or 0.
Therefore a two-step procedure has been proposed, the two-stage spatial sign correlation \rho_{\sigma,n}, which first normalizes the data by a robust scale estimator, e.g., the median absolute deviation (MAD), and then computes the spatial sign correlation of the transformed data. Under mild conditions (see [7]), this two-step procedure yields an asymptotic variance of
\[
\mathrm{ASV}(\rho_{\sigma,n}) = (1 - \rho^2)^2 + (1 - \rho^2)^{3/2}, \tag{5}
\]
which equals that of \rho_n in the favourable case a = 1. Since (5) only depends on the parameter \rho, the two-stage spatial sign correlation coefficient is very suitable for constructing robust and non-parametric confidence intervals for the correlation coefficient under ellipticity. It turns out that these intervals are quite accurate even for rather small sample sizes of n = 10, and in fact more accurate than those based on the sample moment correlation coefficient [7].
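Since (5) depends only on \rho, it translates directly into an asymptotic confidence interval. The Python sketch below is not the sscor implementation, only an assumed outline; the estimate 0.5 fed in at the end is just a placeholder value.

```python
import math
from scipy.stats import norm

def asv_two_stage(rho):
    """Asymptotic variance (5) of the two-stage spatial sign correlation."""
    return (1.0 - rho**2)**2 + (1.0 - rho**2)**1.5

def confidence_interval(rho_hat, n, level=0.95):
    """Asymptotic interval rho_hat +/- z * sqrt(ASV / n),
    clipped to the admissible range [-1, 1]."""
    z = norm.ppf(0.5 + level / 2.0)
    half = z * math.sqrt(asv_two_stage(rho_hat) / n)
    return max(rho_hat - half, -1.0), min(rho_hat + half, 1.0)

lo, hi = confidence_interval(0.5, n=100)
```

At \rho = 0 the asymptotic variance (5) equals 2, matching the simulation results reported below.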
One can construct an estimator of the correlation matrix R by filling the off-diagonal positions of the matrix estimate with the bivariate spatial sign correlation coefficients of all pairs of variables. This was proposed in [6]. Equation (3) allows an alternative approach: First standardize the data by a robust scale estimator and compute the SSCM of the transformed data. Then apply a singular value decomposition
\[
S_n(t_n; X_1, \dots, X_n) = \hat{U} \hat{\Delta} \hat{U}^T,
\]
where \hat{\Delta} contains the ordered eigenvalues \hat{\delta}_1 \geq \dots \geq \hat{\delta}_p. One obtains estimates \hat{\lambda}_1, \dots, \hat{\lambda}_p by inverting (3). Although theoretical results are yet to be established,
we found in our simulations that the following fixed-point algorithm
\[
\hat{\lambda}_i^{(0)} = \hat{\delta}_i, \qquad i = 1, \dots, p,
\]
\[
\tilde{\lambda}_i^{(k+1)} = 2 \hat{\delta}_i \left( \int_0^{\infty} \frac{1}{(1 + \hat{\lambda}_i^{(k)} x) \prod_{j=1}^{p} (1 + \hat{\lambda}_j^{(k)} x)^{1/2}}\, dx \right)^{-1}, \qquad i = 1, \dots, p, \; k = 0, 1, 2, \dots,
\]
\[
\hat{\lambda}_i^{(k+1)} = \tilde{\lambda}_i^{(k+1)} \left( \sum_{j=1}^{p} \tilde{\lambda}_j^{(k+1)} \right)^{-1}, \qquad i = 1, \dots, p, \; k = 0, 1, 2, \dots,
\]
works reliably and converges fast. Let \hat{\Lambda} denote the diagonal matrix containing \hat{\lambda}_1, \dots, \hat{\lambda}_p; then \hat{V} = \hat{U} \hat{\Lambda} \hat{U}^T is a suitable estimator for the shape of the standardized data, and \hat{R} with \hat{r}_{ij} = \hat{v}_{ij} / \sqrt{\hat{v}_{ii} \hat{v}_{jj}} an estimator for the correlation matrix, which we call the multivariate spatial sign correlation matrix. Contrary to the pairwise approach, the multivariate spatial sign correlation matrix is positive semi-definite by construction.
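The fixed-point iteration above can be sketched as follows. This Python transcription (using scipy.integrate.quad for the integral in (3)) is a plausible reading of the algorithm, not the authors' R code; the round-trip check at the end simply inverts eigenvalues that were themselves generated from formula (3).

```python
import numpy as np
from scipy.integrate import quad

def sscm_integral(lam, i):
    """int_0^inf dx / ((1 + lam_i x) prod_j (1 + lam_j x)^(1/2)),
    so that delta_i = lam_i / 2 times this integral, cf. (3)."""
    f = lambda x: 1.0 / ((1.0 + lam[i] * x) * np.prod(np.sqrt(1.0 + lam * x)))
    return quad(f, 0.0, np.inf)[0]

def invert_eigenvalues(delta, n_iter=100, tol=1e-10):
    """Fixed-point iteration recovering the shape eigenvalues lambda_i
    from the SSCM eigenvalues delta_i, renormalized to trace 1."""
    delta = np.asarray(delta, dtype=float)
    lam = delta.copy()                               # lambda^(0) = delta
    for _ in range(n_iter):
        tilde = np.array([2.0 * delta[i] / sscm_integral(lam, i)
                          for i in range(delta.size)])
        lam_new = tilde / tilde.sum()                # trace normalization
        if np.max(np.abs(lam_new - lam)) < tol:
            return lam_new
        lam = lam_new
    return lam

# round trip: lambda -> delta via (3), then back via the fixed point
lam_true = np.array([0.5, 0.3, 0.2])
delta = np.array([0.5 * lam_true[i] * sscm_integral(lam_true, i)
                  for i in range(3)])
lam_hat = invert_eigenvalues(delta)
```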
Theoretical properties of the new estimator are not straightforward to establish. By a small simulation study we want to get an impression of its efficiency. We compare the variances of the moment correlation and of the pairwise as well as the multivariate spatial sign correlation under several elliptical distributions: normal, Laplace and t distributions with 5 and 10 degrees of freedom. The latter three generate heavier tails than the normal distribution. The Laplace distribution is obtained by the elliptical generator g(x) = c_p \exp(-\sqrt{|x|/2}), where c_p is the appropriate integration constant depending on p (e.g. [2], p. 209).
We take the identity matrix as shape matrix and compare the variances of an off-diagonal element of the matrix estimates for different dimensions p = 2, 3, 5, 10, 50 and sample sizes n = 100, 1000. We use the R packages mvtnorm [10] and MNM [16] for the data generation. The results based on 10000 runs are summarized in Table 1. Except for the moment correlation at the t_5 distribution, the results for n = 100 and n = 1000 are very similar. Note that the variance of the moment correlation decreases at the Laplace distribution as the dimension p increases, but not so for the other distributions considered. The lower-dimensional marginals of the Laplace distribution are, contrary to the normal and the t-distributions, not Laplace distributed (see [11]), and the kurtosis of the one-dimensional marginals of the Laplace distribution in fact decreases as p increases.
Equation (5) yields an asymptotic variance of 2 for the pairwise spatial sign correlation matrix elements regardless of the specific elliptical generator, which can also be observed in the simulation results. The moment correlation is twice as efficient under normality, but has a higher variance at heavy-tailed distributions. For uncorrelated t_5 distributed random variables, the spatial sign correlation outperforms the moment correlation. Looking at the multivariate spatial sign correlation, we see a strong increase of efficiency for larger p. For p = 50 the variance is comparable to that of the moment correlation. Since the asymptotic variance of the SSCM does not depend on the elliptical generator, this is expected to also hold for the multivariate spatial sign correlation, and we find this confirmed by the simulations. The multivariate spatial sign correlation is more efficient than the moment correlation even under slightly heavier tails for moderately large p.
                              n = 100                         n = 1000
                       p = 2    3    5   10   50       2    3    5   10   50
N    cor                 1.0  1.0  1.0  1.0  1.0     1.0  1.0  1.0  1.0  1.0
     sscor pairwise      1.9  1.9  1.9  1.9  1.9     2.0  2.0  2.0  2.0  2.0
     sscor multivariate  1.9  1.6  1.4  1.2  1.0     2.0  1.7  1.4  1.2  1.0
t10  cor                 1.3  1.3  1.3  1.3  1.3     1.3  1.3  1.3  1.4  1.3
     sscor pairwise      2.0  1.9  1.9  2.0  1.9     2.0  2.0  2.0  2.0  2.0
     sscor multivariate  2.0  1.7  1.3  1.2  1.0     2.0  1.7  1.4  1.2  1.0
t5   cor                 2.0  2.1  2.1  2.1  2.1     2.6  2.6  2.6  2.6  2.6
     sscor pairwise      2.0  2.0  1.9  2.0  1.9     2.1  2.0  2.0  2.0  2.0
     sscor multivariate  2.0  1.7  1.4  1.2  1.1     2.1  1.7  1.4  1.2  1.0
L    cor                 1.6  1.5  1.3  1.2  1.1     1.6  1.5  1.3  1.2  1.1
     sscor pairwise      1.9  1.9  1.9  2.0  2.0     2.0  2.0  2.0  2.0  2.0
     sscor multivariate  1.9  1.6  1.4  1.2  1.1     2.0  1.7  1.4  1.2  1.1

Table 1: Simulated variances (multiplied by n) of one off-diagonal element of the correlation matrix estimate based on the moment correlation (cor), the pairwise spatial sign correlation (sscor pairwise) and the multivariate spatial sign correlation matrix (sscor multivariate) for the spherical normal (N), t_5, t_10 and Laplace (L) distributions, several dimensions p and sample sizes n = 100, 1000.
An increase of efficiency for larger p is not uncommon for robust scatter estimators. It can be observed, amongst others, for M-estimators, the Tyler shape matrix, the MCD, and S-estimators (e.g. [4, 18]). All of these are affine equivariant estimators, requiring n > p. This is not necessary for the spatial sign correlation matrix. One may expect that the efficiency gain for large p comes at the expense of robustness, in particular a larger maximum bias curve. Further research will be necessary to thoroughly explore the robustness properties and efficiency of the multivariate spatial sign correlation estimator.
References
[1] Bali J.L., Boente G., Tyler D.E., Wang J.L. (2011). Robust functional principal
components: A projection-pursuit approach. The Annals of Statistics. Vol. 39,
pp. 2852-2882.
[2] Bilodeau M., Brenner D. (1999). Theory of Multivariate Statistics. Springer, New
York.
[3] Croux C., Dehon C., Yadine A. (2010). The k-step spatial sign covariance matrix. Advances in Data Analysis and Classification. Vol. 4, pp. 137-150.
[4] Croux C., Haesbroeck G. (1999). Influence function and efficiency of the minimum
covariance determinant scatter matrix estimator. Journal of Multivariate Analysis.
Vol. 71, pp. 161-190.
[5] Dürre A., Vogel D., Tyler D.E. (2014). The spatial sign covariance matrix with unknown location. Journal of Multivariate Analysis. Vol. 130, pp. 107-117.
[6] Dürre A., Vogel D., Fried R. (2015). Spatial sign correlation. Journal of Multivariate Analysis. Vol. 135, pp. 89-105.
[7] Dürre A., Vogel D. (2016). Asymptotics of the two-stage spatial sign correlation. Journal of Multivariate Analysis. Vol. 144, pp. 54-67.
[8] Dürre A., Tyler D.E., Vogel D. (2016). On the eigenvalues of the spatial sign covariance matrix in more than two dimensions. Statistics & Probability Letters. Vol. 111, pp. 80-85.
[9] Dürre A., Vogel D. (2016). sscor: Robust Correlation Estimation and Testing Based on Spatial Signs. R package version 0.2.
[10] Genz A., Bretz F., Miwa T., Mi X., Leisch F., Scheipl F., Bornkamp B., Maechler M., Hothorn T. (2016). mvtnorm: Multivariate Normal and t Distributions. R package version 1.0.5.
[11] Kano Y. (1994). Consistency property of elliptic probability density functions. Journal of Multivariate Analysis. Vol. 51, pp. 139-147.
[12] Kemperman J.H.B. (1987). The median of a finite measure on a Banach space. Statistical Data Analysis Based on the L1-Norm and Related Methods. pp. 217-230.
[13] Locantore N., Marron J.S., Simpson D.G., Tripoli N., Zhang J.T., Cohen K.L. (1999). Robust principal component analysis for functional data. Test. Vol. 8, pp. 1-73.
[14] Magyar A.F., Tyler D.E. (2014). The asymptotic inadmissibility of the spatial sign
covariance matrix for elliptically symmetric distributions. Biometrika. Vol. 101,
pp. 673-688.
[15] Marden J.I. (1999). Some robust estimates of principal components. Statistics & Probability Letters. Vol. 43, pp. 349-359.
[16] Nordhausen K., Oja H. (2011). Multivariate L1 methods: the package MNM. Journal of Statistical Software. Vol. 43, pp. 1-28.
[17] R Development Core Team (2016). R: A Language and Environment for Statistical Computing.
[18] Taskinen S., Croux C., Kankainen A., Ollila E., Oja H. (2006). Influence functions
and efficiencies of the canonical correlation and vector estimates based on scatter
and shape matrices. Journal of Multivariate Analysis. Vol. 97, pp. 359-384.
[19] Vogel D., Möllmann C., Fried R. (2008). Partial correlation estimates based on signs. Proceedings of the 1st Workshop on Information Theoretic Methods in Science and Engineering. Vol. 43, pp. 1-6.
... For real-valued ES distributions, in [20], a heuristic FP algorithm involving numerical integration was proposed for estimating the eigenvalues of the shape matrix using the eigenvalues of the SSCM. We propose an alternative approach, where we approximate the integral (7) and then invert it. ...
... Next, we compare BASICS, RSSCM, RFP (RSSCM with bias correction using the FP algorithm [20]) and the regularized Tyler's M-estimator (CWH) [21] (implemented as in [22]) in the estimation of the shape matrix. Fig. 3 displays the NMSE, Ave Λ − Λ 2 F / Λ 2 F , as a function of n averaged over 2000 independent MC trials for each n, when sampling from the multivariate t-distribution Ct ν,p (0, Σ) with ν = 2 degrees of freedom. ...
... The misclassification rate was computed using the uncontaminated remaining samples (test set). Fig. 4 depicts the misclassification rates and median computation times for BASICS, RSSCM, RFP of [20], CWH of [21] and (regularized SCM based non-robust) LW of [28] averaged over 10000 independent MC trials. 3 The reported computation times are relative to the median computation time of the RSSCM and exclude the computation time for centering the data. ...
Article
Full-text available
The spatial sign covariance matrix (SSCM), also known as the normalized sample covariance matrix (NSCM), has been widely used in signal processing as a robust alternative to the sample covariance matrix (SCM). It is well-known that the SSCM does not provide consistent estimates of the eigenvalues of the shape matrix (normalized scatter matrix). To alleviate this problem, we propose BASIC (Bias Adjusted SIgn Covariance), which performs an approximate bias correction to the eigenvalues of the SSCM under the assumption that the samples are generated from zero mean unspecified complex elliptically symmetric distributions (the real-valued case is also addressed). We then use the bias correction in order to develop a robust regularized SSCM based estimator, BASIC Shrinkage estimator (BASICS), which is suitable for high dimensional problems, where the dimension can be larger than the sample size. We assess the proposed estimator with several numerical examples as well as in a linear discriminant analysis (LDA) classification problem with real data sets. The simulations show that the proposed estimator compares well to competing robust covariance matrix estimators but has the advantage of being significantly faster to compute.
... The larger the value on the diagonal line of the matrix, the variables are more important. Conversely, the smaller the value, the smaller the corresponding variable is the secondary variable of the noise signal [15][16][17]. ...
... Then, the principal component value (or variance ratio) of single eigenvalue is calculated by λ i /∑ m i=1 λ i . The cumulative contribution rate η of eigenvalues is calculated by the following equation (16). The constant ε = 0:85 is introduced as the screening condition in this paper, and the principal components are retained only when η ≥ ε is satisfied, which is used for subsequent weighted fusion calculation of eigenvectors [4]. ...
Article
Full-text available
Performance feature extraction is the primary problem in equipment performance degradation assessment. To handle the problem of high-dimensional performance characterization and complexity of calculating the performance indicators in flexible material roll-to-roll processing, this paper proposes a PCA method for extracting the degradation characteristic of roll shaft. Based on the analysis of the performance influencing factors of flexible material roll-to-roll processing roller, a principal component analysis extraction model was constructed. The original feature parameter matrix composed of 10-dimensional feature parameters such as time domain, frequency domain, and time-frequency domain vibration signal of the roll shaft was established; then, we obtained a new feature parameter matrix by normalizing the original feature parameter matrix. The correlation measure between every two parameters in the matrix was used as the eigenvalue to establish the covariance matrix of the performance degradation feature parameters. The Jacobi iteration method was introduced to derive the algorithm for solving eigenvalue and eigenvector of the covariance matrix. Finally, using the eigenvalue cumulative contribution rate as the screening rule, we linearly weighted and fused the eigenvectors and derived the feature principal component matrix of the processing roller vibration signal. Experiments showed that the initially obtained, 10-dimensional features of the processing rollers’ vibration signals, such as average, root mean square, kurtosis index, centroid frequency, root mean square of frequency, standard deviation of frequency, and energy of the intrinsic mode function component, can be expressed by 3-dimensional principal components , and . 
The vibration signal features reduction dimension was realized, and , and contain 98.9% of the original vibration signal data, further illustrating that the method has high precision in feature parameters’ extraction and the advantage of eliminating the correlation between feature parameters and reducing the workload selecting feature parameters. 1. Introduction The object of flexible material roll-to-roll processing is flexible electronic film material; once the core components of processing equipment begin to degrade, flexible electronic film products will be deformed to varying degrees, resulting in processing quality problems [1, 2]. Therefore, it is necessary to predict the performance and health status of the core components of flexible material R2R processing equipment. However, the number of rollers of the flexible material R2R processing equipment is large, and there are correlations between the movement state variables of the collected rollers, which lead to complications in the process of extracting performance degradation feature. If the performance degradation feature parameters of each roller are analyzed separately, the results are often isolated; if the parameters are blindly reduced, it may lead to incorrect performance state prediction conclusions due to the loss of too much raw information. Thus, it is extremely necessary to fuse and reduce the dimensions of the feature data for roll shaft performance degradation prediction and extract representative feature [3]. Previous studies have shown that principal component analysis (PCA), based on the idea of spatial transformation, achieves the purpose of optimal variance without reducing the information content contained in the original data and describes the high-dimensional data information with less principal component information, which has incomparable advantages over other algorithms [4–6]. 
In reference [4] (2018), spearman grade correlation coefficient and PCA were used for feature fusion to obtain the health index representing the declining state of rolling bearing performance. The analysis results of an example show that the proposed method can accurately identify the declining state of the rolling bearing performance. In reference [5] (2018), a frequency analysis method based on the residuals generated by SOM-PCA algorithm is proposed. This method is effective for fault detection and diagnosis of most of the unsupervised classification experimental data sets. In reference [6] (2018) proposed a sensor fault detection method of water chiller sensor based on empirical mode decomposition (EMD) threshold denoising and principal component analysis (EMD-TD-PCA). PCA model is established by EMD threshold denoising data. statistic is used to detect sensor fault. Compared with the traditional PCA method, EMD-TD-PCA method can effectively improve the efficiency of fault detection. Therefore, on the basis of previous research, this paper proposes a PCA method for extracting the degradation feature of R2R roll shaft of flexible material, constructs the principal component analysis extraction model of the degradation feature of processing roll shaft, establishes the feature matrix of the original vibration signal of roll shaft and the covariance matrix of the degradation feature, and introduces Jacobi method. The eigenvalues and eigenvectors of the covariance matrix, as well as the principal component matrix algorithm of the vibration signal of the processing roll, are studied. Finally, the validity and effectiveness of the proposed method for extracting the performance degradation eigenvalues are verified by R2R processing experiments. 2. Flexible Material R2R Processing Roller Performance Degradation Feature Extraction Based on PCA The principle framework of flexible material R2R performance degradation feature extraction is shown in Figure 1. 
As shown in Figure 1, the vibration acceleration data of roller axle during R2R processing were collected by sensors. After filtering and denoising the vibration acceleration signal data, the eigenvalues of the vibration acceleration data in the time domain, frequency domain, and time-frequency domain were calculated by integrating mathematical statistics, power spectrum, and empirical mode decomposition (EMD) [7–9]. On this basis, the eigenvalue normalized feature matrix and the covariance matrix of eigenvalue were established. Finally, using the eigenvalue cumulative contribution rate as the screening rule, we linearly weighted and fused the eigenvectors, then obtained the degradation feature parameters principal component of processing roll shaft performance, i.e., the R2R processing roller shaft performance degradation indicator characterizing amount.
... A robust estimator could be achieved in terms of the FAR(1) model by performing a robust estimation of the eigenfunctions, then using the PCA approach. Examples of robust estimators for eigenfunctions can be found in Locantore et al. (1999), Gervini (2008), Bali et al. (2011), Lee et al. (2013), Boente and Salibian-Barrera (2015) and Dürre et al. (2016). However, a robust estimation of the eigenfunctions changes the order of the eigenvalues, which would change the approximation of the covariance structure, due to the truncation procedure (Dürre et al., 2016). ...
... Examples of robust estimators for eigenfunctions can be found in Locantore et al. (1999), Gervini (2008), Bali et al. (2011), Lee et al. (2013), Boente and Salibian-Barrera (2015) and Dürre et al. (2016). However, a robust estimation of the eigenfunctions changes the order of the eigenvalues, which would change the approximation of the covariance structure, due to the truncation procedure (Dürre et al., 2016). Another option is the minimization approach, where we weight the function to be minimized in L 1 -norm or L 2 -norm. ...
Article
A robust estimator for functional autoregressive models is proposed, the Depth-based Least Squares (DLS) estimator. The DLS estimator down-weights the influence of outliers by using the functional directional outlyingness as a centrality measure. It consists of two steps: identifying the outliers with a two-stage functional boxplot, then down-weighting the outliers using the functional directional outlyingness. Theoretical properties of the DLS estimator are investigated such as consistency and boundedness of its influence function. Through a Monte Carlo study, it is shown that the DLS estimator performs better than estimators based on Principal Component Analysis (PCA) and robust PCA, which are the most commonly used. To illustrate a practical application, the DLS estimator is used to analyze a dataset of ambient CO2 concentrations in California.
... A number of papers have followed since then, especially within the groups around H. Oja and D.E. Tyler, respectively, see Gervini (2008), Sirkia et al. (2009), Taskinen et al. (2010Taskinen et al. ( , 2012, Dürre et al. (2014Dürre et al. ( , 2015Dürre et al. ( , 2017 and Dürre and Vogel (2016). ...
Preprint
Full-text available
Sample spatial-sign covariance matrix is a much-valued alternative to sample covariance matrix in robust statistics to mitigate influence of outliers. Although this matrix is widely studied in the literature, almost nothing is known on its properties when the number of variables becomes large as compared to the sample size. This paper for the first time investigates the large-dimensional limits of the eigenvalues of a sample spatial sign matrix when both the dimension and the sample size tend to infinity. A first result of the paper establishes that the distribution of the eigenvalues converges to a deterministic limit that belongs to the family of celebrated generalized Mar\v{c}enko-Pastur distributions. Using tools from random matrix theory, we further establish a new central limit theorem for a general class of linear statistics of these sample eigenvalues. In particular, asymptotic normality is established for sample spectral moments under mild conditions. This theory is established when the population is elliptically distributed. As applications, we first develop two new tools for estimating the population eigenvalue distribution of a large spatial sign covariance matrix, and then for testing the order of such population eigenvalue distribution when these distributions are finite mixtures. Using these inference tools and considering the problem of blind source separation, we are able to show by simulation experiments that in high-dimensional situations, the sample spatial-sign covariance matrix is still a valid and much better alternative to sample covariance matrix when samples contain outliers.
... In particular, multivariate sign methods, that is, methods that use the observations only through their direction from the center of the distribution, have become increasingly popular. For location problems, multivariate sign tests were considered in Randles (1989), Möttönen and Oja (1995), Hallin and Paindaveine (2002) and Paindaveine and Verdebout (2016), whereas sign procedures for scatter or shape matrices were considered in Tyler (1987a), Dümbgen (1998), Hallin and Paindaveine (2006a), Dürre, Vogel and Fried (2015) and Dürre, Fried and Vogel (2017), to cite only a few. PCA techniques based on multivariate signs (and on the companion concept of ranks) were studied in Hallin, Paindaveine and Verdebout (2010), Taskinen, Koch and Oja (2012), Hallin et al. (2013) and Dürre, Tyler and Vogel (2016). ...
Preprint
Full-text available
We consider inference on the first principal direction of a $p$-variate elliptical distribution. We do so in challenging double asymptotic scenarios for which this direction eventually fails to be identifiable. In order to achieve robustness not only with respect to such weak identifiability but also with respect to heavy tails, we focus on sign-based statistical procedures, that is, on procedures that involve the observations only through their direction from the center of the distribution. We actually consider the generic problem of testing the null hypothesis that the first principal direction coincides with a given direction of $\mathbb{R}^p$. We first focus on weak identifiability setups involving single spikes (that is, involving spectra for which the smallest eigenvalue has multiplicity $p-1$). We show that, irrespective of the degree of weak identifiability, such setups offer local alternatives for which the corresponding sequence of statistical experiments converges in the Le Cam sense. Interestingly, the limiting experiments depend on the degree of weak identifiability. We exploit this convergence result to build optimal sign tests for the problem considered. In classical asymptotic scenarios where the spectrum is fixed, these tests are shown to be asymptotically equivalent to the sign-based likelihood ratio tests available in the literature. Unlike the latter, however, the proposed sign tests are robust to arbitrarily weak identifiability. We show that our tests meet the asymptotic level constraint irrespective of the structure of the spectrum, hence also in possibly multi-spike setups. Finally, we fully characterize the non-null asymptotic distributions of the corresponding test statistics under weak identifiability, which allows us to quantify the corresponding local asymptotic powers. Monte Carlo exercises confirm our asymptotic results.
... Over the last decades, many robust estimators for the covariance matrix have been developed. Many of them possess the attractive property of affine equivariance, meaning that when the data are subjected to an affine transformation the estimator transforms accordingly. In addition to spherical PCA, there has also been a lot of recent research on the use of the SSCM for constructing robust correlation estimators (Dürre et al., 2015; Dürre et al., 2017). The main focus of this work is on results including asymptotic properties, the eigenvalues, and the influence function, which measures robustness. ...
Article
The well-known spatial sign covariance matrix (SSCM) carries out a radial transform which moves all data points to a sphere, followed by computing the classical covariance matrix of the transformed data. Its popularity stems from its robustness to outliers, fast computation, and applications to correlation and principal component analysis. In this paper we study more general radial functions. It is shown that the eigenvectors of the generalized SSCM are still consistent and the ranks of the eigenvalues are preserved. The influence function of the resulting scatter matrix is derived, and it is shown that its asymptotic breakdown value is as high as that of the original SSCM. A simulation study indicates that the best results are obtained when the inner half of the data points are not transformed and points lying far away are moved to the center.
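The plain SSCM described here (identity radial function, all points moved to the unit sphere) takes only a few lines of code. A minimal sketch, assuming the data are given as lists of coordinates and the center is known; `spatial_sign` and `sscm` are our own helper names:

```python
import math

def spatial_sign(x, eps=1e-12):
    """Radial transform s(x) = x / ||x||; the zero vector is kept at zero."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm < eps:
        return [0.0] * len(x)
    return [v / norm for v in x]

def sscm(data, center):
    """Empirical spatial sign covariance matrix:
    the average of s(x_i - t) s(x_i - t)^T over the sample."""
    p = len(center)
    n = len(data)
    S = [[0.0] * p for _ in range(p)]
    for x in data:
        s = spatial_sign([xi - ci for xi, ci in zip(x, center)])
        for j in range(p):
            for k in range(p):
                S[j][k] += s[j] * s[k] / n
    return S
```

Because every spatial sign has unit length, the resulting matrix has trace one (provided no observation coincides with the center), which is why the SSCM carries information about the shape, but not the scale, of the scatter.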
Chapter
The need to test (or estimate) sphericity arises in various applications in statistics, and thus the problem has been investigated in numerous papers. Recently, estimates of a sphericity measure are needed in high-dimensional shrinkage covariance matrix estimation problems, wherein the (oracle) shrinkage parameter minimizing the mean squared error (MSE) depends on the unknown sphericity parameter. The purpose of this chapter is to investigate the performance of robust sphericity measure estimators recently proposed within the framework of elliptically symmetric distributions when the data dimensionality, p, is of similar magnitude to the sample size, n. The population measure of sphericity that we consider here is defined as the ratio of the mean of the squared eigenvalues of the scatter matrix parameter relative to the mean of its eigenvalues squared. We illustrate that robust sphericity estimators based on the spatial sign covariance matrix (SSCM) or M-estimators of scatter matrix provide superior performance for diverse covariance matrix models compared to sphericity estimators based on the sample covariance matrix (SCM) when distributions are heavy-tailed and n = O(p). At the same time, they provide equivalent performance when the data are Gaussian. Our examples also illustrate the important role that the sphericity plays in determining the attainable accuracy of the SCM. Keywords: Elliptical distributions; High-dimensional statistics; M-estimators of scatter matrix; Robust statistics; Sign covariance matrix; Sphericity parameter
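The sphericity measure described in this chapter, the mean of the squared eigenvalues of the scatter matrix divided by its squared mean eigenvalue, needs no eigendecomposition: the two ingredients are tr(V^2)/p and tr(V)/p. A sketch based on that observation (`sphericity` is our own name):

```python
def sphericity(V):
    """Sphericity of a symmetric scatter matrix V (list of rows):
    (mean of squared eigenvalues) / (mean eigenvalue)^2
    = p * tr(V V) / tr(V)^2.  Equals 1 iff V is proportional to the identity."""
    p = len(V)
    tr = sum(V[i][i] for i in range(p))
    tr_sq = sum(V[i][j] * V[j][i] for i in range(p) for j in range(p))
    return p * tr_sq / tr ** 2
```

Plugging in the SSCM or an M-estimate of scatter instead of the sample covariance matrix yields robust sphericity estimators of the kind compared in the chapter.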
Article
Functional principal component analysis (FPCA) has been widely used to capture major modes of variation and reduce dimensions in functional data analysis. However, standard FPCA based on the sample covariance estimator does not work well if the data exhibits heavy-tailedness or outliers. To address this challenge, a new robust functional principal component analysis approach based on a functional pairwise spatial sign (PASS) operator, termed PASS FPCA, is introduced. We propose robust estimation procedures for eigenfunctions and eigenvalues. Theoretical properties of the PASS operator are established, showing that it adopts the same eigenfunctions as the standard covariance operator and also allows recovering ratios between eigenvalues. We also extend the proposed procedure to handle functional data measured with noise. Compared to existing robust FPCA approaches, the proposed PASS FPCA requires weaker distributional assumptions to conserve the eigenspace of the covariance function. Specifically, existing work is often built upon a class of functional elliptical distributions, which inherently requires symmetry. In contrast, we introduce a class of distributions called weakly functional coordinate symmetric (weakly FCS), which allows for severe asymmetry and is much more flexible than the functional elliptical distribution family. The robustness of the PASS FPCA is demonstrated via extensive simulation studies, especially its advantages in scenarios with non-elliptical distributions. The proposed method was motivated by and applied to analysis of accelerometry data from the Objective Physical Activity and Cardiovascular Health Study, a large-scale epidemiological study to investigate the relationship between objectively measured physical activity and cardiovascular health among older women.
Preprint
Functional principal component analysis (FPCA) has been widely used to capture major modes of variation and reduce dimensions in functional data analysis. However, standard FPCA based on the sample covariance estimator does not work well in the presence of outliers. To address this challenge, a new robust functional principal component analysis approach based on the functional pairwise spatial sign (PASS) operator, termed PASS FPCA, is introduced, where we propose estimation procedures for both eigenfunctions and eigenvalues with and without measurement error. Compared to existing robust FPCA methods, the proposed one requires weaker distributional assumptions to conserve the eigenspace of the covariance function. In particular, a class of distributions called weakly functional coordinate symmetric (weakly FCS) is introduced that allows for severe asymmetry and is strictly larger than the functional elliptical distribution class, the latter of which has been well used in the robust statistics literature. The robustness of the PASS FPCA is demonstrated via simulation studies and analyses of accelerometry data from a large-scale epidemiological study of physical activity in older women that partly motivated this work.
Article
In the paper we present an R package MNM dedicated to multivariate data analysis based on the L1 norm. The analysis proceeds very much as does a traditional multivariate analysis. The regular L2 norm is just replaced by different L1 norms, observation vectors are replaced by their (standardized and centered) spatial signs, spatial ranks, and spatial signed-ranks, and so on. The procedures are fairly efficient and robust, and no moment assumptions are needed for asymptotic approximations. The background theory is briefly explained in the multivariate linear regression model case, and the use of the package is illustrated with several examples using the R package MNM.
Article
The asymptotic efficiency of the spatial sign covariance matrix relative to affine equivariant estimators of scatter is studied. In particular, the spatial sign covariance matrix is shown to be asymptotically inadmissible, i.e., the asymptotic covariance matrix of the consistency-corrected spatial sign covariance matrix is uniformly larger than that of its affine equivariant counterpart, namely Tyler’s scatter matrix. Although the spatial sign covariance matrix has often been recommended when one is interested in principal components analysis, its inefficiency is shown to be most severe in situations where principal components are of greatest interest. Simulation shows that the inefficiency of the spatial sign covariance matrix also holds for small sample sizes, and that the asymptotic relative efficiency is a good approximation to the finite-sample efficiency for relatively modest sample sizes.
Article
Our object in writing this book is to present the main results of the modern theory of multivariate statistics to an audience of advanced students who would appreciate a concise and mathematically rigorous treatment of that material. It is intended for use as a textbook by students taking a first graduate course in the subject, as well as for the general reference of interested research workers who will find, in a readable form, developments from recently published work on certain broad topics not otherwise easily accessible, as, for instance, robust inference (using adjusted likelihood ratio tests) and the use of the bootstrap in a multivariate setting. The references contain over 150 entries post-1982. The main development of the text is supplemented by over 135 problems, most of which are original with the authors. A minimum background expected of the reader would include at least two courses in mathematical statistics, and certainly some exposure to the calculus of several variables together with the descriptive geometry of linear algebra.
Article
We gather several results on the eigenvalues of the spatial sign covariance matrix of an elliptical distribution. It is shown that the eigenvalues are a one-to-one function of the eigenvalues of the shape matrix and that they are closer together than the latter. We further provide a one-dimensional integral representation of the eigenvalues, which facilitates their numerical computation.
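For p = 2 this one-to-one function is explicit: the SSCM eigenvalues of an elliptical distribution with shape eigenvalues lambda_1, lambda_2 are delta_i = sqrt(lambda_i) / (sqrt(lambda_1) + sqrt(lambda_2)). A Monte Carlo sketch checking this numerically, assuming a centred, axis-aligned bivariate normal sample (all helper names are ours):

```python
import math
import random

def sscm_eigenvalues_2d(data):
    """Eigenvalues of the empirical 2x2 SSCM, taking the origin as the center."""
    a = b = c = 0.0
    n = len(data)
    for x, y in data:
        r = math.hypot(x, y)
        if r == 0.0:
            continue
        u, v = x / r, y / r
        a += u * u / n
        b += v * v / n
        c += u * v / n
    # closed-form eigenvalues of the symmetric 2x2 matrix [[a, c], [c, b]]
    mean = (a + b) / 2
    disc = math.sqrt(((a - b) / 2) ** 2 + c ** 2)
    return mean + disc, mean - disc

random.seed(1)
lam1, lam2 = 4.0, 1.0  # shape-matrix eigenvalues (axis-aligned ellipse)
sample = [(math.sqrt(lam1) * random.gauss(0, 1),
           math.sqrt(lam2) * random.gauss(0, 1)) for _ in range(100000)]
d1, d2 = sscm_eigenvalues_2d(sample)
# bivariate theory: delta_i = sqrt(lam_i) / (sqrt(lam1) + sqrt(lam2)),
# here 2/3 and 1/3; d1 and d2 should be close to these values
```

Note that d1 and d2 (about 2/3 and 1/3) are closer together than the shape eigenvalue shares 4/5 and 1/5, illustrating the shrinking of the eigenvalue spread mentioned in the abstract.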
Article
The spatial sign correlation (Dürre, Vogel and Fried, 2015) is a highly robust and easy-to-compute bivariate correlation estimator based on the spatial sign covariance matrix. Since the estimator is inefficient when the marginal scales strongly differ, a two-stage version was proposed. In the first step, the observations are marginally standardized by means of a robust scale estimator, and in the second step, the spatial sign correlation of the thus transformed data set is computed. Dürre et al. (2015) give some evidence that the asymptotic distribution of the two-stage estimator equals that of the spatial sign correlation at equal marginal scales by comparing their influence functions and presenting simulation results, but give no formal proof. In the present paper, we close this gap and establish the asymptotic normality of the two-stage spatial sign correlation and compute its asymptotic variance for elliptical population distributions. We further derive a variance-stabilizing transformation, similar to Fisher's z-transform, and numerically compare the small-sample coverage probabilities of several confidence intervals.
Article
A new robust correlation estimator based on the spatial sign covariance matrix (SSCM) is proposed. We derive its asymptotic distribution and influence function at elliptical distributions. Finite sample and robustness properties are studied and compared to other robust correlation estimators by means of numerical simulations.
Article
The consistency and asymptotic normality of the spatial sign covariance matrix with unknown location are shown. Simulations illustrate the different asymptotic behavior when using the mean and the spatial median as location estimator.
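The spatial median used as the location estimator throughout can be computed with the classical Weiszfeld iteration. A minimal sketch (the simple variant, which assumes no iterate coincides exactly with a data point; `spatial_median` is our own name, and this is not the estimation procedure of any particular cited paper):

```python
import math

def spatial_median(data, n_iter=200, eps=1e-12):
    """Weiszfeld iteration for the spatial median
    argmin_mu sum_i ||x_i - mu||, started at the coordinate-wise mean."""
    p = len(data[0])
    n = len(data)
    mu = [sum(x[j] for x in data) / n for j in range(p)]
    for _ in range(n_iter):
        num = [0.0] * p
        den = 0.0
        for x in data:
            d = math.sqrt(sum((x[j] - mu[j]) ** 2 for j in range(p)))
            if d < eps:
                continue  # skip points coinciding with the current iterate
            w = 1.0 / d
            den += w
            for j in range(p):
                num[j] += w * x[j]
        mu = [v / den for v in num]
    return mu
```

Each step is a weighted mean with weights 1/||x_i - mu||, so distant outliers receive small weight; this is one way to see the high breakdown point of the resulting location estimate.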