IEEE XXX, VOL. XX, NO. XX, MONTH YEAR 1
Inter- and Intra-Subject Transfer Reduces
Calibration Effort for High-Speed SSVEP-based
BCIs
Chi Man Wong, Ze Wang, Boyu Wang, Ka Fai Lao, Agostinho Rosa, Peng Xu, Tzyy-Ping Jung,
C. L. Philip Chen, and Feng Wan*
Abstract—Objective: Steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) that deliver a high information transfer rate (ITR) usually require a subject's calibration data to learn the class- and subject-specific model parameters (e.g., the spatial filters and SSVEP templates). Normally, the amount of calibration data required for learning is proportional to the number of classes (or visual stimuli), which can be large and consequently lead to a time-consuming calibration. This study presents a transfer learning scheme that substantially reduces the calibration effort. Methods: Inspired by parameter-based and instance-based transfer learning techniques, we propose a subject transfer based canonical correlation analysis (stCCA) method that utilizes knowledge both within a subject and between subjects, thus requiring only a few calibration trials from a new subject. Results: An evaluation on two SSVEP datasets (from Tsinghua and UCSD) shows that the stCCA method performs well with only a small amount of calibration data, providing an ITR of 198.18±59.12 bits/min with 9 calibration trials on the Tsinghua dataset and 111.04±57.24 bits/min with 3 trials on the UCSD dataset. These performances are comparable to those of the multi-stimulus CCA (msCCA) and ensemble task-related component analysis (eTRCA) methods with their minimally required calibration data (at least 40 trials on the Tsinghua dataset and at least 12 trials on the UCSD dataset, respectively). Conclusion: Inter- and intra-subject transfer helps the recognition method achieve a high ITR with extremely little calibration effort. Significance: The proposed approach saves
much calibration effort without sacrificing the ITR, which would be significant for practical SSVEP-based BCIs.

This work is supported in part by the Science and Technology Development Fund, Macau SAR (File no. 055/2015/A2 and 0045/2019/AFJ), the University of Macau Research Committee (MYRG projects 2016-00240-FST, 2017-00207-FST and 2020-00161-FST), the National Natural Science Foundation of China (Grant No. 61961160705), the LARSyS - FCT Plurianual funding 2020-2023, and the Natural Sciences and Engineering Research Council of Canada (NSERC), Discovery Grants Program. Asterisk indicates corresponding author.
C. M. Wong, Z. Wang, K. F. Lao and F. Wan* are with the Department of Electrical and Computer Engineering, Faculty of Science and Engineering, University of Macau, Macau, and also with the Centre for Cognitive and Brain Sciences, and the Centre for Artificial Intelligence and Robotics, Institute of Collaborative Innovation, University of Macau, Macau (e-mail: fwan@um.edu.mo).
B. Wang is with the Department of Computer Science and the Brain and Mind Institute, University of Western Ontario, London, ON N6A 5B7, Canada.
A. Rosa is with ISR and DBE-IST, Universidade de Lisboa, Lisbon, Portugal.
P. Xu is with the Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, People's Republic of China.
T.-P. Jung is with the Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California San Diego, La Jolla, CA, USA.
C. L. P. Chen is with the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau (e-mail: philip.chen@ieee.org).
Index Terms—Brain-computer interface, steady-state visual
evoked potential, inter-subject, intra-subject, transfer learning
I. INTRODUCTION
ELECTROENCEPHALOGRAPHY (EEG)-based brain-computer interfaces (BCIs) can provide physically disabled people with an additional communication channel by means of their brain activities [1]. For this purpose, researchers have put great effort into developing usable and reliable BCIs, with significant progress in recent years, especially for steady-state visual evoked potential (SSVEP)-based BCIs [2]–[7]. SSVEP
is the user’s brain response to the external visual stimulus
flashing at a specified frequency. According to the SSVEP’s
time-/phase-locking property, an SSVEP-based BCI presents
a virtual keyboard via frequency-modulated visual stimuli for
different control commands, and it can identify which visual
stimulus the user is gazing at, by detecting the specified
frequency components from the measured EEG signals [3],
[4].
In the past decades, the performance of SSVEP-based BCIs has been drastically enhanced from around 20-60 bits/min [2], [4], [8], [9] to more than 200 bits/min [6], [7]. A major idea behind this success is to utilize the subject's calibration data for SSVEP recognition, as in the cutting-edge extended canonical correlation analysis (eCCA) [8], [10] and ensemble task-related component analysis (eTRCA) [7] methods. However, a big compromise is
that every subject has to undergo a burdensome calibration
process in which the subject needs to gaze at every visual
stimulus, each for many trials, to generate the subject-specific
calibration data to learn the stimulus-specific (or class-specific)
model parameters: the spatial filter and SSVEP template (more
details will be given in Section II), in the SSVEP recognition
algorithms. Theoretically, the calibration time (or the number of calibration trials) increases with the number of visual stimuli, and the number of calibration trials should not be smaller than the number of visual stimuli. As a consequence, such a calibration is usually time-consuming (especially in a high-speed SSVEP-based BCI with a large number of visual stimuli, e.g., 40 in [6], [7]), which can significantly induce subject fatigue [11]. Furthermore, when the calibration trials (or data) are reduced, the corresponding recognition performance deteriorates severely, e.g., as shown in [7], [12], [13].
To overcome the above difficulties, several approaches using transfer learning have been proposed to either i) transfer knowledge across stimulus frequencies (but within a subject) [12], [14] or ii) transfer knowledge across subjects (but within a stimulus frequency) [15], [16]. Specifically, Suefusa et al. propose to transfer a subject's SSVEP template from one stimulus frequency to the other stimulus frequencies, but this gives only a limited performance improvement in [14]. In [12], a learning-across-stimuli scheme is proposed to help the SSVEP recognition algorithms cope with the small-calibration-data problem; the key idea is to develop a spatial filter that fits different stimulus frequencies for transfer, so that the learning process remains relatively reliable even when the calibration data are scarce. An alternative scheme is to transfer the SSVEP templates from the existing subjects to a new subject [15], [16]. In [15], the grand average of all existing subjects' templates (deemed a transferred SSVEP template) is taken as the new subject's SSVEP template, so the SSVEP-based BCI does not require a large amount of calibration data from the new subject. Nevertheless, such a transferred SSVEP template, designed without any knowledge of the new subject, may not fit all new subjects due to large individual differences. In [16], to address this individual difference issue, the new subject's calibration data are required to design the transferred SSVEP templates. Still, the number of required calibration trials cannot be fewer than the number of visual stimuli, so the reduction of the calibration time or effort is rather limited.
To the best of our knowledge, no study simultaneously transfers knowledge within a subject and between subjects. This study introduces a transfer learning scheme that utilizes both a transferred spatial filter and a transferred SSVEP template, i.e., the intra-subject spatial filter and the inter-subject SSVEP template, in a CCA-based method (termed the subject transfer CCA (stCCA)) to achieve high performance even when the subject's calibration trials are fewer than the number of visual stimuli; the number of calibration trials is thus no longer constrained by the number of visual stimuli. The proposed stCCA is based on two assumptions: i) a subject's spatial filters across different stimulus frequencies share a common spatial pattern, and ii) a subject's spatially filtered SSVEP templates can be approximated by a weighted summation of the other subjects' spatially filtered SSVEP templates, in which the weight vector is common across different stimulus frequencies. Experimental results on two public SSVEP datasets collected from a total of 45 subjects [17], [18] demonstrate that the proposed stCCA with insufficient calibration trials (e.g., only several trials) provides competitive performance compared with the existing state-of-the-art methods, even when the latter use more calibration trials.
II. PRELIMINARIES
A. Notations
For a clear introduction of the SSVEP recognition algorithms, Table I summarizes the notations used here. In general, variables with a right subscript $k$ correspond to the $k$-th stimulus frequency (e.g., $f_k$); for example, $\mathbf{u}_k$ and $\mathbf{X}_k$ are the spatial filter and the SSVEP template corresponding to $f_k$. Variables with a left subscript $n$ correspond to the $n$-th source subject $S_n$; for example, ${}_n\mathbf{X}_k$ is the $n$-th source subject's SSVEP template corresponding to $f_k$. Variables from the target subject carry no left subscript.
B. Parameters in the SSVEP Recognition Algorithms
To recognize the frequency and phase features of the SSVEPs, most existing SSVEP recognition algorithms, such as the standard CCA (sCCA), the extended CCA (eCCA) [6], the transfer template CCA (ttCCA) [15], the multi-stimulus CCA (msCCA) [12], and the ensemble TRCA (eTRCA) [7], include the following steps: i) Spatial filtering: spatially filter the subject's single-trial multi-channel EEG signal without label information ($\mathbf{X}$), where the spatial filter (e.g., $\mathbf{u}_k$, $\mathbf{v}_k$) is learned from the subject's labeled data, e.g., $\mathbf{X}_k^{(j)}$ or $\mathbf{X}_k$. ii) Feature extraction: measure the similarities between the filtered EEG signals and the different SSVEP templates ($\mathbf{X}_k$) or SSVEP reference signals ($\mathbf{Y}_k$). iii) Feature classification: identify the visual stimulus that the subject intends to select by finding the maximal similarity.
TABLE I
TABLE OF NOTATIONS

$f_k$, $\phi_k$ : Frequency and phase of the $k$-th visual stimulus ($k \in \mathcal{N}_f$).
$N_f$, $N_{train}$, $N_{trial}$, $N_p$, $N_{ch}$, $N_h$, $N_{sub}$, $N_{block}$ : Number of visual stimuli, number of training trials for each stimulus, number of training trials for all stimuli, number of sampling points, number of channels, number of harmonics, number of source subjects, and number of blocks in an SSVEP-based BCI experiment.
$S_n$ : Source subject's code ($n \in \mathcal{N}_s$).
$\mathcal{N}_f$, $\mathcal{N}'_f$ : $\mathcal{N}'_f \subseteq \mathcal{N}_f = \{1, 2, \cdots, N_f\}$.
$\mathcal{N}_t$, $\mathcal{N}'_t$ : $\mathcal{N}'_t \subseteq \mathcal{N}_t = \{1, 2, \cdots, N_{train}\}$.
$\mathcal{N}_s$ : $\mathcal{N}_s = \{1, 2, \cdots, N_{sub}\}$.
$\mathbf{X} \in \mathbb{R}^{N_p \times N_{ch}}$ : Single-trial multi-channel EEG signal (unlabeled data).
$\mathbf{X}_k^{(j)} \in \mathbb{R}^{N_p \times N_{ch}}$ : Multi-channel calibration data for the $k$-th stimulus and the $j$-th calibration trial (labeled data), where $k \in \mathcal{N}_f$ and $j \in \mathcal{N}_t$.
$\mathbf{X}_k \in \mathbb{R}^{N_p \times N_{ch}}$ : SSVEP template corresponding to the $k$-th stimulus, see (2).
$\tilde{\mathbf{x}}_k \in \mathbb{R}^{N_p \times 1}$ : Weighted summation of spatially filtered SSVEP templates for the $k$-th stimulus, or inter-subject SSVEP template, see (6).
$\mathbf{Y}_k \in \mathbb{R}^{N_p \times 2N_h}$ : SSVEP reference signal for the $k$-th stimulus, see (1).
$\mathbf{u}_k \in \mathbb{R}^{N_{ch} \times 1}$ : Spatial filter for subject's data.
$\mathbf{v}_k \in \mathbb{R}^{2N_h \times 1}$ : Spatial filter for SSVEP reference signal $\mathbf{Y}_k$.
$\ddot{\mathbf{u}} \in \mathbb{R}^{N_{ch} \times 1}$ : Intra-subject spatial filter for subject's EEG data.
$\ddot{\mathbf{v}} \in \mathbb{R}^{2N_h \times 1}$ : Intra-subject spatial filter for $\mathbf{Y}_k$.
$\mathbf{w} \in \mathbb{R}^{N_{sub} \times 1}$ : Weight vector for subjects, $\mathbf{w} = [w_1, \cdots, w_{N_{sub}}]^\top$.
${}_n\mathbf{X}_k^{(j)}$, ${}_n\mathbf{X}_k$, ${}_n\ddot{\mathbf{u}}$ : Calibration data, SSVEP template, and intra-subject spatial filter of the source subject $S_n$ (the subscript $k$ indicates correspondence to the $k$-th stimulus).
$\overline{\mathbf{X}}_k$ : Average of all source subjects' SSVEP templates ${}_n\mathbf{X}_k$, i.e., $\overline{\mathbf{X}}_k = \frac{1}{N_{sub}}\sum_{n=1}^{N_{sub}} {}_n\mathbf{X}_k$.
$\mathbf{Y}_k$ includes a series of artificial sine-cosine signals corresponding to the frequencies $f_k, 2f_k, \cdots, N_h f_k$, which is used to model the SSVEP data evoked by the stimulus frequency $f_k$ [10], [12]:

$$\mathbf{Y}_k = \begin{bmatrix} \sin(2\pi f_k \mathbf{t} + \phi_k) \\ \cos(2\pi f_k \mathbf{t} + \phi_k) \\ \vdots \\ \sin(2\pi N_h f_k \mathbf{t} + N_h\phi_k) \\ \cos(2\pi N_h f_k \mathbf{t} + N_h\phi_k) \end{bmatrix}^{\top}, \qquad (1)$$

where $\mathbf{t} = [1/F_s, 2/F_s, \cdots, N_p/F_s]$, $N_h$ is the number of harmonics, $\phi_k$ is the stimulus phase, and $F_s$ is the sampling rate. Note that $\phi_k = 0$ in [10]. The spatial filter and SSVEP template are learned from the real SSVEP data, such as the subject's single-trial multi-channel EEG data ($\mathbf{X}$), the subject's calibration data ($\mathbf{X}_k^{(j)}$: the $j$-th calibration trial's SSVEP data corresponding to $f_k$ in the calibration stage), and the subject's SSVEP template corresponding to $f_k$ ($\mathbf{X}_k$: the average of $\mathbf{X}_k^{(j)}$ across trials, see (2)):

$$\mathbf{X}_k = \frac{1}{N_{train}} \sum_{j=1}^{N_{train}} \mathbf{X}_k^{(j)}. \qquad (2)$$
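As a minimal NumPy sketch (not the authors' code; function names are illustrative), the reference signal of (1) and the template average of (2) can be built as follows:

```python
import numpy as np

def reference_signal(f_k, phi_k, n_harmonics, n_points, fs):
    """Sine-cosine reference Y_k of Eq. (1); returns shape (n_points, 2*n_harmonics)."""
    t = np.arange(1, n_points + 1) / fs  # t = [1/Fs, 2/Fs, ..., Np/Fs]
    cols = []
    for h in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * h * f_k * t + h * phi_k))
        cols.append(np.cos(2 * np.pi * h * f_k * t + h * phi_k))
    return np.stack(cols, axis=1)

def ssvep_template(trials):
    """Eq. (2): average the calibration trials X_k^(j) (stacked along axis 0)
    into the SSVEP template X_k."""
    return np.mean(trials, axis=0)
```

For a 250 Hz sampling rate, `reference_signal(10.0, 0.0, 5, 250, 250)` yields a 1 s reference with 5 harmonics, matching the $N_p \times 2N_h$ shape of $\mathbf{Y}_k$.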
Clearly, the spatial filter and the SSVEP template are two essential elements in SSVEP recognition. In most cases, the spatial filter is computed using either CCA, as in [6], [15], or TRCA, as in [7]. The SSVEP template is estimated using either the subject-specific calibration data, as in [6], [7], [12], or multiple subjects' calibration data, as in [15], [16]. Although the eCCA [6] and eTRCA [7] methods have demonstrated excellent accuracy, they require a costly calibration effort. Recently, Wong et al. proposed a learning-across-multiple-stimuli scheme to alleviate this calibration problem while maintaining high performance [12]. However, the minimal number of calibration trials is still restricted by the number of classes (or visual stimuli), because the model parameters (i.e., the spatial filter and SSVEP template) in these methods are class-specific and subject-specific, as seen in Fig. 1(a). This poses a significant challenge for substantially reducing the calibration effort in high-speed SSVEP-based BCIs with many visual stimuli, e.g., the 40-target speller in [6], [7].
III. THE PROPOSED METHOD

Fig. 1(a) illustrates that the existing algorithms usually require massive calibration data from a new subject in order to learn the subject-specific and class-specific parameters (i.e., the spatial filter and SSVEP template). In this study, we propose to learn the transferred model parameters (i.e., the transferred spatial filter and the transferred SSVEP template) from a small amount of the new subject's calibration data together with other subjects' calibration data, as illustrated in Fig. 1(b). Consequently, the new subject's calibration data can be greatly reduced. To compute the transferred spatial filter and transferred SSVEP template, we make two assumptions: i) the spatial filters within a subject are common across different visual stimuli, and ii) the SSVEP templates across subjects share common knowledge in a low-dimensional subspace. With these transferred model parameters, we develop a subject transfer based CCA (stCCA) method.

[Fig. 1 about here. (a) Traditional learning requires the subject's $N_f$-class data $\mathbf{X}_k^{(j)}$ ($j \in \mathcal{N}_t$, $k \in \mathcal{N}_f$) to learn class-specific, subject-specific spatial filters $\mathbf{u}_1, \cdots, \mathbf{u}_{N_f}$ (or $\mathbf{v}_1, \cdots, \mathbf{v}_{N_f}$) and SSVEP templates $\mathbf{X}_1, \cdots, \mathbf{X}_{N_f}$. (b) Subject transfer learning requires $K$-class data from the target subject ($K \le N_f$) and $N_f$-class data from other subjects to learn the transferred spatial filter $\ddot{\mathbf{u}}$ (or $\ddot{\mathbf{v}}$) and the weight vector $w_1, \cdots, w_{N_{sub}}$ (class-nonspecific, subject-specific), which combine the class-specific, subject-nonspecific spatially filtered templates ${}_n\mathbf{X}_k \cdot {}_n\ddot{\mathbf{u}}$ into the transferred SSVEP templates $\sum_{n=1}^{N_{sub}} w_n \cdot {}_n\mathbf{X}_k \cdot {}_n\ddot{\mathbf{u}}$.]

Fig. 1. Model parameter learning in traditional supervised learning (a) and in the proposed subject transfer learning (b). The notations can be found in Table I.
A. Transferred Spatial Filter and Transferred SSVEP Template
1) Intra-Subject Spatial Filter: The first assumption is that each subject's SSVEP data corresponding to different stimulus frequencies can be assigned the same spatial filter, which is consistent with the findings in [12], [19]. This means that the spatial filter is transferable within a subject and across different frequencies; consequently, the spatial filter can be class-nonspecific. According to the learning-across-multiple-stimuli scheme introduced in [12], the class-nonspecific spatial filter can be computed using the subject's calibration data corresponding to $K$ different stimulus frequencies ($K \le N_f$), which may be considered a form of parameter-based transfer learning [20], [21]:
$$\{\ddot{\mathbf{u}}, \ddot{\mathbf{v}}\} = \arg\max_{\mathbf{u},\mathbf{v}} \frac{\mathbf{u}^\top \mathbf{X}^\top \mathbf{Y} \mathbf{v}}{\sqrt{\mathbf{u}^\top \mathbf{X}^\top \mathbf{X} \mathbf{u} \cdot \mathbf{v}^\top \mathbf{Y}^\top \mathbf{Y} \mathbf{v}}}, \qquad (3)$$

where $\mathbf{X}$ (or $\mathbf{Y}$) comprises the $K$ given SSVEP templates (or reference signals) corresponding to the $K$ stimulus frequencies,

$$\mathbf{X} = [\mathbf{X}_{a_1}^\top, \cdots, \mathbf{X}_{a_K}^\top]^\top, \quad \mathbf{Y} = [\mathbf{Y}_{a_1}^\top, \cdots, \mathbf{Y}_{a_K}^\top]^\top, \qquad (4)$$

where $a_1, a_2, \cdots, a_K$ denote the indices of the $K$ stimuli, $\{a_1, a_2, \cdots, a_K\} = \mathbf{a} \subset \mathcal{N}_f$, $K \le N_f$, $a_1 < a_2 < \cdots < a_K$, and $f_{a_1} < f_{a_2} < \cdots < f_{a_K}$. As there is no prior knowledge for a new subject, a general selection strategy is to select $K$ frequencies that are uniformly distributed within the $N_f$ frequencies. Here we propose three selection strategies:

$$\mathrm{A1}: a_i = \left\lfloor 1 + (N_f - 1)\frac{i-1}{K-1} \right\rfloor, \quad \mathrm{A2}: a_i = \left\lfloor N_f \frac{2i-1}{2K} \right\rfloor, \quad \mathrm{A3}: a_i = \left\lfloor N_f \frac{2i}{2K} \right\rfloor, \qquad (5)$$

where $\lfloor a \rfloor$ denotes the floor function, mapping the positive real number $a$ to the greatest integer less than or equal to $a$.
The strategy A1 selects the smallest frequency $f_1$, the largest frequency $f_{N_f}$, and $K-2$ other frequencies between $f_1$ and $f_{N_f}$. In the other two strategies (A2 and A3), the $N_f$ frequency indices are evenly divided into $2K$ indices $\lfloor j \times N_f/(2K) \rfloor$, where $j = 1, 2, \cdots, 2K$. Strategy A2 picks the $K$ indices with $j = 1, 3, \cdots, 2K-1$ as the frequency indices, and A3 picks the $K$ indices with $j = 2, 4, \cdots, 2K$, respectively. Note that the frequency indices $\mathbf{a}$ are independent of $k$; consequently, the stCCA spatial filter is common across visual stimuli and invariant with $k$. We call it the intra-subject spatial filter and denote it as $\ddot{\mathbf{u}}$ (or $\ddot{\mathbf{v}}$).

As (3) comes from the objective function of the msCCA [12], we can also use it to compute the msCCA spatial filter by defining the index set $\mathbf{a}$ as $K$ sequential numbers. In the msCCA, $\mathbf{a}$ varies with $k$, e.g., $\mathbf{a} = \mathbf{a}_k = \{k - k_1, \cdots, k, \cdots, k + k_2\}$ with $k_1 + k_2 + 1 = K$ ($k_1, k_2 \ge 0$). In particular, when $K = N_f$, the stCCA spatial filter and the msCCA spatial filter are identical.
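The three index-selection strategies of (5) can be sketched directly (an illustrative helper, not from the authors' code; indices are 1-based as in the paper):

```python
import math

def select_indices(n_f, k, strategy="A1"):
    """Frequency-index selection strategies A1-A3 of Eq. (5).

    n_f: total number of visual stimuli, k: number of selected stimuli.
    Returns a list of k 1-based frequency indices.
    """
    if strategy == "A1":
        # includes the smallest (1) and largest (n_f) indices
        return [math.floor(1 + (n_f - 1) * (i - 1) / (k - 1)) for i in range(1, k + 1)]
    if strategy == "A2":
        # odd positions j = 1, 3, ..., 2k-1 of the 2k-way division
        return [math.floor(n_f * (2 * i - 1) / (2 * k)) for i in range(1, k + 1)]
    if strategy == "A3":
        # even positions j = 2, 4, ..., 2k of the 2k-way division
        return [math.floor(n_f * (2 * i) / (2 * k)) for i in range(1, k + 1)]
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, with $N_f = 40$ and $K = 9$ (the 9-trial setting evaluated on Dataset I), A1 yields the evenly spread indices 1, 5, 10, 15, 20, 25, 30, 35, 40.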
2) Inter-Subject SSVEP Template: The second assumption is that some knowledge is shared among different subjects' spatially filtered SSVEP templates. According to instance-based transfer learning [20], [22], a weighted summation of the known (source) subjects' spatially filtered SSVEP templates may approximate a new (target) subject's spatially filtered SSVEP template, where the weight vector is transferable within a subject and across classes (see Fig. 2):

$$\tilde{\mathbf{x}}_k = \frac{1}{N_{sub}} \sum_{n=1}^{N_{sub}} w_n \cdot {}_n\mathbf{X}_k \cdot {}_n\ddot{\mathbf{u}}, \qquad (6)$$

where $w_n$ is the weight for the source subject $S_n$, and ${}_n\mathbf{X}_k$ and ${}_n\ddot{\mathbf{u}}$ are the SSVEP template and the spatial filter of the source subject $S_n$. Note that $w_n$ is independent of $k$ (i.e., of the stimulus frequency) and that $\tilde{\mathbf{x}}_k$ is a column vector, termed the inter-subject SSVEP template.
To determine the weights $w_n$, we minimize the sum of squared errors between the target subject's $K$ spatially filtered SSVEP templates (i.e., $\mathbf{X}_{a_1}\ddot{\mathbf{u}}, \cdots, \mathbf{X}_{a_K}\ddot{\mathbf{u}}$, $K \le N_f$) and the weighted summation of the source subjects' $K$ spatially filtered SSVEP templates (i.e., ${}_n\mathbf{X}_{a_1} \cdot {}_n\ddot{\mathbf{u}}, \cdots, {}_n\mathbf{X}_{a_K} \cdot {}_n\ddot{\mathbf{u}}$, $n = 1, 2, \cdots, N_{sub}$). This can be formulated as the following multivariate linear regression (MLR) problem:

$$\mathbf{w} = \arg\min_{\mathbf{w}} \frac{1}{2}\|\mathbf{b} - \mathbf{A}\mathbf{w}\|^2, \qquad (7)$$

where the intra-subject spatial filters are computed using (3), $\mathbf{w} = [w_1, w_2, \cdots, w_{N_{sub}}]^\top \in \mathbb{R}^{N_{sub} \times 1}$, $\|\cdot\|$ denotes the Frobenius norm, $\mathbf{b}$ is the concatenation of the target subject's $K$ spatially filtered SSVEP templates, and $\mathbf{A}$ consists of the $N_{sub}$ source subjects' $K$ spatially filtered SSVEP templates, i.e.,

$$\mathbf{b} = \left[\ddot{\mathbf{u}}^\top\mathbf{X}_{a_1}^\top, \ddot{\mathbf{u}}^\top\mathbf{X}_{a_2}^\top, \cdots, \ddot{\mathbf{u}}^\top\mathbf{X}_{a_K}^\top\right]^\top,$$

$$\mathbf{A} = \begin{bmatrix} {}_1\mathbf{X}_{a_1}\cdot{}_1\ddot{\mathbf{u}} & {}_2\mathbf{X}_{a_1}\cdot{}_2\ddot{\mathbf{u}} & \cdots & {}_{N_{sub}}\mathbf{X}_{a_1}\cdot{}_{N_{sub}}\ddot{\mathbf{u}} \\ {}_1\mathbf{X}_{a_2}\cdot{}_1\ddot{\mathbf{u}} & {}_2\mathbf{X}_{a_2}\cdot{}_2\ddot{\mathbf{u}} & \cdots & {}_{N_{sub}}\mathbf{X}_{a_2}\cdot{}_{N_{sub}}\ddot{\mathbf{u}} \\ \vdots & \vdots & \ddots & \vdots \\ {}_1\mathbf{X}_{a_K}\cdot{}_1\ddot{\mathbf{u}} & {}_2\mathbf{X}_{a_K}\cdot{}_2\ddot{\mathbf{u}} & \cdots & {}_{N_{sub}}\mathbf{X}_{a_K}\cdot{}_{N_{sub}}\ddot{\mathbf{u}} \end{bmatrix}, \qquad (8)$$

where $\mathbf{b} \in \mathbb{R}^{K \cdot N_p \times 1}$. The closed-form solution of (7) is $\mathbf{w} = (\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top\mathbf{b}$. The inter-subject SSVEP templates can then be obtained using (6).
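The least-squares construction of (7)-(8) can be sketched as follows (a minimal NumPy sketch assuming templates are stored as `(Np, Nch)` arrays; names are illustrative, not the authors' code). The returned `w` solves (7) and can be plugged into (6):

```python
import numpy as np

def inter_subject_weights(target_templates, source_templates, target_filter, source_filters):
    """Solve the MLR problem of Eqs. (7)-(8).

    target_templates: K arrays of shape (Np, Nch) for the target subject.
    source_templates: list of Nsub lists, each with the same K templates of one source.
    target_filter / source_filters: intra-subject spatial filters from Eq. (3).
    Returns the least-squares weight vector w of shape (Nsub,).
    """
    # b: target subject's K spatially filtered templates, concatenated (Eq. 8)
    b = np.concatenate([X @ target_filter for X in target_templates])  # (K*Np,)
    # A: one column per source subject, stacking that subject's filtered templates
    A = np.column_stack([
        np.concatenate([X_k @ u_n for X_k in templates_n])
        for templates_n, u_n in zip(source_templates, source_filters)
    ])  # (K*Np, Nsub)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w
```

On a synthetic target whose filtered templates are an exact mixture of two sources, the recovered weights match the mixing coefficients.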
[Fig. 2 about here: the target subject's spatially filtered SSVEP template is approximated by the weighted summation (weight vector $\mathbf{w}$) of the source subjects' spatially filtered SSVEP templates.]

Fig. 2. Illustration of the inter-subject SSVEP template. The target subject's spatially filtered SSVEP template and the weighted summation of the source subjects' spatially filtered SSVEP templates are similar, where the weight vector $\mathbf{w}$ is determined by (7).
B. SSVEP Recognition Algorithms with Subject Transfer

In this study, we propose to apply the transferred parameters (i.e., the intra-subject spatial filter and the inter-subject SSVEP template) in the CCA-based SSVEP recognition algorithms, see Fig. 1(b), leading to the subject transfer CCA (stCCA). Thereby, the corresponding $N_f$ features for target recognition can be computed by

$$\rho_k = \sum_{i=1}^{2} \operatorname{sign}(r_{k,i}) \cdot r_{k,i}^2, \qquad (9)$$

where $r_{k,1} = \operatorname{corr}(\mathbf{X}\ddot{\mathbf{u}}, \mathbf{Y}_k\ddot{\mathbf{v}})$ and $r_{k,2} = \operatorname{corr}(\mathbf{X}\ddot{\mathbf{u}}, \tilde{\mathbf{x}}_k)$, and $\operatorname{corr}(\mathbf{x}, \mathbf{y})$ computes the Pearson correlation between the two vectors $\mathbf{x}$ and $\mathbf{y}$. Finally, the stimulus frequency $f_c$ corresponding to the largest feature $\rho_c$ is recognized, i.e.,

$$c = \arg\max_k \{\rho_k\}. \qquad (10)$$
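The decision rule of (9)-(10) can be sketched as below (an illustrative NumPy sketch assuming the transferred filters and templates are already computed; not the authors' implementation):

```python
import numpy as np

def stcca_decide(X, u, v, refs, templates):
    """stCCA decision rule of Eqs. (9)-(10).

    X: test trial (Np, Nch); u, v: transferred filters from Eq. (3);
    refs: list of reference signals Y_k (Np, 2*Nh);
    templates: list of inter-subject templates x_k (Np,) from Eq. (6).
    Returns the 0-based index c of the recognized stimulus.
    """
    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]

    scores = []
    for Y_k, x_k in zip(refs, templates):
        # signed squared correlations with reference and transferred template
        r1 = corr(X @ u, Y_k @ v)
        r2 = corr(X @ u, x_k)
        scores.append(np.sign(r1) * r1 ** 2 + np.sign(r2) * r2 ** 2)
    return int(np.argmax(scores))
```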
IV. EXPERIMENTAL STUDY

In this experimental study, we aim to i) explore the similarity between the model parameters learned from the target subject's full data and those learned from a little of the target subject's data plus the source subjects' data, ii) evaluate the stCCA performance under different parameters, and iii) compare different target recognition algorithms (e.g., the sCCA, the ttCCA, the eCCA, the msCCA, and the stCCA, among others) in terms of their ITRs and calibration efforts.
A. SSVEP Datasets

Two public SSVEP datasets (termed Dataset I and Dataset II hereafter), prepared by the Tsinghua group [17] and the UCSD SCCN group [18], are adopted in this study. In Dataset I, 35 subjects participated in an SSVEP-based BCI experiment using a cue-guided target selection task. The experiment consisted of 6 blocks; in each block, the subject was requested to gaze at 40 visual stimuli in a random order. Their 64-channel EEG signals were recorded; here, only the EEG signals from Pz, PO5, PO3, POz, PO4, PO6, O1, Oz, and O2 are used for the offline data analysis. In Dataset II, 10 subjects' 8-channel EEG signals were recorded while they carried out a similar SSVEP-based BCI experiment. This experiment included 15 blocks; in each block, the subject was asked to gaze at 12 visual stimuli in a random order. For more information about Datasets I and II, please refer to [17] and [18], respectively. In summary, $N_f = 40$ and $N_{block} = 6$ in Dataset I, and $N_f = 12$ and $N_{block} = 15$ in Dataset II.
B. Data Pre-processing

The widely used data pre-processing procedure in [6], [7], [23] is performed in this study. First, the filter-bank analysis approach is employed for all target recognition methods; it extracts meaningful features from different sub-band components of the EEG signals to improve the accuracy of SSVEP recognition [23]. The basic principle is to compute the features $\rho_k^{(n_b)}$ from the $N_b$ sub-band components of the EEG signals, respectively, and then apply a weighted sum of the features over the sub-bands for the final target recognition. Namely, (10) can be rewritten as

$$c = \arg\max_k \sum_{n_b=1}^{N_b} \alpha(n_b) \cdot \rho_k^{(n_b)}, \qquad (11)$$

where $\alpha(n_b) = n_b^{-1.25} + 0.25$ and $N_b = 5$ [23]. Specifically, each trial of EEG data is decomposed into 5 sub-bands through 5 bandpass filters, whose lower cut-off frequencies are 8 Hz, 16 Hz, 24 Hz, 32 Hz, and 40 Hz, respectively, and whose upper cut-off frequency is 90 Hz. More details can be found in [6], [7], [23]. Second, each trial of EEG data is segmented from $\tau_L$ to $\tau_L + T_w$ to exclude the latency that is useless for SSVEP recognition [6], where time 0 indicates the stimulus onset, $T_w$ denotes the time-window length, and $\tau_L$ is the SSVEP latency ($\tau_L = 0.14$ s). After pre-processing, the filtered and segmented EEG data are ready for the following analysis.
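The sub-band weighting of (11) can be sketched as follows (an illustrative helper, assuming the per-band features $\rho_k^{(n_b)}$ have already been computed by the recognition method):

```python
import numpy as np

def subband_weights(n_bands=5):
    """Sub-band weights alpha(n_b) = n_b^(-1.25) + 0.25 from Eq. (11) [23]."""
    return np.array([nb ** -1.25 + 0.25 for nb in range(1, n_bands + 1)])

def combine_subband_features(rho):
    """Weighted sum over sub-bands, then argmax over stimuli (Eq. 11).

    rho: array of shape (N_b, N_f) with one feature per sub-band and stimulus.
    Returns the 0-based index c of the recognized stimulus.
    """
    alpha = subband_weights(rho.shape[0])
    return int(np.argmax(alpha @ rho))
```

Note that the first sub-band receives the largest weight ($\alpha(1) = 1.25$), reflecting that the fundamental band carries most of the SSVEP energy.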
C. Offline Data Analysis

1) Target Subject Spatial Filter and Transferred Spatial Filter: To explore whether the target subject's spatial filter (i.e., the msCCA spatial filter) and the transferred spatial filter (i.e., the intra-subject spatial filter) have similar spatial filtering performance, we compare the similarity between the spatial patterns of the spatially filtered signals obtained with these two spatial filters. We do not directly compare the spatial filters themselves because such weight vectors have no physical meaning, according to [24]. First, the msCCA spatial filters and the intra-subject spatial filters are calculated using (3) and (4), where the data length is 1 s, $\mathbf{a} = \{k - k_1, \cdots, k, \cdots, k + k_2\}$ with $k_1 + k_2 + 1 = 12$ ($k_1, k_2 \ge 0$) for the msCCA spatial filter [12], and $\mathbf{a}$ is determined by the three strategies A1, A2, and A3 in (5) for the intra-subject spatial filter. Then the spatial pattern $\mathbf{P}$ is computed via $\mathbf{P} = \Sigma^{1/2}\mathbf{u}$, where $\Sigma = \mathbf{X}^\top\mathbf{X}$. When $\mathbf{X}$ is composed of the SSVEP templates corresponding to 12 neighboring frequencies for learning the msCCA spatial filter, the resulting pattern is denoted $\mathbf{P}_0$. When $\mathbf{X}$ is composed of the $K$ selected SSVEP templates for learning the intra-subject spatial filter, three resulting patterns $\mathbf{P}_{A1}$, $\mathbf{P}_{A2}$, and $\mathbf{P}_{A3}$, corresponding to the three selection strategies, are obtained, respectively. To sum up, each subject's spatial patterns are computed using his/her SSVEP templates, and each SSVEP template is calculated using all his/her calibration data, i.e., $N_{train} = 6$ in Dataset I and $N_{train} = 15$ in Dataset II. Note that the calculation of the spatial patterns utilizes only the target subject's data, without other subjects' data. We then compute the correlation coefficients between $\mathbf{P}_0$ and $\mathbf{P}_{A1}$, between $\mathbf{P}_0$ and $\mathbf{P}_{A2}$, and between $\mathbf{P}_0$ and $\mathbf{P}_{A3}$ to evaluate their similarities for different $K$. Finally, we calculate the similarity averaged across all $N_{sub}$ subjects; the corresponding results are shown in Fig. 3.
2) Target Subject Template and Transferred Template: In this study, the transferred SSVEP template is constructed by linearly combining the source subjects' SSVEP templates. When constructing a transferred template, we need to consider whether i) the linear combination is weighted or not, and ii) spatial filtering is applied or not, which yields four different types of transferred SSVEP templates (see Table II). To explore which transferred SSVEP template is most similar to the target subject's SSVEP template, we compare the similarity between the target subject's SSVEP template and the different types of transferred SSVEP templates. Specifically, this comparison is divided into two cases: Case I (with spatial filtering) and Case II (without spatial filtering). Table II lists the corresponding target subject's SSVEP templates ($\mathbf{Z}_{tar}$) and transferred SSVEP templates ($\mathbf{Z}_{tran}$). We then calculate the correlation coefficients between $\mathbf{Z}_{tran}$ and $\mathbf{Z}_{tar}$. Note that the data length is 1 s, $\ddot{\mathbf{u}}$ is the intra-subject spatial filter of the target subject, ${}_n\ddot{\mathbf{u}}$ is that of the source subject $S_n$ computed using (3), and $w_n$ is the weight learned from the target subject's calibration data; in particular, we can find $w_n$ using (7). In Case I, $\mathbf{A}$ is the concatenation of the source subjects' spatially filtered SSVEP templates and $\mathbf{b}$ is the concatenation of the target subject's spatially filtered SSVEP templates, as indicated in (8). As there is no spatial filtering in Case II, we construct $\mathbf{A}$ and $\mathbf{b}$ differently:

$$\mathbf{b} = \left[\operatorname{vec}(\mathbf{X}_{a_1})^\top, \cdots, \operatorname{vec}(\mathbf{X}_{a_K})^\top\right]^\top,$$

$$\mathbf{A} = \begin{bmatrix} \operatorname{vec}({}_1\mathbf{X}_{a_1}) & \cdots & \operatorname{vec}({}_{N_{sub}}\mathbf{X}_{a_1}) \\ \operatorname{vec}({}_1\mathbf{X}_{a_2}) & \cdots & \operatorname{vec}({}_{N_{sub}}\mathbf{X}_{a_2}) \\ \vdots & \ddots & \vdots \\ \operatorname{vec}({}_1\mathbf{X}_{a_K}) & \cdots & \operatorname{vec}({}_{N_{sub}}\mathbf{X}_{a_K}) \end{bmatrix}, \qquad (12)$$

where $\operatorname{vec}(\mathbf{X}) \in \mathbb{R}^{N_p N_{ch} \times 1}$ vectorizes the multi-channel signal $\mathbf{X} \in \mathbb{R}^{N_p \times N_{ch}}$. Finally, we obtain the similarities for the different cases, different subjects, and different $K$, and then average them across the $N_{sub}$ subjects (Fig. 4).
TABLE II
TRANSFERRED SSVEP TEMPLATE $\mathbf{Z}_{tran}$ AND TARGET SSVEP TEMPLATE $\mathbf{Z}_{tar}$ IN EXP. 2)

Case | $\mathbf{Z}_{tran}$ | $\mathbf{Z}_{tar}$ | Correlation
I | $\sum_{n=1}^{N_{sub}} w_n \cdot {}_n\mathbf{X}_k \cdot {}_n\ddot{\mathbf{u}}$ or $\sum_{n=1}^{N_{sub}} {}_n\mathbf{X}_k \cdot {}_n\ddot{\mathbf{u}}$ | $\mathbf{X}_k\ddot{\mathbf{u}}$ | $\operatorname{corr}(\mathbf{Z}_{tran}, \mathbf{Z}_{tar})$
II | $\sum_{n=1}^{N_{sub}} w_n \cdot {}_n\mathbf{X}_k$ or $\sum_{n=1}^{N_{sub}} {}_n\mathbf{X}_k$ | $\mathbf{X}_k$ | $\operatorname{corr}(\mathbf{Z}_{tran}, \mathbf{Z}_{tar})$
3) Parameter Exploration in the stCCA: As the model parameters in the stCCA are learned from the calibration data of both the source subjects and the target subject, it is interesting to evaluate how the stCCA performance is affected by the parameters that govern how much source-subject and target-subject data the stCCA uses, e.g., the data length $T_w$, the number of training trials $N_{trial}$, and the number of source subjects $N_{sub}$. In this study, we calculate the accuracy and the ITR of the stCCA under different $T_w$, $N_{trial}$, and $N_{sub}$ in Datasets I and II. The ITR is computed by

$$ITR = \left[\log_2(N_f) + P\log_2(P) + (1-P)\log_2\left(\frac{1-P}{N_f-1}\right)\right] \times \frac{60}{T}, \qquad (13)$$

where $P$ is the recognition accuracy and $T$ is the time for each detection ($T = T_w + T_s$, where $T_s$ is the time for the subject to shift his/her visual attention between two consecutive trials). Here $T_s = 0.5$ s, and the details of the parameter settings can be found in Table III. To obtain a general performance estimate of the stCCA, we perform a 50-fold cross-validation scheme. In each fold, $N_{sub}$ source subjects are randomly selected for subject transfer; all subjects except the target subject can act as source subjects. Besides, since $N_{train} = 1$ here, we can cover all possible combinations of training and testing data. Specifically, we choose $K$ trials of the target subject's data from one specified block (i.e., 1 trial per stimulus and only $K$ available stimuli) for training, and the data from the remaining $(N_{block} - 1)$ blocks are used for testing. Like a leave-one-out approach, we repeat six rounds on Dataset I and 15 rounds on Dataset II to calculate a general performance, respectively, with the training data coming from a different block in each round.
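The ITR of (13) can be computed as follows (an illustrative helper; the degenerate accuracies $P = 0$ and $P = 1$ are handled via the limits of the entropy terms):

```python
import math

def itr_bits_per_min(n_f, p, t_detect):
    """ITR of Eq. (13) in bits/min for n_f targets, accuracy p,
    and detection time t_detect = T_w + T_s in seconds."""
    bits = math.log2(n_f)
    if 0 < p < 1:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n_f - 1))
    elif p == 0:
        # limit of p*log2(p) is 0; the misclassification term remains
        bits += math.log2(1 / (n_f - 1))
    # at p == 1 only log2(n_f) survives
    return bits * 60 / t_detect
```

For example, a 40-target speller at 100% accuracy with $T = 1.0$ s gives $60\log_2(40) \approx 319.3$ bits/min, which is the theoretical ceiling for that configuration.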
TABLE III
PARAMETER SETTINGS IN EXP. 3) FOR DATASETS I AND II

Dataset | $T_w$ (s) | $N_{trial}$ | $N_{sub}$
I | 0.5, 1.0, 1.5, 2.0 | 10, 20, 30, 40 | 5, 15, 20, 25, 30, 34
II | 0.5, 1.0, 1.5, 2.0 | 3, 6, 9, 12 | 2, 4, 6, 8, 9

† $N_{trial} = K \times N_{train}$ and $N_{train} = 1$.
4) Performance Comparison when Insufficient Calibration
Data: Based on the parameter exploration in Section IV-
D 2), it can be expected that the stCCA could obtain the
highest ITR when 0.5≤Tw≤1s and Nsub does not affect
the performance very much in many cases. This study firstly
evaluates the stCCA’s ITR when Tw= 0.5,0.6,··· ,1s, and
Nsub = 34 in Dataset I and 9 in Dataset II. In particular,
the number of training trials (Ntrial) is kept below Nf(i.e.
Ntrial =K×Ntrain ≤Nf) for the stCCA to simulate the
case of insufficient calibration data from target subject.
Then a performance comparison between different methods
is conducted. We compare the ITRs of the sCCA, the ttCCA,
the eCCA, the msCCA, the eTRCA, and the stCCA in
Datasets I and II. For a fair comparison, the optimal Tw is
selected for each competitor; due to space limitations, the
details of the Tw selection are given in the Supplementary
file. The results indicate that the competitors achieve their
highest ITRs in Dataset I when Tw = 1.3 for the sCCA,
Tw = 1.0 for the ttCCA, Tw = 0.9 for the eCCA, Tw = 0.8
for the msCCA, and Tw = 0.8 for the eTRCA, and in
Dataset II when Tw = 1.7 for the sCCA, Tw = 1.0 for the
ttCCA, Tw = 1.0 for the eCCA, Tw = 0.8 for the msCCA,
and Tw = 0.5 for the eTRCA. Meanwhile, the competitors
are restricted to the minimally required calibration data to
simulate the case of scarce calibration data from the target
subject: the minimal Ntrial is 2Nf for the eTRCA and Nf for
the eCCA (or msCCA) [12], while the sCCA and the ttCCA
require no calibration data from the target subject.
To obtain general ITRs, we perform a 20-fold cross-validation
scheme. In each fold, Ntrain trials are randomly selected from
the Nblock trials as the training data and the remaining
(Nblock − Ntrain) trials serve as the testing data; thereby,
1 ≤ Ntrain ≤ 5 in Dataset I and 1 ≤ Ntrain ≤ 14 in
Dataset II. In addition, all subjects except the target subject
are considered as source subjects, i.e., Nsub = 34 in Dataset I
and Nsub = 9 in Dataset II, when computing the ITRs of the
stCCA and the ttCCA. Meanwhile, we propose a new index,
the cost-performance ratio (CPR), to evaluate how much ITR
the stCCA, the msCCA, the eTRCA, and the eCCA gain from
Ntrial calibration trials of the target subject. First, the ITR
of the sCCA is taken as the baseline ITR. Second, the
difference between each method's ITR and the baseline ITR
is calculated, denoted ∆ITR. The CPR is then computed via
CPR = ∆ITR / Ntrial.
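As a quick illustration (using the mean ITRs reported later in Table IV; the helper name is ours), the CPR works out as follows:

```python
def cpr(itr_method, itr_baseline, n_trial):
    """Cost-performance ratio: ITR gained over the sCCA baseline per
    calibration trial from the target subject."""
    return (itr_method - itr_baseline) / n_trial

# stCCA vs. the sCCA baseline, mean ITRs from Table IV:
print(round(cpr(198.18, 120.16, 9), 2))   # Dataset I  (Ntrial = 9)  -> 8.67
print(round(cpr(111.04, 64.02, 3), 2))    # Dataset II (Ntrial = 3)  -> 15.67
```

The second value rounds to 15.67 here; the paper reports 15.66, presumably computed from unrounded means.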
5) Performance Comparison with Sufficient Calibration
Data: We also compare the stCCA with two state-of-the-art
algorithms, i.e., the msCCA and the eTRCA, when sufficient
calibration data are available, namely K = 40 and Ntrain = 5
in Dataset I and K = 12 and Ntrain = 14 in Dataset II. The
performance indices are the ITR and the accuracy.
As in previous experiments, we perform a cross-validation
scheme to produce general comparison results. Here we
employ a 'leave-one-out' approach to choose the training and
testing data: in each round, (Nblock − 1) blocks of data from
the target subject are selected for training and the remaining
block is used for testing, so that the training and testing data
do not overlap. We repeat 6 rounds in Dataset I and 15 rounds
in Dataset II for the evaluation. Note that in the stCCA,
Nsub = 34 in Dataset I and Nsub = 9 in Dataset II.
6) stCCA Using Different Transferred Templates: One of
the key distinctions between the stCCA and the existing
CCA-based methods (e.g., the eCCA) is that the stCCA
employs a transferred SSVEP template rather than the
subject's own SSVEP template. This experiment explores the
ITRs when the transferred SSVEP template is constructed in
three different ways, denoted the stCCA, the stCCA-1, and
the stCCA-2: i) the stCCA uses the weighted summation of
the source subjects' spatially filtered SSVEP templates, with
the weights determined by (7); ii) the stCCA-1 uses the
average of the source subjects' spatially filtered SSVEP
templates; and iii) the stCCA-2 uses the weighted summation
of the source subjects' (unfiltered) SSVEP templates, with
the weights computed by (7). Similar to Exp. 4), we employ
the same cross-validation scheme to evaluate the performance.
According to the results in Exp. 4), Tw = 0.7 in Dataset I and
Tw = 0.6 in Dataset II lead to the highest performance; hence
these values are used in this experiment.
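The three template constructions can be sketched as follows (shapes and the weight vector w are illustrative; in the paper w is obtained from (7), which is not reproduced here):

```python
import numpy as np

def transferred_templates(src_templates, filters, w):
    """Three ways to build a transferred SSVEP template from the source
    subjects, as compared in Exp. 6). Shapes are illustrative:
    src_templates: (n_sub, n_channel, n_sample) per-subject templates
    filters:       (n_sub, n_channel) per-subject spatial filters
    w:             (n_sub,) weight vector (an input here; the paper
                   learns it via (7))
    """
    # spatially filtered source templates, shape (n_sub, n_sample)
    filtered = np.einsum('sc,scn->sn', filters, src_templates)
    stcca = w @ filtered             # weighted sum, filtered   (stCCA)
    stcca1 = filtered.mean(axis=0)   # plain average, filtered  (stCCA-1)
    # weighted sum of the unfiltered templates                  (stCCA-2)
    stcca2 = (w @ src_templates.reshape(len(w), -1))
    stcca2 = stcca2.reshape(src_templates.shape[1:])
    return stcca, stcca1, stcca2
```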
D. Results
1) Target Subject Spatial Filter and Transferred Spatial
Filter: Fig. 3 shows the averaged similarity across all Nsub
subjects, i.e., the averaged similarity between the spatial
pattern of the target subject's spatial filter (P0) and those of
the transferred spatial filters learned from K templates using
the three different selection strategies (PA1, PA2, and PA3),
indicated in red, blue, and green, respectively. Apparently, all
three patterns PA1, PA2, and PA3 have high similarity with
P0 (> 0.8) in many cases, which implies that the transferred
spatial filters can perform similar spatial filtering to the target
subject's spatial filter even if K is much less than Nf.
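The similarity index itself is defined earlier in the paper and not reproduced in this excerpt; a common, sign-invariant choice for comparing two spatial patterns is the absolute cosine similarity, sketched here as an assumption:

```python
import numpy as np

def pattern_similarity(p0, p1):
    """Absolute cosine similarity between two spatial patterns.
    Sign-invariant, since a spatial filter is defined only up to scale
    and sign. This metric is an illustrative assumption; the paper
    defines its own index."""
    p0, p1 = np.ravel(p0), np.ravel(p1)
    return abs(p0 @ p1) / (np.linalg.norm(p0) * np.linalg.norm(p1))

print(pattern_similarity([1.0, 2.0], [-2.0, -4.0]))  # parallel patterns -> 1.0
```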
A two-way repeated-measures ANOVA is conducted to study
the similarity under different selection strategies and different
K. The results indicate that i) the strategy has a significant
effect on Dataset I (F(2, 68) = 6.556, p = 0.002) but no
significant effect on Dataset II (F(2, 18) = 0.780, p = 0.473);
ii) K has a significant effect (Dataset I: F(39, 1326) = 33.240,
p < 0.001; Dataset II: F(11, 99) = 6.168, p < 0.001);
and iii) their interaction has a significant effect on Dataset I
(F(78, 2652) = 4.858, p < 0.001) but not on Dataset II
(F(22, 198) = 0.520, p = 0.964). Pairwise comparisons
indicate that strategy A2 leads to significantly different results
on Dataset I, while the three strategies do not generate
statistically different results on Dataset II. Therefore, we
select strategy A2 in the following study.
2) Target Subject Template and Transferred Subject Tem-
plate: Fig. 4 compares which type of transferred SSVEP
template is the most similar to the target subject's templates
in two cases (Case I: SSVEP template with spatial filtering;
Case II: SSVEP template without spatial filtering). The x-axis
indicates how many SSVEP templates from the target subject
are used to learn the spatial filter in (3) and the weight vector in
(7).

Fig. 3. Similarity between the spatial patterns of the target subject's spatial
filter and the transferred spatial filter under different K and different selection
strategies. Error bars indicate the standard errors.

The transferred SSVEP templates are either the weighted
summation of the source subjects’ SSVEP templates (in blue
color) or the average of the source subjects’ SSVEP templates
(in red color). Note that the similarities in red are insensitive
to K in Case II, because the transferred SSVEP templates in
that case involve neither spatial filtering nor weighted
summation.
Apparently, the source subjects' SSVEP templates with
spatial filtering are more similar to the target subject's SSVEP
template than those without spatial filtering: the similarity
exceeds 0.6 in Case I but stays below 0.5 in Case II. This
implies that the transferable knowledge among inter-subject
SSVEP templates lies in a low-dimensional subspace.
Furthermore, the weighted summation of the source subjects'
spatially filtered SSVEP templates usually has higher
similarity than their average in Case I, whereas the weighted
summation of the source subjects' SSVEP templates usually
has lower similarity than their average in Case II. Therefore,
we construct the transferred SSVEP template by linearly
combining the source subjects' spatially filtered SSVEP
templates.
Fig. 4. Similarity between the target subject's SSVEP templates and the
transferred SSVEP templates, constructed in two different cases (i.e., with or
without spatial filtering), as explained in Table II. Red indicates that the
transferred SSVEP template is constructed using the weighted summation of
the source subjects' SSVEP templates, while blue indicates the average of the
source subjects' SSVEP templates. Error bars describe the standard errors.
∗ denotes p < 0.05, ⋆ denotes p < 0.01, and the hexagram denotes p < 0.001
according to the paired t-test results.
3) Parameter Exploration in the stCCA: Figs. 5 and 6
illustrate how the parameters (i.e., Tw, Ntrial, and Nsub)
affect the stCCA performance (accuracy and ITR) in
Datasets I and II. Fig. 5 (a) shows the average performance
(upper row: accuracy; lower row: ITR) of the stCCA across
different Nsub and Ntrial (red), across different Tw and Nsub
(blue), and across different Tw and Ntrial (magenta) for
Dataset I; the corresponding results for Dataset II are given
in Fig. 5 (b). Tw and Ntrial are the two parameters that
directly determine how much of the target subject's data the
stCCA can learn from, while Nsub determines how much
source-subject data it can utilize. In addition, Tw is also one
of the parameters in the ITR calculation, see (13). Thereby,
Tw and Ntrial may affect the stCCA performance much more
than Nsub. When Nsub > 5 in Dataset I or Nsub > 2 in
Dataset II, the stCCA performance appears to reach its upper
limit. These results imply that the stCCA can achieve the
highest ITR when Tw is chosen between 0.5 s and 1 s, Ntrial
is as large as possible, and Nsub > 5 in Dataset I or Nsub > 2
in Dataset II.
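Equation (13) is outside this excerpt; the ITR in this line of work is the standard Wolpaw formula evaluated over the per-selection time, commonly Tw plus a 0.5 s gaze-shift interval (the gaze-shift value here is an assumption):

```python
from math import log2

def itr(n_class, p, t_select):
    """Wolpaw ITR in bits/min for an n_class speller with accuracy p and
    a per-selection time t_select (seconds)."""
    if p <= 1.0 / n_class:   # at or below chance level: no information
        return 0.0
    bits = log2(n_class)
    if p < 1.0:
        bits += p * log2(p) + (1 - p) * log2((1 - p) / (n_class - 1))
    return bits * 60.0 / t_select

# e.g. a 40-class speller at 90% accuracy, Tw = 0.7 s plus 0.5 s gaze shift:
print(round(itr(40, 0.9, 0.7 + 0.5), 1))  # -> about 216.2 bits/min
```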
To further investigate the effect of the parameters, especially
Nsub and Ntrial, we plot the stCCA performance as heat-maps
in Fig. 6 (x-axis: Nsub; y-axis: Ntrial; color: performance),
where the maximal performance in each row is normalized
to 1 and highlighted by a yellow circle. The results suggest
selecting a large Nsub when much calibration data is available
and a small Nsub otherwise. However, we stress that Nsub
does not have a large effect on the performance once Nsub > 5
in Dataset I or Nsub > 2 in Dataset II. Hence, Nsub can
simply be set to 34 in Dataset I and 9 in Dataset II.
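The row-wise normalization behind the Fig. 6 heat-maps amounts to dividing each row of the performance matrix by its own maximum:

```python
import numpy as np

def normalize_rows(perf):
    """Scale each row (one Ntrial setting) of a performance matrix so
    that its best entry across Nsub becomes 1, as in the Fig. 6
    heat-maps."""
    perf = np.asarray(perf, dtype=float)
    return perf / perf.max(axis=1, keepdims=True)

m = normalize_rows([[100.0, 150.0, 200.0],
                    [ 90.0, 180.0, 180.0]])
print(m.max(axis=1))  # every row now peaks at 1.0
```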
Fig. 5. Bar charts of the stCCA performance with different Tw, Ntrial, and
Nsub on (a) Dataset I and (b) Dataset II.
Fig. 6. Heat-maps of the stCCA performance under different Ntrial and Nsub
on (a) Dataset I and (b) Dataset II.
4) Performance Comparison with Insufficient Calibration
Data: Fig. 7 shows the ITR of the stCCA under different
Ntrain, K, and Tw. Note that the total number of calibration
trials does not exceed Nf, i.e., Ntrial = K × Ntrain ≤ Nf, as
stated in Section IV-C. The stCCA achieves its highest ITR
when Tw = 0.7 in Dataset I and Tw = 0.6 in Dataset II,
respectively. The ITR of the stCCA increases with K (or
Ntrain). In particular, the ITR is relatively low when K < 2,
as the knowledge across multiple stimuli cannot be utilized.
To compare the methods' performance statistically, the
paired t-test is used. Fig. 8 shows the paired t-test results
between the stCCA and each method, where all p values are
corrected using the Bonferroni method and the results are
indicated with different colors. Specifically, each subplot
shows the comparison between two methods, M1 and M2, in
which M1 represents the stCCA and M2 represents the
msCCA, the eTRCA, the eCCA, the ttCCA, or the sCCA.
Black indicates that the ITRs of M1 and M2 have no
significant difference. Blue (or red) denotes that the ITR of
M1 is significantly lower (or higher) than that of M2, in
which the dark, normal, and light blue (or red) colors denote
significance levels of 0.001, 0.01, and 0.05, respectively. First
of all, the stCCA outperforms the ttCCA and the sCCA in
most cases in Datasets I and II. Second, the stCCA does not
perform worse than the eCCA when Ntrial > 3 in Dataset I
and in all cases in Dataset II. Third, the stCCA does not
perform worse than the eTRCA when Ntrial > 4 in Dataset I
and provides a similar ITR to the eTRCA in Dataset II.
Finally, the stCCA is not inferior to
the msCCA when Ntrial is not too small (such as when K = 9
and Ntrain = 1 in Dataset I, and when K > 2 in Dataset II).

Fig. 7. The ITR of the stCCA with different Ntrial and different Tw tested
on (a) Dataset I and (b) Dataset II. Note that Ntrial = K × Ntrain.
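The Bonferroni step used throughout these comparisons is simple to state in code (a sketch; the helper names are ours):

```python
def bonferroni(p_values):
    """Bonferroni correction: multiply each raw p-value by the family
    size m and clip at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def significance_level(p):
    """Map a corrected p-value onto the three levels used in Fig. 8
    (0.001, 0.01, 0.05); None means no significant difference."""
    for level in (0.001, 0.01, 0.05):
        if p < level:
            return level
    return None

corrected = bonferroni([0.004, 0.20, 0.0001])
print(significance_level(corrected[0]))  # 0.012 -> significant at 0.05
print(significance_level(corrected[1]))  # 0.6   -> None
```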
To make the stCCA attain an ITR as high as the state-of-the-art
algorithms while requiring as little calibration data as
possible, the following parameters are selected: Tw = 0.7 and
Ntrial = 9 in Dataset I, and Tw = 0.6 and Ntrial = 3 in
Dataset II.
Based on the results for Tw = 0.7 in Dataset I and Tw = 0.6
in Dataset II from Fig. 7, Fig. 9 shows how the stCCA's ITR
changes with Ntrial (blue solid line). Each blue dot is the
average of the stCCA's ITRs corresponding to the same
Ntrial. Since the stCCA always performs poorly when K = 1,
those results (the first column of Fig. 7) are excluded from
the calculation, except the case K = 1 and Ntrain = 1.
Overall, the ITR grows with Ntrial; in particular, the stCCA's
ITR increases rapidly with Ntrial when Ntrial is small.
In addition to the stCCA, the optimal ITRs of the other
methods using the minimal number of the target subject's
calibration trials are also given for comparison; their optimal
Tw and minimal Ntrial are listed in Table IV. Clearly, the
msCCA method outperforms the sCCA, the ttCCA, the
eCCA, and the eTRCA. The stCCA achieves an ITR as high
as the msCCA while requiring fewer of the target subject's
calibration trials (Ntrial < Nf).
The above results do not yet expose the calibration cost of
the state-of-the-art recognition methods, especially the eCCA,
the eTRCA, and the msCCA.
The scatterplots in Fig. 10 exhibit the relationship between
the ITR and the number of calibration data (or Ntrial) in
Dataset I and II, respectively. Different symbols denote the
different methods (eCCA, eTRCA, msCCA, and stCCA; the
stCCA is marked by ×).

Fig. 8. ITR comparison between the stCCA (M1) and the other methods
(M2) on Dataset I (a) and Dataset II (b). Different colors describe the paired
t-test results after Bonferroni correction. Red and blue denote that the ITR of
the stCCA is significantly higher and lower than the competitor's,
respectively, while black indicates no significant difference. Dark, normal,
and light red (or blue) colors denote significance levels of 0.001, 0.01, and
0.05, respectively.

Fig. 9. The stCCA's ITRs along different Ntrial in Dataset I (left) and
Dataset II (right). The x-axis indicates the number of calibration trials Ntrial
that the stCCA requires. The ttCCA and the sCCA do not require calibration
data (Ntrial = 0). The minimal number of calibration trials that the eCCA,
the msCCA, and the eTRCA require is not less than Nf: specifically, Nf for
the msCCA and the eCCA, and 2Nf for the eTRCA. The shaded areas
indicate the standard error.

TABLE IV
DIFFERENT METHODS' ITRS UNDER THE SELECTED Ntrial AND Tw

Dataset  Method  Tw (s)  Ntrain  K    ITR (bits/min), Mean±S.D.
I        sCCA    1.3     0       -    120.16±44.15
I        ttCCA   1.0     0       -    145.83±53.85
I        eCCA    0.9     1       40   167.00±55.35
I        eTRCA   0.8     2       40   180.02±61.25
I        msCCA   0.8     1       40   201.28±54.96
I        stCCA   0.7     1       9    198.18±59.12
II       sCCA    1.7     0       -    64.02±30.93
II       ttCCA   1.0     0       -    92.80±47.95
II       eCCA    1.0     1       12   84.55±41.11
II       eTRCA   0.5     2       12   105.43±62.95
II       msCCA   0.8     1       12   118.87±45.40
II       stCCA   0.6     1       3    111.04±57.24

1 Ntrial = K × Ntrain.
2 For the ttCCA and the stCCA, Nsub = 34 in Dataset I and Nsub = 9 in
Dataset II.
stCCA). For the stCCA, we only show the ITRs when K > 1
and Ntrial ≥5in Dataset I (or K > 1and Ntrial ≥3in
Dataset II) and the ×in red color represents that the ITR is
not significantly lower than the ITR of the msCCA according
to the paired t-test after Bonferroni correction. Apparently,
the proposed stCCA providing high ITR (as similar as the
msCCA) only requires little calibration data, and consequently
its CPR should be the best among them. Fig. 11 shows the
mean and standard deviation of the CPR of the msCCA, the
eTRCA, the eCCA, and the stCCA. For the stCCA, CPR=8.67
when Ntrain = 1 and K= 9 in Dataset I and ii) CPR=15.66
when Ntrain = 1 and K= 3 in Dataset II, which are also
highlighted in Fig. 10. Paired t-test results indicate that the
CPR in the stCCA is the highest. Finally, all the subjects’
ITRs using the stCCA methods (in black color), the msCCA
method (in red color), the eTRCA method (in blue color),
and the eCCA method (in green color) based on the selected
parameters are plotted in Fig. 12, respectively.
All these results indicate that using inter-subject and
intra-subject knowledge can considerably reduce the
calibration effort while retaining high ITRs.
Fig. 10. Scatterplot of the ITR versus Ntrial for the SSVEP recognition
methods in Dataset I (left) and Dataset II (right). Red crosses denote that the
ITRs of the stCCA are not significantly lower than the ITR of the msCCA
according to the paired t-test results after Bonferroni correction. The ITR of
the stCCA is highlighted by a light blue circle when Ntrain = 1 and K = 9
in Dataset I and when Ntrain = 1 and K = 3 in Dataset II.
Fig. 11. Mean and standard deviation of the CPR of the msCCA, the eTRCA,
the eCCA, and the stCCA in Dataset I (left) and Dataset II (right). The
hexagram denotes p < 0.001 according to the paired t-test results.
5) Performance Comparison with Sufficient Calibration
Data: Fig. 13 compares the stCCA with the msCCA and with
the eTRCA in terms of the ITR and the accuracy, respectively,
when all available calibration data from the target subject are
used, i.e., K = 40 and Ntrain = 5 in Dataset I, and K = 12
and Ntrain = 14 in Dataset II. On one hand, the stCCA
provides higher performance than the msCCA or the eTRCA
in many cases in Dataset I. On the other hand, the stCCA
only provides performance similar to the msCCA or the
eTRCA in many cases in Dataset II. This is to be expected:
the proposed subject transfer approach is beneficial mainly
when the calibration data from the target subject are limited
(Ntrain = 14 in Dataset II is much larger than 5 in Dataset I).
The knowledge transferred from the source subjects cannot
replace the knowledge from the target subject.
6) stCCA Using Different Transferred Templates: Fig. 14
compares the ITRs of the stCCA, the stCCA-1, and the
stCCA-2 under different Ntrain and K, where Tw = 0.7 in
Dataset I and Tw = 0.6 in Dataset II. The stCCA-1 performs
better than the stCCA-2, and the stCCA outperforms both in
most cases according to the statistical results in Fig. 14 (b).
These results indicate that the transferred SSVEP templates
should be constructed as the weighted summation of the
source subjects' spatially filtered SSVEP templates, with the
weight vector learned from the target subject's and the source
subjects' calibration data.
V. DISCUSSION
To learn the subject-specific and class-specific parameters
(such as the spatial filters and the SSVEP templates), the
traditional supervised learning scheme usually requires
massive calibration data from a new subject. In contrast, the
proposed subject transfer learning scheme requires only a
little calibration data from the new subject, learning the
parameters by leveraging the calibration data from existing
subjects. With a closer look at the proposed idea, it can be
found that i) a few calibration data from the new subject
provide the subject-specific knowledge (e.g., the spatial filter
in (3) is the common knowledge shared across different
classes), ii) the calibration data from existing subjects
contribute the class-specific knowledge (e.g., the existing
subjects' SSVEP templates in (6) are the
common knowledge for different new subjects), and iii) the
relationship between the new subject's SSVEP templates and
the existing subjects' SSVEP templates provides another
piece of subject-specific knowledge (i.e., the weight vector
in (6)); see Fig. 1 (b). Consequently, transferring the
intra-subject and inter-subject knowledge simultaneously
helps to estimate the subject-specific and class-specific model
parameters from a small amount of calibration data.

Fig. 12. Comparison between the ITRs of the msCCA, the eTRCA, the
eCCA, and the stCCA for different subjects in Dataset I (left) and Dataset II
(right). The ITRs are computed with the selected parameters indicated in
Table IV.

Fig. 13. Performance comparison between the stCCA and the state-of-the-art
methods when the target subject's calibration data are sufficient, on Dataset I
(a) and Dataset II (b). ∗ denotes p < 0.05, ⋆ denotes p < 0.01, and the
hexagram denotes p < 0.001 according to the paired t-test results.
A. Comparison with Existing Recognition Methods
In principle, the major differences between the stCCA
method and the other methods that incorporate the subject's
calibration data are as follows. First, all of their spatial filters
are learned from the subject's calibration data, but the spatial
filters in the eTRCA, the eCCA, and the msCCA are
class-specific, whereas the stCCA's is class-non-specific. As
introduced in Section III, the stCCA and msCCA spatial
filters are based on the learning-across-stimuli scheme in [12].
Fig. 14. The ITRs of the stCCA, the stCCA-1, and the stCCA-2 in (a) and
their comparison results in (b). Each subplot of (b) presents the comparison
between two methods (M1 and M2). Black means no significant difference
between the ITRs of M1 and M2. Red and blue indicate that the ITR of M1
is significantly higher and lower than the ITR of M2, respectively, in which
the dark, normal, and light red (or blue) colors denote significance levels of
0.001, 0.01, and 0.05. The p values of the paired t-tests have been corrected
by the Bonferroni method.
Moreover, the stCCA spatial filter is a special case of the
msCCA spatial filter when K = Nf. Hence, when Ntrain is
small, the msCCA and the stCCA have more data for learning
than the others (i.e., K · Ntrain trials). Second, the SSVEP
templates in the msCCA, the eTRCA, and the eCCA are
learned from the target subject's calibration data only, while
those in the stCCA are learned from both the target subject's
and other subjects' calibration data. Thereby, the SSVEP
templates in the stCCA are less sensitive to Ntrial than the
others. This explains why the stCCA works well with little
calibration data, even when Ntrial < Nf. As a consequence,
the calibration data for the stCCA can be effectively reduced
while the performance is maintained.
Like the existing state-of-the-art methods (such as the
msCCA and the eTRCA), the stCCA has a low computational
cost and can easily be implemented in an online SSVEP-based
BCI.
B. Transferred SSVEP Templates
Although the concept of a transferred SSVEP template has
been introduced in related works [15], [16], it should be
pointed out that the transferred SSVEP templates in [15],
[16] are constructed without spatial filtering, and in [15] even
without the target subject's knowledge. Fig. 4 shows that the
target subject's SSVEP templates and the weighted
summation of the source subjects' SSVEP templates have
relatively high similarity in the low-dimensional subspace.
This reveals that our proposed transferred SSVEP templates
(or inter-subject SSVEP templates) are more suitable than
those in [15], [16] for subject transfer in SSVEP-based BCIs.
As an example, Fig. 15 illustrates the difference between
the target subject's spatially filtered SSVEP template Xkû
(black solid line), the inter-subject SSVEP template x̃k (red
dotted line), and Xkû (blue dotted line) in the time domain
(left) and the frequency domain (right), in which the target
subject's SSVEP template Xk is from subject S31 in Dataset I
(top) and S5 in Dataset II (bottom). Clearly, the proposed
inter-subject SSVEP template x̃k captures the key
characteristics of Xkû.

Fig. 15. Comparison between the spatially filtered SSVEP templates in the
time domain (left) and the frequency domain (right). Xk is from subject S31
in Dataset I (top) and S5 in Dataset II (bottom). In Dataset I, K = 10,
fk = 12 Hz, and Tw = 1 s; in Dataset II, K = 4, fk = 11.75 Hz, and
Tw = 1 s.
C. Parameter Selection
When a new subject uses an SSVEP-based BCI with the
stCCA method, choosing the optimal parameters (e.g., K and
Ntrain) and the selection strategy (A1, A2, or A3)
individually could lead to the best learning performance.
However, finding the optimal parameters for a new subject is
difficult, especially without any prior knowledge of that
subject. For example, in [4] an optimal channel combination
is found individually by scanning each subject's calibration
data over different electrodes, but such a scan, like a
calibration, is time-consuming and troublesome, which is
unwanted in practical use. Likewise, finding the optimal
parameters for a new subject would require a frequency scan,
which conflicts with the motivation of this study. For
convenience, we therefore fix the parameters as much as
possible across subjects. We propose three selection
strategies based on the assumption that the SSVEP templates
corresponding to different stimulus frequencies are equally
important for learning. According to our experimental results,
the SSVEP templates selected by the three strategies do not
affect the learning performance very much, so we choose
strategy A2 for all subjects. For the recognition performance,
selecting K = 9 and Ntrain = 1 for the stCCA in Dataset I
[17], and K = 3 and Ntrain = 1 in Dataset II [18], leads to
performance similar to or better than the existing algorithms.
Although these parameter settings may not fit other system
designs, some hints can be given. First, the ITR of the stCCA
does not fluctuate much even when the parameters are not
optimized; for example, Figs. 8 and 10 show that the stCCA
provides ITRs as good as the msCCA, with a high CPR,
under many parameter settings. In addition, Fig. 9 shows that
the average ITR of the stCCA saturates once Ntrial exceeds
Nf/4. Second, setting K = 1 usually leads to a low ITR since
the between-class information is not utilized (see Fig. 7).
Hence, we suggest Ntrial ≥ Nf/4 and K ≥ 2 in the general
case.
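This rule of thumb can be phrased as a tiny helper (hypothetical; it merely mirrors the suggestion Ntrial ≥ Nf/4 with K ≥ 2):

```python
import math

def suggested_calibration(n_f, n_train=1):
    """Suggest how many stimuli K the target subject should calibrate
    for an n_f-class stCCA speller: at least Nf/4 trials in total, and
    never fewer than 2 stimuli so that between-class information can be
    exploited."""
    k = max(2, math.ceil(n_f / 4 / n_train))
    return k, k * n_train  # (K, Ntrial)

print(suggested_calibration(40))  # 40-class speller -> (10, 10)
print(suggested_calibration(12))  # 12-class speller -> (3, 3)
```

Note that the paper itself settled on K = 9 for Dataset I, slightly below this bound.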
D. Calibration Effort
Learning from the subject's calibration data can enhance the
accuracy of SSVEP recognition. However, the subject has to
carry out laborious calibration trials and may then suffer
from visual fatigue [11]. It is therefore essential to reduce
the calibration effort. Our experimental results show that the
stCCA method with only 9 calibration trials (or 3 trials) can
provide an ITR comparable to that of the msCCA method with 40
calibration trials (or 12 trials) and that of the eTRCA method
with 80 calibration trials (or 24 trials) in Dataset I (or
Dataset II). In other words, with the help of intra- and
inter-subject transfer, i) the calibration trials can be
reduced by a factor of at least 4 while a high ITR is
maintained, and ii) the minimal number of calibration trials
is no longer constrained by Nf. Therefore, the calibration can
be designed in a more flexible way. For example, the subject
may gaze at several characters freely, instead of gazing at
all characters, during the calibration stage.
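The trial-count comparison above reduces to simple arithmetic. As a summary, here is a minimal sketch; the helper name is our shorthand, and the per-method minima simply restate the numbers reported above (one trial per stimulus for msCCA, two per stimulus for eTRCA, and K trials in total for stCCA):

```python
def min_calibration_trials(n_f: int, k: int) -> dict:
    """Minimal calibration trials for an n_f-stimulus speller.

    msCCA needs at least one trial per stimulus and eTRCA at least two,
    whereas stCCA only needs k trials in total, where k can be << n_f.
    """
    return {"stCCA": k, "msCCA": n_f, "eTRCA": 2 * n_f}

print(min_calibration_trials(40, 9))  # Dataset I:  {'stCCA': 9, 'msCCA': 40, 'eTRCA': 80}
print(min_calibration_trials(12, 3))  # Dataset II: {'stCCA': 3, 'msCCA': 12, 'eTRCA': 24}
```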
E. Future Work
Our ultimate goal is to develop a high-speed SSVEP-based BCI
that requires almost no calibration. To this end, several
topics would be of interest in the future. First, in the
current study the time window length (or Tw) is fixed for all
subjects and stimulus frequencies. According to several
previous studies [25]–[27], the optimal Tw usually differs
across subjects and stimuli, so a dynamic Tw could be
considered to enhance the performance. Second, after the
calibration, the model parameters are kept constant during the
online stage, and thus may not adapt to the changes in the
subject's EEG data over time. As a result, an online
adaptation strategy should be utilized to update the
model parameters adaptively and maintain reliable system
performance [15], [28]–[30]. Finally, nonlinear transfer
learning techniques (such as transfer component analysis
(TCA) [31] and joint distribution adaptation (JDA) [32]) and
regularization techniques [33] could be applied to further
improve the performance.
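As a rough illustration of the online-adaptation idea (not a method proposed in this paper or in [28]–[30]), the class templates could drift toward newly observed trials via an exponential moving average, gated by recognition confidence so that misclassified trials are not learned. The `alpha` and `threshold` values are assumptions for the sketch:

```python
import numpy as np

def update_template(template: np.ndarray, new_trial: np.ndarray,
                    confidence: float, threshold: float = 0.4,
                    alpha: float = 0.1) -> np.ndarray:
    """Exponential-moving-average update of one class's SSVEP template.

    template and new_trial are (channels x samples) arrays. The update
    is applied only when the recognition confidence (e.g. the winning
    correlation coefficient) exceeds a threshold, to avoid adapting the
    template to likely misclassified trials.
    """
    if confidence < threshold:
        return template  # low confidence: keep the template unchanged
    return (1.0 - alpha) * template + alpha * new_trial

template = np.zeros((9, 250))  # 9 channels, 1 s of data at 250 Hz
trial = np.ones((9, 250))
template = update_template(template, trial, confidence=0.8)
print(template[0, 0])          # moved 10% toward the new trial
```

Error-related-potential feedback [28] or unsupervised relabeling could serve as the confidence gate in a real system.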
VI. CONCLUSION
This study shows that transferring knowledge within a subject
and between subjects simultaneously can help the recognition
method achieve competitive performance with little calibration
effort. Our results show that the proposed stCCA method can
obtain a high ITR of 198.18±59.12 (bits/min) with only 9
calibration trials in Dataset I [17] and 111.04±57.24
(bits/min) with only 3 calibration trials in Dataset II [18].
Of particular interest is that such ITRs are not significantly
different from those of state-of-the-art methods, such as the
eTRCA and the msCCA methods, which require at least 4 times
more calibration trials. Consequently, the proposed subject
transfer approach can substantially reduce the calibration
effort while maintaining a high ITR, which would facilitate
the development of real-life SSVEP-based BCI applications.
REFERENCES
[1] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and
T. M. Vaughan, “Brain–Computer Interfaces for Communication and
Control,” Clin. Neurophysiol., vol. 113, no. 6, pp. 767–791, 2002.
[2] X. Gao, D. Xu, M. Cheng, and S. Gao, “A BCI-based Environmental
Controller for the Motion-Disabled,” IEEE Trans. Neural Syst. Rehabil.
Eng., vol. 11, no. 2, pp. 137–140, 2003.
[3] S. Gao, Y. Wang, X. Gao, and B. Hong, “Visual and Auditory Brain-
Computer Interfaces,” IEEE Trans. Biomed. Eng., vol. 61, no. 5, pp.
1436–1447, May 2014.
[4] Y. Wang, R. Wang, X. Gao, B. Hong, and S. Gao, “A Practical VEP-
Based Brain-Computer Interface,” IEEE Trans. Neural Syst. Rehabil.
Eng., vol. 14, no. 2, pp. 234–240, Jun. 2006.
[5] Y.-T. Wang, Y. Wang, and T.-P. Jung, “A Cell-Phone-based Brain–
Computer Interface for Communication in Daily Life,” J. Neural Eng.,
vol. 8, no. 2, p. 025018, 2011.
[6] X. Chen, Y. Wang, M. Nakanishi, X. Gao, T.-P. Jung, and S. Gao, “High-
Speed Spelling with a Noninvasive Brain–Computer Interface,” Proc.
Natl. Acad. Sci. U.S.A., vol. 112, no. 44, pp. E6058–E6067, 2015.
[7] M. Nakanishi, Y. Wang, X. Chen, Y. T. Wang, X. Gao, and T. P. Jung,
“Enhancing Detection of SSVEPs for a High-Speed Brain Speller Using
Task-Related Component Analysis,” IEEE Trans. Biomed. Eng., vol. 65,
no. 1, pp. 104–112, 2018.
[8] G. Bin, X. Gao, Z. Yan, B. Hong, and S. Gao, “An Online Multi-Channel
SSVEP-Based Brain-Computer Interface using a Canonical Correlation
Analysis Method,” J. Neural Eng., vol. 6, no. 4, p. 046002, 2009.
[9] H. Cecotti, “A Self-Paced and Calibration-Less SSVEP-Based Brain–
Computer Interface Speller,” IEEE Trans. Neural Syst. Rehabil. Eng.,
vol. 18, no. 2, pp. 127–133, 2010.
[10] Z. Lin, C. Zhang, W. Wu, and X. Gao, “Frequency Recognition Based on
Canonical Correlation Analysis for SSVEP-Based BCIs,” IEEE Trans.
Biomed. Eng., vol. 53, no. 12, pp. 2610–2614, 2006.
[11] T. Cao, F. Wan, C. M. Wong, J. N. da Cruz, and Y. Hu, “Objective
Evaluation of Fatigue by EEG Spectral Analysis in Steady-State Vis-
ual Evoked Potential-Based Brain-Computer Interfaces,” Biomed. Eng.
Online, vol. 13, no. 1, p. 28, 2014.
[12] C. M. Wong, F. Wan, B. Wang, Z. Wang, W. Nan, K. F. Lao, P. U. Mak,
M. I. Vai, and A. Rosa, “Learning Across Multi-Stimulus Enhances
Target Recognition Methods in SSVEP-Based BCIs,” J. Neural Eng.,
vol. 17, no. 1, p. 016026, 2020.
[13] C. M. Wong, B. Wang, Z. Wang, K. F. Lao, A. Rosa, and F. Wan,
“Spatial Filtering in SSVEP-based BCIs: Unified Framework and New
Improvements,” IEEE Trans. Biomed. Eng., 2020 (In press).
[14] K. Suefusa and T. Tanaka, “Reduced Calibration by Efficient Trans-
formation of Templates for High Speed Hybrid Coded SSVEP Brain-
Computer Interfaces,” in Acoustics, Speech and Signal Processing
(ICASSP), 2017 IEEE International Conference on. IEEE, 2017, pp.
929–933.
[15] P. Yuan, X. Chen, Y. Wang, X. Gao, and S. Gao, “Enhancing Per-
formances of SSVEP-Based Brain-Computer Interfaces via Exploiting
Inter-Subject Information,” J. Neural Eng., vol. 12, no. 4, p. 046006,
2015.
[16] K.-J. Chiang, C.-S. Wei, M. Nakanishi, and T.-P. Jung, “Cross-Subject
Transfer Learning Improves the Practicality of Real-World Applications
of Brain-Computer Interfaces,” in 2019 9th International IEEE/EMBS
Conference on Neural Engineering (NER). IEEE, 2019, pp. 424–427.
[17] Y. Wang, X. Chen, X. Gao, and S. Gao, “A Benchmark Dataset for
SSVEP-Based Brain-Computer Interfaces,” IEEE Trans. Neural Syst.
Rehabil. Eng., vol. 25, no. 10, pp. 1746–1752, 2016.
[18] M. Nakanishi, Y. Wang, Y.-T. Wang, and T.-P. Jung, “A Comparison
Study of Canonical Correlation Analysis Based Methods for Detecting
Steady-State Visual Evoked Potentials,” PLoS One, vol. 10, no. 10, p.
e0140703, Oct. 2015.
[19] R. Kuś, A. Duszyk, P. Milanowski, M. Łabecki, M. Bierzyńska,
Z. Radzikowska, M. Michalska, J. Żygierewicz, P. Suffczyński, and
P. J. Durka, “On the Quantification of SSVEP Frequency Responses
in Human EEG in Realistic BCI Conditions,” PLoS One, vol. 8, no. 10,
p. e77536, 2013.
[20] S. J. Pan and Q. Yang, “A Survey on Transfer Learning,” IEEE Trans.
Knowl. Data Eng., vol. 22, no. 10, pp. 1345–1359, 2009.
[21] S. J. Pan, J. T. Kwok, Q. Yang et al., “Transfer Learning via Dimen-
sionality Reduction.” in AAAI, vol. 8, 2008, pp. 677–682.
[22] W. Dai, Q. Yang, G.-R. Xue, and Y. Yu, “Boosting for Transfer Learn-
ing,” in Proceedings of the 24th international conference on Machine
learning. ACM, 2007, pp. 193–200.
[23] X. Chen, Y. Wang, S. Gao, T.-P. Jung, and X. Gao, “Filter Bank
Canonical Correlation Analysis for Implementing a High-Speed SSVEP-
Based Brain-Computer Interface,” J. Neural Eng., vol. 12, no. 4, p.
046008, 2015.
[24] S. Haufe, F. Meinecke, K. Görgen, S. Dähne, J.-D. Haynes, B. Blankertz,
and F. Bießmann, “On the Interpretation of Weight Vectors of Linear
Models in Multivariate Neuroimaging,” NeuroImage, vol. 87, pp. 96–
110, 2014.
[25] J. N. da Cruz, F. Wan, C. M. Wong, and T. Cao, “Adaptive Time-
Window Length Based on Online Performance Measurement in SSVEP-
based BCIs,” Neurocomputing, vol. 149, pp. 93–99, 2015.
[26] E. Yin, Z. Zhou, J. Jiang, Y. Yu, and D. Hu, “A Dynamically Optimized
SSVEP Brain–Computer Interface (BCI) Speller,” IEEE Trans. Biomed.
Eng., vol. 62, no. 6, pp. 1447–1456, 2015.
[27] C. Yang, X. Han, Y. Wang, R. Saab, S. Gao, and X. Gao, “A Dynamic
Window Recognition Algorithm for SSVEP-Based Brain-Computer In-
terfaces Using a Spatio-Temporal Equalizer,” Int. J. Neural Syst., vol. 28,
no. 10, p. 1850028, 2018.
[28] M. Spüler, W. Rosenstiel, and M. Bogdan, “Online Adaptation of a c-
VEP Brain-Computer Interface (BCI) Based on Error-Related Potentials
and Unsupervised Learning,” PLoS One, vol. 7, no. 12, p. e51077, 2012.
[29] K. F. Lao, C. M. Wong, Z. Wang, and F. Wan, “Learning Prototype
Spatial Filters for Subject-Independent SSVEP-Based Brain-Computer
Interface,” in 2018 IEEE International Conference on Systems, Man,
and Cybernetics (SMC). IEEE, 2018, pp. 485–490.
[30] A. A. P. Wai, M.-H. Lee, S.-W. Lee, and C. Guan, “Improving the
Performance of SSVEP BCI with Short Response Time by Temporal
Alignments Enhanced CCA,” in 2019 9th International IEEE/EMBS
Conference on Neural Engineering (NER). IEEE, 2019, pp. 155–158.
[31] S. J. Pan, I. W. Tsang, J. T. Kwok, and Q. Yang, “Domain Adaptation
via Transfer Component Analysis,” IEEE Trans. Neural Netw., vol. 22,
no. 2, pp. 199–210, 2011.
[32] M. Long, J. Wang, G. Ding, J. Sun, and P. S. Yu, “Transfer Feature
Learning with Joint Distribution Adaptation,” in Proc. 14th Int. Conf.
Comput. Vis., 2013, pp. 2200–2207.
[33] F. Lotte and C. Guan, “Regularizing Common Spatial patterns to
Improve BCI Designs: Unified Theory and New Algorithms,” IEEE
Trans. Biomed. Eng., vol. 58, no. 2, pp. 355–362, 2010.