Towards Universal EEG systems with minimum channel count based on Machine Learning and Computational Intelligence

Luis Alfredo Moctezuma
Towards Universal EEG systems
with minimum channel count
based on Machine Learning and
Computational Intelligence
Doctoral thesis
for the degree of Philosophiae Doctor
Trondheim Norway, August 2021
Norwegian University of Science and Technology
Faculty of Information Technology and Electrical Engineering
Department of Engineering Cybernetics
NTNU
©2021 Luis Alfredo Moctezuma. All rights reserved
ISBN 978-82-471-9693-9 (printed version)
ISBN 978-82-471-9970-1 (electronic version)
ISSN 1503-8181
Doctoral theses at NTNU,
Printed by NTNU-trykk
To my family
Preface
This thesis is submitted in partial fulfillment of the requirements for the
degree of Philosophiae Doctor (Ph.D.) at the Norwegian University of Science
and Technology (NTNU). The research was conducted at the Department of
Engineering Cybernetics (ITK) from June 2018 to August 2021.
During this time, I had the opportunity to attend conferences in various
countries and collaborate with other universities, as well as work with Master’s
and Ph.D. students.
My rst words of gratitude are for Professor Marta Molinas for sharing her
time and passion for research with me during these years. Thank you for giving
me the freedom to follow my ideas and for supporting them.
I would also like to thank Andres F. Soler, Erwin Habibzadeh, Chen Zhang,
Alejandro A. Torres, and Pablo Muñoz for sharing their time and ideas. Thank
you to all the staff of NTNU. Your work was essential throughout my studies at
the university.
Thank you to all the anonymous reviewers of my conference and journal
papers. Their comments were truly useful and helped me to raise the level of
my work.
My last words of gratitude are for my wife Laura Encarnación: thank you for
putting up with me and always supporting me, I love you. Thank you to my mum and
dad for giving me life and for always guiding me. I know it has not been easy and
that you have always given everything for me and for my siblings.
Luis Alfredo Moctezuma
August 2021, Trondheim Norway
Abstract
The aim of this thesis is to move one step forward towards the concept of
electroencephalographic (EEG) systems that can achieve the same objectives
as high-density EEG with a minimum required number of channels. This requires
EEG signal analysis, computational intelligence, and optimization techniques that
can systematically identify the minimum number of channels that fulfills the
objectives currently achieved with high-density EEG systems. Achieving this
goal will pave the way towards the hardware-software realization of user-centric,
easy-to-use, readily affordable EEG systems for universal applications. Enabling
portability while ensuring performance of comparable or higher quality than
that of high-density EEG will expand the accessibility of EEG to non-traditional
users and personal applications, moving EEG out of the lab. The application
horizon will be expanded from experimental research to clinical use, to the gaming
industry, intelligence and security sectors, education and daily use by people for
self-knowledge.
The methods proposed in the thesis combine feature extraction techniques and
channel selection algorithms with optimization techniques that allow extracting
the most essential information from a minimum set of required EEG channels. The
methods were tested in two case studies: epileptic seizure classification and
EEG-based biometric systems. The Discrete Wavelet Transform (DWT) and Empirical
Mode Decomposition (EMD) were used to decompose EEG signals into different
frequency bands, and four features were then computed for each sub-band: the
Teager and instantaneous energies and the Higuchi and Petrosian fractal
dimensions.
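As an illustration of the per-sub-band features named above, the following is a
minimal Python sketch. The energy formulas used here (log10 of the mean squared
amplitude, and log10 of the mean absolute Teager-Kaiser operator) and all
function names are assumptions made for this example and are not stated in the
abstract. The Higuchi fractal dimension is omitted for brevity, and a synthetic
sine wave stands in for a real DWT or EMD sub-band.

```python
import math

def instantaneous_energy(x):
    """log10 of the mean squared amplitude (assumed formulation)."""
    return math.log10(sum(v * v for v in x) / len(x))

def teager_energy(x):
    """log10 of the mean absolute Teager-Kaiser operator
    x[j]^2 - x[j-1]*x[j+1] (assumed formulation)."""
    n = len(x)
    total = sum(abs(x[j] ** 2 - x[j - 1] * x[j + 1]) for j in range(1, n - 1))
    return math.log10(total / n)

def petrosian_fd(x):
    """Petrosian fractal dimension, based on the number of sign changes
    in the first difference of the signal."""
    n = len(x)
    diff = [x[j + 1] - x[j] for j in range(n - 1)]
    n_delta = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * n_delta)))

def subband_features(x):
    """Feature vector for one sub-band (Higuchi FD left out of this sketch)."""
    return [instantaneous_energy(x), teager_energy(x), petrosian_fd(x)]

# Hypothetical stand-in for one sub-band: a 10 Hz sine sampled at 160 Hz for 2 s.
band = [math.sin(2 * math.pi * 10 * t / 160) for t in range(320)]
features = subband_features(band)
```

Both energy functions take a logarithm, so a constant or all-zero sub-band
would need special handling before this sketch could be applied to real data.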
For the optimization stage, non-dominated sorting genetic algorithms (NSGA)
were used for channel selection, using binary values to represent the channels in
the chromosomes: 1 if the channel is used in the classification and optimization
process, and 0 if not. Additional genes representing important parameters of the
classifiers were added using integer and decimal values.
For Case-study 1, NSGA-III selected one or two channels from a set of 22
for epileptic seizure classification, obtaining an accuracy of up to 0.98 and 1.00,
respectively, using EMD/DWT-based features.
For Case-study 2, a task-independent, resting-state-based biometric system
using Local Outlier Factor (LOF)- and DWT-based features showed a True
Acceptance Rate (TAR) of up to 0.993 ± 0.01 and a True Rejection Rate (TRR) of up
to 0.941 ± 0.002 using only three channels selected by NSGA-III from a set of 64.
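For readers unfamiliar with the two rates, the sketch below computes them from
accept/reject decisions, assuming the usual definitions TAR = TP / (TP + FN)
over samples of the genuine subject and TRR = TN / (TN + FP) over intruder
samples. The function name and the toy decisions are hypothetical, and no
one-class classifier is trained here.

```python
def tar_trr(is_genuine, accepted):
    """Return (TAR, TRR) for paired ground-truth labels and system decisions.

    is_genuine[i] is True when sample i belongs to the enrolled subject;
    accepted[i] is True when the system accepted sample i.
    """
    tp = fn = tn = fp = 0
    for genuine, ok in zip(is_genuine, accepted):
        if genuine and ok:
            tp += 1      # genuine sample accepted
        elif genuine:
            fn += 1      # genuine sample rejected
        elif ok:
            fp += 1      # intruder sample accepted
        else:
            tn += 1      # intruder sample rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one genuine and one intruder sample")
    return tp / (tp + fn), tn / (tn + fp)

# Toy decisions: four genuine samples followed by four intruder samples.
truth = [True] * 4 + [False] * 4
decisions = [True, True, True, False, False, False, True, False]
tar, trr = tar_trr(truth, decisions)
```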
The results presented herein can be considered a first proof-of-concept,
showing that it is possible to reduce the number of required EEG channels
for classification tasks and opening the way to explore these methods on other
neuroparadigms. This will lead to reduced real-time computational costs for EEG
signal processing, removing task-irrelevant and redundant information, as well as
reducing the preparation time for use of the EEG headsets.
The results of such a reduction in the number of required EEG channels will
make possible a low-power hardware design, expanding the range of EEG-based
applications from clinical diagnosis and research to health-care, to non-medical
applications that can improve our understanding of cognitive processes, learning
and education, and to the discovery of currently hidden/unknown properties behind
ordinary human activity and ailments.
Contents
Abstract
List of Abbreviations
List of Tables
List of Figures
1 Introduction
1.1 Motivations for the research and knowledge gaps
1.2 Research Questions and Objectives
1.3 Contributions
1.4 Structure of the thesis
2 Fundamentals of Electroencephalography, evolution, and open challenges
2.1 Electroencephalography
2.1.1 Mechanisms of EEG generation
2.1.2 Normal and abnormal EEG
2.1.3 EEG signal acquisition
2.1.4 A brief comparison with other brain signal acquisition methods
2.1.5 International EEG electrode placement systems
2.1.6 Consumer-grade low-density EEG headsets
2.1.7 Using brain signals for control purposes
2.2 EEG paradigms
2.2.1 Event-related potentials and P300
2.2.2 Resting-state
2.3 Current and future trends in EEG
3 Materials and Methods
3.1 Improving the signal-to-noise ratio
3.2 Data analysis
3.2.1 Empirical Mode Decomposition
3.2.2 Discrete Wavelet Transform
3.3 Data features
3.3.1 Energy distribution
3.3.2 Fractal dimension
3.4 Computational intelligence methods for classification
3.4.1 Multi-class classification
3.4.2 One-class classification
3.4.3 Evaluation of classifier performance
3.5 Channel reduction and selection
3.5.1 Greedy algorithms
3.5.2 Multi-objective optimization methods
3.6 Description of datasets used in the thesis
3.6.1 CHB-MIT
3.6.2 EEGMMIDB
3.6.3 P300-speller
3.7 Methods proposed in the thesis
3.7.1 Pre-processing, feature extraction and classification
3.7.2 General overview of the proposed method
3.8 Hardware and software tools used in the thesis
4 Case study 1: Channel count optimization for Epileptic seizure classification
4.1 Introduction
4.2 State-of-the-art
4.3 Definition of the problem to optimize
4.4 Channel selection for Epileptic-seizure classification with EMD-based features
4.5 Channel selection for Epileptic-seizure classification with DWT-based features
4.6 Discussion
5 Case study 2: Channel count optimization for EEG-based biometric systems
5.1 Introduction
5.2 State-of-the-art
5.3 First approach using a two-stage classification process
5.3.1 Defining the problem to optimize
5.3.2 Solving the four-objective optimization problem using NSGA-II with subjects 1-13 as non-intruders and 14-26 as intruders.
5.3.3 Solving the four-objective optimization problem using NSGA-II with subjects 14-26 as non-intruders and subjects 1-13 as intruders.
5.3.4 NSGA-III for solving the four-objective optimization problem.
5.3.5 Testing the proposal in 10 random subdivisions of subjects using NSGA-II and NSGA-III.
5.4 Discussion
5.5 Second approach, using a one-stage one-class algorithm
5.5.1 Defining the problem to optimize
5.5.2 Channel selection using NSGA-III and OCSVM for EEG signals for the resting-state with the eyes open
5.5.3 Channel selection using NSGA-III and LOF for EEG signals for the resting-state with the eyes open
5.5.4 Channel selection using NSGA-III and LOF for EEG signals for the resting-state with the eyes closed
5.6 Discussion
6 Conclusions and future work
6.1 Summary of findings
6.1.1 Feature extraction and channel count optimization for epileptic seizure classification
6.1.2 Channel count optimization for EEG-based biometric systems
6.2 Conclusion of the thesis contributions
6.3 Future work
References
List of Abbreviations
2D Two-dimensional.
3D Three-dimensional.
ABC Articial bee colony.
AEMD Adaptive Empirical Mode Decomposition.
BCI Brain-Computer Interfaces.
BFPA Binary flower pollination algorithm.
BSS Blind source separation.
CAR Common Average Reference.
CNN Convolutional neural network.
CNN-GRU Convolutional neural network gated recurrent units.
CRR Correct recognition rate.
CT Computerized tomography.
DMD Dynamic mode decomposition.
DT Decision tree.
DWT Discrete Wavelet Transform.
Ear-EEG In-the-ear Electroencephalography.
ECG Electrocardiograph.
EEG Electroencephalography.
EEGMMIDB Motor movement/imagery dataset.
EEMD Ensemble Empirical Mode Decomposition.
EMD Empirical Mode Decomposition.
EMG Electromyography.
EWT Empirical wavelet transform.
FAR False acceptance rate.
fMRI Functional magnetic resonance imaging.
FN False negatives.
FP False positives.
FT Fourier transform.
GA Genetic algorithms.
GNMM Genetic neural mathematics method.
HTER Half total error rate.
ICA Independent component analysis.
iEEG Intracranial Electroencephalography.
IMFs Intrinsic Mode Functions.
KNN k-nearest neighbors.
LAP Laplacian Filter.
LDA Linear discriminant analysis.
LOF Local Outlier Factor.
LRD Local reachability density.
LS-SVM Least-square support vector machine.
MEG Magnetoencephalography.
MEMD Multivariate Empirical Mode Decomposition.
MI Mutual information.
MOEA/D Multi-objective evolutionary algorithms based on decomposition.
MOOP Multi-objective optimization problem.
MRI Magnetic resonance imaging.
NB Naive Bayes.
NN Neural networks.
NSGA Non-dominated sorting genetic algorithm.
OCC One-class classification.
OCSVM One-class support vector machine.
PCA Principal component analysis.
PET Positron emission tomography.
PSR Phase space representation.
RBF Radial basis function.
RF Random Forest.
RSNs Resting-state networks.
SVM Support vector machine.
TAR True Acceptance Rate.
TIRDA Temporal intermittent rhythmic delta activity.
TLE Temporal-lobe epilepsy.
TN True negatives.
ToC Third-order cumulant.
TP True positives.
TRR True Rejection Rate.
List of Tables
3.1 Details of the epileptic-seizure data presented in [218].
4.1 Accuracy obtained using EMD for feature extraction with NSGA-II and NSGA-III for EEG channel selection (subjects 1-12).
4.2 Accuracy obtained using EMD for feature extraction with NSGA-II and NSGA-III for EEG channel selection (subjects 13-24).
4.3 Accuracy obtained using DWT for feature extraction with NSGA-II and NSGA-III for EEG channel selection (subjects 1-12).
4.4 Accuracy obtained using DWT for feature extraction with NSGA-II and NSGA-III for EEG channel selection (subjects 13-24).
4.5 Comparison of relevant existing methods for epileptic-seizure classification using the CHB-MIT Scalp EEG dataset presented in [218].
4.6 Comparison of several relevant existing methods for epileptic-seizure classification using different datasets.
5.1 TAR, TRR, and accuracy for subject identification and authentication with EEG data from all channels using different nu and gamma values for one-class SVM.
5.2 TAR, TRR, and accuracy values obtained for the Pareto-front for four objectives solved with NSGA-II using subjects 1-13 as non-intruders.
5.3 TAR, TRR, and accuracy values obtained for the first 30 EEG channels in the Pareto-front for four objectives solved with NSGA-II using subjects 14-26 as non-intruders.
5.4 TAR, TRR, and accuracy values obtained in the Pareto-front when using 7-15 EEG channels with four objectives solved with NSGA-III using subjects 1-13 as non-intruders and 14-26 as intruders and vice-versa.
5.5 Mean TAR, TRR, and accuracy values obtained in the Pareto-front when using 7-15 EEG channels validated in 10 random subdivisions of all the subjects, using 50% as intruders and 50% as non-intruders.
5.6 Average TARs and TRRs for subject detection with EEG data from 64 channels and 109 subjects using different parameters for OCSVM and LOF, with EMD- and DWT-based features.
5.7 TARs and TRRs obtained for the first five EEG channels in the Pareto-front for three objectives solved with NSGA-III using EMD- and DWT-based features with OCSVM.
5.8 TARs and TRRs obtained for the first seven EEG channels in the Pareto-front for three objectives solved with NSGA-III using EMD-based and DWT-based features and LOF.
5.9 TARs and TRRs obtained with LOF for the first seven EEG channels in the Pareto-front for three objectives solved with NSGA-III using EMD- or DWT-based features and the resting-state with the eyes closed.
List of Figures
1.1 Flowchart of contributions of papers to each Research Question.
1.2 General overview of the methodology and contributions to the thesis.
2.1 EEG electrode placement methods: bipolar (a) and monopolar (b).
2.2 The original figure illustrating the international 10-20 system. Note that the electrodes are erroneously located inside the skull on the surface of the cortex [2].
2.3 Timeline of the evolution of EEG systems and relevant consumer-grade wearable EEG headsets.
2.4 FlexEEG concept. FlexEEG moves from X1 to X2 to capture sources S1 and S2 [58].
2.5 Schematic representation of certain ERP components after the onset of a visual stimulus [72].
2.6 Topography of four microstate maps from [92]. Map areas of opposite polarity are coded in red and blue using a linear color scale. The left ear is to the left and the nose is at the top.
3.1 Stages of the methodology followed in the thesis.
3.2 IMFs plus residue (Sub-fig. 3.2a) obtained from the synthetic signal presented in Sub-fig. 3.2b, as well as the reconstructed signal using all the IMFs (Sub-fig. 3.2c) and three IMFs selected using the Minkowski distance plus the residue (Sub-fig. 3.2d).
3.3 Details and approximation coefficients extracted from the original signal using DWT with four levels of decomposition and the mother wavelet biorthogonal 1.3.
3.4 Teager and Instantaneous energy distribution of EMD and DWT sub-bands from Figs. 3.2 and 3.3.
3.5 Higuchi and Petrosian fractal dimension of EMD and DWT sub-bands from Figs. 3.2 and 3.3.
3.6 Decision boundaries in OCSVM for a random dataset with outliers.
3.7 Decision boundaries with LOF for a random dataset with outliers.
3.8 An illustrative example of the NSGA-II procedure [211].
3.9 Reference points of NSGA-III in a three-objective optimization problem.
3.10 Example of the raw EEG data of C3-P3, T7-FT9 and C4-P4 channels from the third instance of Patient 1 of the CHB-MIT dataset.
3.11 Example of the raw EEG data of F5, T8 and T10 channels of the first instance of subject 1 of the EEGMMIDB dataset.
3.12 Protocol design for recording positive or negative feedback-related responses in the P300-speller dataset [220].
3.13 Example of the raw EEG data of P7, P8 and T8 channels of the first instance of subject 1 of the P300-speller dataset.
3.14 Flowchart summarizing feature extraction using DWT.
3.15 Flowchart summarizing the feature extraction procedure using EMD.
3.16 Flowchart of the procedure followed for EEG signal classification.
3.17 Example of chromosome representation and flowchart of the optimization process for parameter optimization and EEG channel selection using NSGA-III.
4.1 Complete process for EEG channel selection using NSGA-II or NSGA-III for epileptic-seizure classification.
4.2 EEG Channel Selection for epileptic seizure classification of patient 1 using EMD-based features. Comparison between NSGA-II and the backward-elimination algorithm.
4.3 Four EEG Channel subsets selected by NSGA-II (a) and backward-elimination (b) for epileptic-seizure classification in patient 1.
4.4 EEG Channel selection for epileptic-seizure classification of patient 19 using EMD-based features. Comparison between NSGA-III and the backward-elimination algorithm.
4.5 Comparison of the most-used classifiers by NSGA-II (left) and NSGA-III (right) for the 24 patients using EMD-based feature extraction.
4.6 Comparison of the most-used classifiers by NSGA-II (left) and NSGA-III (right) for the 24 patients using DWT-based feature extraction.
5.1 Flowchart of the first approach for intruder detection and subject identification.
5.2 Example of the complete process for EEG channel selection using NSGA-II, including the chromosome representation using 56 genes for the EEG channels and eight for the nu and gamma parameters.
5.3 Four different views of the results obtained with NSGA-II using subjects 1-13 as non-intruders and 14-26 as intruders.
5.4 Relevant EEG channel subsets in the Pareto-front for four objectives using NSGA-II, considering subjects 14-26 as intruders in the previous experiment and subjects 1-13 as intruders in the current experiment.
5.5 Relevant EEG channel subsets in the Pareto-front for four objectives using NSGA-III, considering subjects 14-26 as intruders in the previous experiment and subjects 1-13 as intruders in the current experiment.
5.6 TARs and TRRs obtained using various numbers of neighbors with the LOF k-d tree algorithm and DWT-based features.
5.7 Chromosome representation and flowchart of the optimization process for EEG channel selection using NSGA-III and LOF.
5.8 Frontal and aerial view of the TARs and TRRs obtained in the channel-selection process using EMD-based features (a) and DWT-based features (b) with OCSVM.
5.9 Set of one to five channels found during the optimization process for creating the biometric system with OCSVM using EMD-based features (a) or DWT-based features (b) and the resting-state with the eyes open.
5.10 Frontal and aerial view of the TARs and TRRs obtained in the channel-selection process using EMD-based features (a) and DWT-based features (b) with LOF.
5.11 Average distribution of the algorithms and number of neighbors used in the optimization process with EMD-based features (a) and DWT-based features (b).
5.12 Average distribution of the algorithms and number of neighbors used for the results in the Pareto-front of the optimization process with EMD-based features (a) and DWT-based features (b).
5.13 Set of one to seven channels found during the optimization process for creating the biometric system with LOF and EMD-based features (a) or DWT-based features (b) for the resting-state with the eyes open.
5.14 Frontal and aerial view of the TARs and TRRs obtained in the channel-selection process using EMD- (a) and DWT-based features (b) for the resting-state with the eyes closed, using LOF.
5.15 Average distribution of the algorithms and number of neighbors used in the optimization process with EMD-based features (a) and DWT-based features (b) using EEG signals for the resting-state with the eyes closed.
5.16 Average distribution of the algorithms and number of neighbors used for the results in the Pareto-front of the optimization process with EMD-based features (a) and DWT-based features (b) using EEG signals for the resting-state with the eyes closed.
5.17 Set of one to seven channels found during the optimization process for creating the biometric system with LOF using EMD-based features (a) or DWT-based features (b) and the resting-state with the eyes closed.
Chapter 1
Introduction
The objective of this thesis is to move one step forward towards a concept of
electroencephalographic (EEG) systems, with a minimum number of channels, that
can contribute to the realization of low-cost real-time applications, thus enabling the
portability of EEG headsets while retaining quality comparable to, or higher than, that
of high-density EEG-based systems. This requires EEG signal analysis, computational
intelligence, and optimization techniques that can systematically identify a minimum
number of EEG channels that fulll the objectives currently achieved using high-
density EEG systems. To this end, the thesis proposes to systematically apply greedy
algorithms and multi-objective optimization methods for which targeted algorithms
were developed and implemented to solve the problem of channel selection and
parameter optimization.
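To make the idea of a greedy channel search concrete, the following is a
minimal Python sketch of backward elimination, one common greedy strategy. It
is an illustration under stated assumptions, not the implementation developed
in the thesis: the function names, the per-channel weights, and the additive
scoring function are hypothetical. In practice the score would typically be a
cross-validated classification metric computed on the selected channels.

```python
def backward_elimination(channels, score):
    """Greedily remove, at each step, the channel whose removal keeps the
    score highest. Returns the visited (subset, score) pairs, from the full
    channel set down to a single channel."""
    current = list(channels)
    history = [(tuple(current), score(current))]
    while len(current) > 1:
        best_subset, best_score = None, float("-inf")
        for ch in current:
            subset = [c for c in current if c != ch]
            s = score(subset)
            if s > best_score:
                best_subset, best_score = subset, s
        current = best_subset
        history.append((tuple(current), best_score))
    return history

# Hypothetical usefulness weights standing in for a real evaluation metric.
toy_weights = {"C3-P3": 0.5, "T7-FT9": 0.3, "C4-P4": 0.1}

def toy_score(subset):
    """Placeholder score: sum of the hypothetical per-channel weights."""
    return sum(toy_weights[c] for c in subset)

history = backward_elimination(list(toy_weights), toy_score)
```

A greedy search of this kind evaluates only one removal path, whereas a
multi-objective method explores many channel subsets and returns a set of
trade-offs between channel count and performance.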
This Ph.D. research is part of a larger project, David and Goliath: single-channel
EEG unravels its power through adaptive signal analysis, which aims to identify
an optimal minimum EEG channel count for wearable EEG solutions for universal
applications. This thesis contributes to this goal by achieving one of the three
objectives of David and Goliath: optimization-based channel reduction.
This Chapter provides an overview of the main contributions of the thesis,
including an overview of the publications associated with the work.
1.1 Motivations for the research and knowledge gaps
Consumer wearable EEG technologies have grown steadily, and an increasing number
of devices with a reduced number of EEG channels are available for personal uses,
such as meditation, relaxation training, motor imagery, and the control of moving
objects [1]. People today can measure their own brain signals outside medical
laboratories thanks to the proliferation of low-cost wireless EEG headsets with
varying numbers and configurations of EEG channels, with dry or wet electrodes,
using the 10-5, 10-10, or 10-20 international system [2-5].
There are a number of critical open issues (i.e., real-time use, quality of
recordings, portability, ease-of-use, and user orientation) that are as yet
unexplored [6]. One unexplored aspect that can influence these issues is electrode
placement, which in most EEG devices is fixed and inflexible, depending on
the targeted application(s). For real-time applications, high-quality/high-density
EEG devices are computationally costly and their applications are very limited.
The existing wireless portable devices, with fixed electrode placement, also have
limitations. Depending on the task, the neuro-paradigm used, and the age and
sex of the subject, the most relevant features of brain signals may be obtained at
locations different from those of the electrodes on the scalp [7-10].
Most EEG devices available on the market were designed for a set of related
tasks and neuro-paradigms and, in general, are found to be reliable only within the
context of such tasks and neuro-paradigms. The accuracy and reliability of these
systems for prolonged and repeated measurements have not been well-established
and a rigorous comparative investigation of the different portable solutions is not
yet available. Most importantly, it is not clear whether the limited number of
channels and their fixed localization can provide sufficient data and anatomical
coverage to obtain the neural signatures necessary for the given tasks, as these
concepts are not supported by openly available research. They are based on
proprietary technology backed by protected research or IP not available to the
public. Essentially, this is because both electrode localization and the number of
electrodes are task-dependent [1, 7, 11]. Moreover, these commercial solutions are
intended to only support the tasks/paradigms for which they were designed.
The current state-of-the-art consists of methods to decompose and extract
information from brain signals using wet or dry EEG electrodes. However,
the behavior of brain signals varies depending on the neuro-paradigm, the
technology of the device, and the specific characteristics of the subject (culture, age,
IQ/cognition level, sex, etc.) [7]. In addition, because of the
non-stationary/non-linear nature of brain signals, it is necessary to create a method
with multiple sub-steps to extract the most essential features that can help identify
the targeted tasks (e.g., event detection and classification). If such advances are
feasible, the performance of Brain-Computer Interfaces (BCI) can increase and
applications will span new areas of research, from medical applications to industrial
security systems.
The major motivations and objectives behind the research reported in this thesis
are based on the following knowledge gaps, which were identified from the
literature review in Chapters 3, 4, and 5.
Knowledge gap 1:
High-density EEG is challenged by high computational
cost, immobility of the equipment, and the use of inconvenient conductive
gels. Several studies have explored reducing the number of electrodes
required for a certain task and electrode placement towards real-time EEG
signal processing. Most were based on a priori or empirical knowledge.
Consolidated studies based on systematic searches aiming to reduce the
EEG channel count required for a given task are not currently available.
Such an approach can be achieved by applying systematic search algorithms
and optimization techniques for identifying the most relevant electrode
position/placement for a given paradigm.
Knowledge gap 2: There is currently insufficient knowledge of feature
extraction for better representation of low-density EEG signals that can
also reduce the computational cost. Most research on feature extraction has
been based on high-density EEG.
Knowledge gap 3: There are several proposed methods for feature
extraction and classification in the state-of-the-art, but they are used for
specific tasks and the results may vary for different tasks. In other words,
the methods are neither generalized nor replicable for different applications.
1.2 Research Questions and Objectives
The objective of this thesis is the analysis of EEG signals with high-density and
low-density channel arrays to compare their performance in two case studies:
Epileptic seizure classification and EEG-based biometric systems. For this
objective, it was necessary to create various algorithms for channel reduction and
selection to ensure a reliable method to extract the most relevant information
from the raw EEG signals.
The data used in the experiments were extracted from public repositories to
ensure the quality of the analysis. The stages of the methodology include noise
removal, feature extraction, and optimization techniques, which were all explored and
combined to effectively represent large raw EEG signals for classification tasks.
These steps aim to improve the quality and response time of the machine-learning-based
models.
Based on the analysis of the knowledge gaps presented, the thesis
concentrated on the following three Research Questions:
Research Question 1: Channel Dimensionality Reduction
Can the number of EEG channels required for classification tasks be reduced while
increasing, or at least maintaining, the accuracy relative to the use of
high-density EEG?
Research Question 2: Data Dimensionality Reduction
Can a few useful features be sufficient to effectively represent large raw EEG signals for
classification and thus accelerate the computational performance of the
methods used for classifying different tasks?
Research Question 3: Generalizing the Methodology
Can the same process of feature extraction, classification, and channel selection be
generalized, or at least extended, to different problems
related to the classification of EEG signals (i.e., task-dependent and
task-independent)?
Testing state-of-the-art methods on certain specific problems and conditions
will make it possible to propose new methods to tackle the feature extraction
and dimensionality-reduction problem associated with EEG signals. Then, if the
number of required channels can be reduced, it will be possible to draw certain
conclusions and entertain the possibility of a new type of EEG headset. During
this process, it will be necessary to repeat the methodology for different
task-dependent and task-independent neuro-paradigms using EEG signals and analyze
their behavior, trying to draw more general conclusions.
Figure 1.1: Flowchart of contributions of papers to each Research Question.
1.3 Contributions
Fig. 1.1 presents a flowchart of the contributions to the thesis for each research
question. Paper 8 presented the first approach using a feature extraction process
based on the Empirical Mode Decomposition (EMD), which was later compared
to the second approach of the thesis, consisting of features based on the Discrete
Wavelet Transform (DWT), introduced in Paper 6. This connection is indicated by
the red rectangles and arrows. The method presented in Paper 8 was used in most
of the subsequently published papers, indicated by the arrows connecting the
papers that contributed to Research Question 3. All the papers presented in Fig. 1.1
contributed to the achievement of the objectives, but Papers 1, 2, and 3 presented
the final contributions, as they presented the use of greedy and non-dominated
sorting genetic algorithm (NSGA)-based algorithms for channel selection and
parameter optimization, and are the most relevant contributions to this thesis.
The following articles and conference papers were published during the Ph.D.
and are directly related to the thesis:
Journal articles
1. Moctezuma, Luis Alfredo, Marta Molinas. "Towards a minimal EEG channel
array for a biometric system using resting-state and a genetic algorithm
for channel selection". Scientific Reports (2020). DOI: 10.1038/s41598-020-72051-1
2. Moctezuma, Luis Alfredo, Marta Molinas. "EEG Channel-selection method
for epileptic-seizure classification based on multi-objective optimization".
Frontiers in Neuroscience (2020). DOI: 10.3389/fnins.2020.00593
3. Moctezuma, Luis Alfredo, Marta Molinas. "Multi-objective optimization for
EEG channel selection and accurate intruder detection in an EEG-based
subject identification system". Scientific Reports (2020). DOI: 10.1038/s41598-020-62712-6
4. Moctezuma, Luis Alfredo, Marta Molinas. "Classification of low-density EEG
epileptic seizures by energy and fractal features based on EMD". Journal of
Biomedical Research (2019). DOI: 10.7555/JBR.33.20190009
Peer-reviewed Conferences
5. Moctezuma, Luis Alfredo, and Marta Molinas. “Event-related potential
from EEG for a two-step Identity Authentication System”. IEEE
International Conference on Industrial Informatics, INDIN’19 (2019). DOI:
10.1109/INDIN41052.2019.8972231
6. Moctezuma, Luis Alfredo, and Marta Molinas. “Subject identification from
low-density EEG-recordings of resting-states: A study of feature extraction
and classification”. In Future of Information and Communication Conference
(FICC), 2019. DOI: 10.1007/978-3-030-12385-7_57
7. Moctezuma, Luis Alfredo, and Marta Molinas. “Sex differences observed in
a study of EEG of linguistic activity and resting-state: Exploring optimal
EEG channel configurations”. In the 7th International Winter Conference
on Brain-Computer Interface, 2019. DOI: 10.1109/IWW-BCI.2019.8737312
8. Moctezuma, Luis Alfredo, and Marta Molinas. “EEG-based Subjects
Identification based on Biometrics of Imagined Speech using EMD”. In
International Conference on Brain Informatics. Springer, Cham, 2018. DOI:
10.1007/978-3-030-05587-5_43
Peer-reviewed abstracts
9. Soler-Guevara, Andres Felipe, Luis Alfredo Moctezuma, Eduardo Giraldo,
Marta Molinas. “EEG channel-selection method based on NSGA-II for source
localization”. The 4th HBP Student Conference on Interdisciplinary Brain
Research (2020).
10. Moctezuma, Luis Alfredo, Andres Felipe Soler, Erwin H. T. Shad, Marta
Molinas, Alejandro A. Torres-Garcia. “David versus Goliath: Low-density
EEG unravels its power through adaptive signal analysis - FlexEEG”. The
4th HBP Student Conference on Interdisciplinary Brain Research (2020).
Book Chapters
11. Moctezuma, Luis Alfredo, and Marta Molinas. “EEG-based subject
identification with multi-class classification”. In Biosignal Processing and
Classification using Computational Learning and Intelligence (2020). (In
press)
12. Torres-Garcia Alejandro A., Omar Mendoza-Montoya, Marta Molinas,
Mauricio Antelis, Luis Alfredo Moctezuma. “Pre-processing and Feature
Extraction”. In Biosignal Processing and Classification using Computational
Learning and Intelligence (2020). (In press)
Other contributions
Contributions written during the Ph.D. but not directly related to the thesis:
Peer-reviewed Conferences
13. Alejandro A. Torres-Garcia, Luis Alfredo Moctezuma and Marta Molinas.
“Assessing the impact of idle state type on the identification of RGB color
exposure for BCI”. In 13th International Joint Conference on Biomedical
Engineering Systems and Technologies (2020). DOI: 10.5220/0008923101870194
14. Torres-Garcia Alejandro A., Luis Alfredo Moctezuma, Sara Asly and
Marta Molinas. “Discriminating between color exposure and idle
state using EEG signals for BCI application”. In 7-th edition of the
International Conference on e-Health and Bioengineering (2019). DOI:
10.1109/EHB47216.2019.8969919
15. Asly, Sara, Luis Alfredo Moctezuma, Monika Gilde, Marta Molinas.
“Towards EEG-based signals classification of RGB color-based stimuli”. In 8th
Graz Brain-Computer Interface Conference 2019 (2019). DOI: 10.3217/978-3-85125-682-6-61
16. Moctezuma, Luis Alfredo, Marta Molinas, AA Torres Garcia, Luis Villaseñor
Pineda, and Maya Carrillo. “Towards an API for EEG-based imagined speech
classification”. In International Conference on Time Series and Forecasting.
2018. Proceedings at itise.ugr.es/ITISE2018_Papers_Vol_3.pdf
Peer-reviewed abstracts
17. Torres-Garcia Alejandro A., Marta Molinas, Luis Alfredo Moctezuma.
“Towards a BCI based on Color Exposure Recognition”. The 4th HBP Student
Conference on Interdisciplinary Brain Research (2020).
1.4 Structure of the thesis
Chapter 1 introduces the work in this thesis and lists the knowledge gaps and research
motivations. The contributions to the thesis are presented in a flowchart,
showing how the published papers are connected to the defined research questions.
Finally, a list of the results published separately in journals, conference papers,
and abstracts is presented, including contributions directly related to the thesis,
as well as published results not directly related to the objective of the thesis.
In Chapter 2, the fundamentals of EEG, a brief history of EEG and EEG signal
analysis, international EEG standards, and the two paradigms of interest for this
thesis are presented, which are event-related potentials (ERPs) and the resting-
state.
Chapter 3 presents the fundamentals of the methods used for EEG signal
analysis, which include EMD and DWT and the reasons for choosing them in
this study. This is followed by a presentation of how the energy distribution and
fractal dimension features function in the context of feature extraction. Then,
the multi-class and one-class classifiers tested and the metrics for evaluating
performance are presented. A description of NSGA and how it is used for solving
multi-objective optimization problems is provided in this Chapter.
The description of the datasets used in the two investigated scenarios is also
presented in Chapter 3, in which a general flowchart of the proposed methodology
for feature extraction, classification, and the optimization process handled by NSGA
algorithms is presented and explained.
Chapter 4 presents Case-study 1, which is focused on validation of the methods
for channel count minimization in a case of epileptic seizure classification using
multi-class classification. Two different approaches for representing the
epileptic-seizure and seizure-free EEG signals are presented. The first approach is based
on DWT and the second on EMD. Using these two approaches, the EEG data is
decomposed into different frequency sub-bands and then a set of four features per
sub-band is calculated. Once this is carried out, a multi-objective optimization
process is organized and solved using NSGA-II and NSGA-III. The objective of the
optimization process is to increase the accuracy of the machine-learning models
for classication of epileptic seizures and seizure-free periods while decreasing the
number of required EEG channels. Finally, a discussion about the results obtained
is presented and they are compared with those of other approaches using the same
datasets and other datasets.
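The trade-off outlined for Chapter 4 (higher accuracy with fewer channels) can be illustrated with a minimal sketch of a binary channel mask and a non-dominated filter. This is not the NSGA-II or NSGA-III procedure used in the thesis; it only shows the dominance test on which non-dominated sorting relies. The function names, the candidate masks, and the accuracy values are hypothetical placeholders chosen for illustration.

```python
from typing import List, Tuple

# Each candidate is (binary channel mask, accuracy). All values are hypothetical.
Candidate = Tuple[Tuple[int, ...], float]


def objectives(candidate: Candidate) -> Tuple[int, float]:
    """Return (number of channels used, accuracy) for a candidate."""
    mask, accuracy = candidate
    return sum(mask), accuracy


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in both objectives and better in at least one."""
    ch_a, acc_a = objectives(a)
    ch_b, acc_b = objectives(b)
    no_worse = ch_a <= ch_b and acc_a >= acc_b
    better = ch_a < ch_b or acc_a > acc_b
    return no_worse and better


def non_dominated(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates = [
    ((1, 1, 1, 1), 0.95),  # all four channels
    ((1, 0, 1, 0), 0.95),  # two channels, same accuracy
    ((1, 0, 0, 0), 0.90),  # one channel, lower accuracy
    ((0, 1, 1, 0), 0.88),  # two channels, lower accuracy
]
front = non_dominated(candidates)
```

In this toy set, the four-channel mask is dominated by the two-channel mask with equal accuracy, so only the trade-off solutions remain in `front`.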
Case-study 2, which consists of a proposal for a biometric system with minimal
channel count, is presented in Chapter 5. Two different approaches are presented:
a two-stage approach consisting of a multi-class classification layer and then a
one-class classifier, and a second approach using only one-class classifiers. The
experiments are compared using different methods for feature extraction and
NSGA-II or NSGA-III for solving the optimization process. As in Chapter 4, the
work in Chapter 5 also has the objective of minimizing or reducing the number of
required EEG channels while increasing or maintaining classification accuracy,
which in this case consists of increasing the True Acceptance Rate (TAR) of the
subjects with access and the True Rejection Rate (TRR) of intruders.
Finally, Chapter 6 presents the conclusions of the thesis and identifies
opportunities for further work.
Fig. 1.2 presents an overview of the methods proposed and used to achieve
the objectives of the thesis. As will be explained later, all the EEG datasets used
are freely available to the public at no cost, but the number of subjects, the number
of channels, etc., were considered to select them (a). In the feature extraction
stage (b), two methods were used to decompose the EEG signals into different
frequency bands and then a set of four features were calculated to obtain a single
feature vector for each instance. Then, depending on the case study, one-class
or multi-class classifiers were developed and validated. In each case, different
methods were used to compare their performance (c). During this work, four
different methods for channel reduction and selection were developed. This stage
in the methodology (d) is the main focus of the thesis and, therefore, is where
the main contributions of the thesis can be found.
Figure 1.2: General overview of the methodology and contributions to the thesis.
Chapter 2
Fundamentals of Electroencephalography, evolution, and open challenges
This Chapter presents the main concepts related to EEG signals, signal analysis,
the evolution of EEG technology, the two paradigms of interest for this thesis, and open
challenges related to applications such as brain-computer interfaces, neurofeedback,
ambulatory EEG, etc.
2.1 Electroencephalography
EEG is an electrophysiological monitoring method that measures the electrical
activity generated by the synchronized activity of thousands of neurons of the
brain via intracranial electrodes or electrodes placed on the scalp surface, i.e., using
invasive or non-invasive methods. The first known neurophysiological recordings
were made by Richard Caton in 1875, when he presented his findings on the
electrical phenomena of the exposed cerebral hemispheres of rabbits and monkeys
[12, 13]. In 1890, Adolf Beck published an investigation on the spontaneous
electrical activity of the brain of rabbits and dogs, which included rhythmic
oscillations altered by light [14, 15]. Later, in 1924, Hans Berger recorded the first
human EEG [13, 16].
Hans Berger described EEG in 1929 with the promise that it would be a
technique that provides a “window into the brain” [16]. Recent progress in EEG
sensors and methods for signal analysis has made this window more transparent,
but the analytic potential and potential applications of EEG have not yet been
fully exploited [17].
2.1.1 Mechanisms of EEG generation
Most of the electrical activity recorded in an EEG is generated by groups of
well-aligned cortical pyramidal neurons that fire together and are oriented
perpendicular to the surface of the brain, as well as near the scalp where the
recording electrodes are placed. Each scalp electrode collects an estimated
synchronous cortical activity of at least 6 cm² [18].
The neural/electrical activity detectable by EEG is the sum of the excitatory
and inhibitory postsynaptic potentials from thousands of pyramidal cells firing
synchronously near each recording electrode. If the cells do not have a similar
spatial orientation, their ions do not line up and thus do not create detectable
waves. This summed activity can be represented as a field with positive and
negative poles (dipole). The dipole vector is parallel to the orientation of the
pyramidal cells that generate the activity [18, 19]. Negative dipoles are mostly
detected when they are perpendicular and pointed directly at a recording electrode.
The positive end of the dipole is subcortical and thus can be recorded only with
deep electrodes (e.g., by intracranial EEG) [20].
Conventional scalp EEG is unable to record spontaneous changes in local field
potential arising from neuronal action potentials. Because voltage fields fall off
with the square of distance, activity from deep sources is more difficult to detect
than currents near the skull [18, 20].
Cerebral voltages must traverse the brain, cerebrospinal fluid, meninges,
skull, and skin prior to reaching the recording site where they can be detected.
Cortical synaptic action generates electrical signals that change in the 10- to
100-millisecond range. EEG and magnetoencephalography (MEG) are the only widely
available technologies with sufficient temporal resolution to follow such rapid
dynamic changes.
2.1.2 Normal and abnormal EEG
The electrical activity measured by EEG is caused by the activation of neurons,
but if these neurons are activated abnormally, sudden impulses can occur, which
are defined as seizures. An EEG waveform is normal when the EEG recording
does not show unusual seizures. The waveform exhibits unusual characteristics,
such as frequent, long, or continuous seizures, when the subject is affected by a
tumor or brain disorder [18, 21].
Abnormal activity can be separated into epileptiform and non-epileptiform activity.
Focal abnormal non-epileptiform activity can occur in areas of the brain where
there is focal damage to the cortex or white matter. It consists of an increase
in slow-frequency rhythms and/or a loss of normal higher frequency rhythms
[21,22].
EEG waveforms are generally classified according to their frequency,
amplitude, and shape, but the most familiar classification uses the EEG waveform
frequency. This EEG waveform information is dependent on the subject’s age and
state of alertness and location of the electrodes on the scalp.
2.1.2.1 EEG frequency bands
The frequency of the EEG waveforms is important because the predominant
frequencies vary according to the subject’s condition. Frequency bands are
typically within the range of 0.5 to 32 Hz. However, these frequency bands
may vary slightly depending on the laboratory/headset and can be broken down
into more limited components as required by the research or clinical question.
There are five commonly used frequency bands that are examined by spectral
analysis: alpha, beta, theta, delta, and gamma. However, there is no consensus
in the literature on what the ranges should be. For example, the values for the
upper end of alpha and the lower end of beta include 12, 13, 14, and 15 Hz [18, 23].
Frequencies above 25 Hz are not commonly found on scalp EEG, but can be seen
arising directly from the cortical surface during intracranial recordings; these
frequencies are called gamma and are divided into low (25-70 Hz) and high
gamma (> 70 Hz) [18, 24, 25]. Below, a brief overview of the five main frequency
bands, including important points and frequency ranges, is presented.
Delta: frequency range of 0.5-4 Hz. This activity is positively associated
with the homeostatic sleep drive in such a way that it increases
concomitantly with increasing time spent awake [26]. It tends to have
the highest amplitude and the slowest waves. It is seen normally in adults
in slow-wave sleep. Temporal intermittent rhythmic delta activity (TIRDA)
is frequently seen in individuals who have temporal lobe epilepsy [27].
Theta: frequency range of 4-8 Hz. This activity is similar to delta activity
and is positively associated with the homeostatic sleep drive [26]. It has been
associated with reports of relaxed, meditative, and creative states. Excess
theta activity for age represents abnormal activity, and focal theta activity
during awake states is suggestive of focal cerebral dysfunction [28].
Alpha: frequency range of 8-12 Hz. This activity is positively associated
with relaxed wakefulness and drowsiness associated with the onset of sleep,
and is also present during REM sleep [29-31]. Hans Berger named the
first rhythmic EEG activity he observed the “alpha wave”. Deceleration
of the background alpha rhythm is considered to be a sign of generalized
brain dysfunction [32]. The amplitude of the alpha rhythm varies between
individuals, as well as at different times in the same individual [31]. It is best
seen with the eyes closed and during mental relaxation and is attenuated
by eye-opening and mental effort.
Beta: frequency range of 13-30 Hz. This activity is the dominant rhythm of
subjects who are alert or anxious or who have their eyes open. It is the most
frequently seen rhythm in normal adults and children and is associated
with physiological arousal and psychological stress [33]. This activity is
closely linked to motor behavior and is generally attenuated during active
movement [34]. The amplitude of beta activity is typically 10-20 µV, rarely
increasing above 30 µV.
Gamma: frequency range of approximately 30-100 Hz, consisting of
ripples (80 to 200 Hz) and fast ripples (200 to 500 Hz). Ultra-fast EEG
activity correlates with cognitive states and ERPs. It has been attributed
to sensory perception that integrates different areas. There has been
extensive research on high-frequency oscillations, particularly in relation
to epilepsy [24, 25, 35]. Epileptic foci are known to generate very
high-frequency episodes of activity. Intracranial depth recordings of the epileptic
hippocampus have reported ultra-fast frequency bursts or fast waves,
which probably correlate with the local epileptogenicity of brain tissue
[35]. Subdural recordings during presurgical evaluation of epilepsy have
demonstrated that activity bursts at a relatively lower frequency range (60
to 100 Hz) may likewise indicate the location of an epileptic focus [28, 35].
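As an illustration of how the bands listed above are used in spectral analysis, the following minimal sketch sums a plain periodogram within each band for a single channel. It is not the feature extraction used in the thesis (which relies on DWT and EMD); the band edges are taken from the list above, while the function name `band_powers`, the 250 Hz sampling rate, and the synthetic 10 Hz test signal are assumptions made for the example. As noted, band limits vary between laboratories, so the dictionary should be adapted as needed.

```python
import numpy as np

# Band edges in Hz as listed above; conventions vary between laboratories.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}


def band_powers(signal: np.ndarray, fs: float) -> dict:
    """Sum the (unnormalized) periodogram of one channel within each band."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                  # remove the DC offset
    spectrum = np.abs(np.fft.rfft(signal)) ** 2      # power at each frequency bin
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    powers = {}
    for name, (low, high) in BANDS.items():
        in_band = (freqs >= low) & (freqs < high)
        powers[name] = float(spectrum[in_band].sum())
    return powers


# Synthetic example (assumed values): a 10 Hz sinusoid sampled at 250 Hz for 4 s.
fs = 250.0
t = np.arange(0, 4.0, 1.0 / fs)
powers = band_powers(np.sin(2 * np.pi * 10.0 * t), fs)
dominant = max(powers, key=powers.get)
```

The values are relative powers only; comparing them across recordings would require a normalized estimate such as a power spectral density.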
2.1.2.2 Artifacts
Electrical signals detected on the scalp by an EEG sensor, but which are
non-cerebral in origin, are called artifacts. Artifacts originate from both physiological
and non-physiological sources, of which physiological artifacts arise from a variety
of bodily activities and non-physiological artifacts from outside the human body
[36-38].
The most highly studied artifacts include eye-induced artifacts, which
include eye blinks, eye movements, and extra-ocular muscle activity,
electrocardiograph (ECG) artifacts, which are related to heart beat (cardiac
electrical activity), electromyography (EMG)-induced artifacts, which are
related to muscle activation, and glossokinetic artifacts from tongue movement.
Respiration can also cause artifacts by introducing rhythmic activity that is
synchronized with the respiratory movements of the body. Skin responses, such
as sweating, can alter the impedance of the electrodes and cause artifacts in EEG
signals [18, 37, 39].
Certain artifacts are essential for understanding brain function but many are
not and limit the interpretation of the EEG. Artifact removal is the process of
identifying and removing artifacts from brain signals. This can be accomplished by
applying frequency-band and spatial filters but artifacts can overlap with the signal
of interest in the spectral domain. An artifact-removal method should be able to
remove the artifacts while keeping the related neurological phenomenon intact.
The first step in managing artifacts is to prevent them from occurring by issuing
proper instructions to users. For example, users are instructed to avoid blinking
or moving their body during data collection. Some of the common methods for
removing artifacts in EEG signals are linear filtering, linear combination and
regression, blind source separation (BSS), independent component analysis (ICA),
and principal component analysis (PCA) [37-40].
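Of the methods listed, regression is the simplest to sketch. The example below assumes that a reference channel capturing the artifact (for instance, a hypothetical EOG channel) was recorded alongside the EEG, which the text above does not state; the function name `regress_out` and the synthetic signals are likewise illustrative only. Each EEG channel has its least-squares projection onto the reference subtracted.

```python
import numpy as np


def regress_out(eeg: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Subtract the least-squares projection of `reference` from each EEG channel.

    eeg: array of shape (n_channels, n_samples)
    reference: array of shape (n_samples,), e.g. a hypothetical EOG channel
    """
    eeg = np.asarray(eeg, dtype=float)
    ref = np.asarray(reference, dtype=float)
    ref = ref - ref.mean()
    centered = eeg - eeg.mean(axis=1, keepdims=True)
    # Per-channel propagation coefficient b = <x, ref> / <ref, ref>
    coeffs = centered @ ref / (ref @ ref)
    return centered - np.outer(coeffs, ref)


# Synthetic check (assumed values): a 10 Hz signal plus a slow drift-like artifact.
rng = np.random.default_rng(0)
t = np.arange(1000) / 250.0
brain = np.sin(2 * np.pi * 10.0 * t)
artifact = rng.standard_normal(t.size).cumsum()
contaminated = np.vstack([brain + 0.5 * artifact, brain - 0.2 * artifact])
cleaned = regress_out(contaminated, artifact)
```

A known limitation of this approach is that any brain activity correlated with the reference channel is removed as well, which is one reason why BSS and ICA are often considered alongside it.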
Figure 2.1: EEG electrode placement methods: bipolar (a) and monopolar (b).
2.1.3 EEG signal acquisition
EEG uses the principle of differential amplification, or recording of voltage
differences between different points using a pair of electrodes that compares
an active scanning electrode site with another neighboring or distant reference
electrode. This can be accomplished using monopolar or bipolar recordings, in
which measuring differences in electrical potential generates detectable EEG
waveforms [41, 42].
The difference between monopolar and bipolar recordings is the location of
the electrodes. In bipolar recordings, the electrodes are both placed on the scalp,
i.e., in the area of interest, whereas in the monopolar electrode placement method,
one of the measurement electrodes is placed on the scalp and the other is located
away from the area of interest (see Fig. 2.1).
In both cases, the amplifier captures the difference between the respective
activity at each site. Both are in fact bipolar recordings, in the sense that there
are two inputs to the amplifier. When the second electrode is placed on an EEG
neutral site, the recording is considered to be monopolar (also known as referential),
because only one site is believed to be capturing the EEG data. If both electrodes
are placed over sites that capture active EEG data, the recording is called bipolar
(also called sequential or differential) [42].
There are several reasons why monopolar recordings are recommended for
surface EEG recordings. One reason is that, because the bipolar or differential amplifier
rejects everything that is common to both electrodes, it will reject any common
EEG activity, which is far less present in monopolar recordings. Another reason
is that a bipolar recording can be derived from a monopolar recording using
simple arithmetic, whereas a bipolar recording can never be transformed into a
monopolar one [43].
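The "simple arithmetic" mentioned above is a subtraction: if two channels are recorded against the same reference, (V_a - V_ref) - (V_b - V_ref) = V_a - V_b, so the shared reference term cancels. The sketch below illustrates this with hypothetical values; the channel names, the amplitudes, and the helper `bipolar` are assumptions made for the example.

```python
import numpy as np

# Hypothetical monopolar (referential) recording: each row is one electrode
# measured against the same reference. Values are arbitrary, in microvolts.
channels = ["F3", "C3", "P3"]
monopolar = np.array([
    [12.0, 15.0, 11.0],   # F3 - reference
    [10.0,  9.0, 14.0],   # C3 - reference
    [ 7.0,  8.0,  6.0],   # P3 - reference
])


def bipolar(data: np.ndarray, names: list, a: str, b: str) -> np.ndarray:
    """Return the bipolar derivation a-b; the common reference cancels out."""
    return data[names.index(a)] - data[names.index(b)]


f3_c3 = bipolar(monopolar, channels, "F3", "C3")   # [ 2.,  6., -3.]
c3_p3 = bipolar(monopolar, channels, "C3", "P3")   # [ 3.,  1.,  8.]
```

Because the subtraction discards the shared reference term, the individual referential signals cannot be recovered from the differences alone, which is consistent with the asymmetry described above.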
2.1.4 A brief comparison with other brain signal acquisition
methods
There are several brain-imaging methods available for neuroscientists and
researchers. These imaging modalities can be divided into structural and functional
imaging techniques. They all allow the study of brain structures and their function
but differ in the spatial and temporal resolution at which connectivity is captured.
Structural imaging provides details on the morphology and structure of tissues,
whereas functional imaging reveals physiological activities, such as changes in
metabolism, blood flow, regional chemical composition, and absorption.
Non-invasive EEG and MEG reflect the average activity of dendritic currents in
a large population of cells. The temporal resolution of EEG and MEG for measuring
changes in neuronal activity is very good, typically on the order of milliseconds,
but the spatial resolution for determining the precise position of active sources
in the brain is poor relative to modern imaging methods, such as computerized
tomography (CT), positron emission tomography (PET), and magnetic resonance
imaging (MRI) [17, 44].
Despite its limited spatial resolution, EEG is still a valuable tool for research and
diagnosis. It is one of the few mobile techniques available and offers
millisecond-range temporal resolution that is not possible with CT, PET, or MRI. The poor
spatial resolution, particularly for sources deeper in the brain, is due to the spatial
mixing of electrical activity generated by different cortical areas and the passive
conductance of these signals through brain tissue, cerebrospinal fluid, bone, and
skin/scalp [17, 19, 44]. Additionally, these measurements are very susceptible
to artifacts arising from muscle and eye movements. Invasive versions of EEG
improve spatial resolution by placing subdural and/or deep electrodes for a more
direct recording of spontaneous or evoked neural activity.
Functional magnetic resonance imaging (fMRI) measures changes in blood
hemoglobin concentrations associated with neural activity, based on the
differential magnetic properties of oxygenated and deoxygenated hemoglobin.
fMRI has much better spatial resolution than EEG and MEG, but the temporal
resolution is poor, which puts an upper bound on the bit rate for fMRI in BCI
applications. Recently, an approach was presented that uses intracranial EEG
(iEEG) that can collect as much data as fMRI, but using a portable device inside a
backpack [45]. This will allow the study of brain function of subjects while they
are interacting with others, rather than inside an fMRI machine.
Since the inception of EEG, various standards and guidelines have been
proposed for electrode placement to ensure signal integrity and repeatability
of recordings, as described below.
2.1.5 International EEG electrode placement systems
H.H. Jasper studied possible methods to standardize electrode placement, resulting
in the definition of the 10-20 international system, which consists of 21 electrodes
placed at distances of 10% and 20% along certain contours over the scalp, as
illustrated in Fig. 2.2 [2]. Since then, the 10-20 international system has become
the standard for the study of EEG and ERPs in both clinical and non-clinical
settings. Later, the extended 10-20 or 10-10 system was proposed to extend the
number of channels from 21 up to 74. These systems simply extend the number of
electrodes by placing them at every 10% along the medial-lateral contours and by
introducing new contours in between the existing ones [46].
The extended 10-20 or 10-10 system has been accepted and endorsed as the
standard of the American Electroencephalographic Society and the International
Federation of Societies for Electroencephalography and Clinical Neurophysiology
[4, 5]. There is a proposed extension to accommodate a larger number of electrodes,
known as the 10-5 system, which includes the 10-20 system and 10-10 system
locations, enabling the use of more than 300 electrode locations [3].
In all cases, the electrode names consist of one or more letters and a number,
with the electrodes on the left being odd numbered and the electrodes on the
right even numbered. The electrodes at the center, or midline, are designated by
the letter z, indicating that the electrode is neither even nor odd. The electrodes
nearest the midline have the smallest numbers and the numbers increase towards
the side. The letters indicate the location on the head: Fp: frontal pole,
F: frontal, C: central, T: temporal, P: parietal, O: occipital.
Additionally, combinations of two letters indicate intermediate locations, e.g.,
FC: in between frontal and central electrode locations, PO: in between parietal
and occipital electrode locations.
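The naming convention described above can be expressed as a short parsing routine. The sketch below covers only the region codes listed in the text and treats two-letter combinations of those codes as intermediate locations; the function name `describe_electrode` and the regular expression are illustrative assumptions, and labels introduced by extended systems are deliberately not handled.

```python
import re

# Region codes listed in the text above; other labels used by extended
# systems are deliberately left out of this sketch.
REGIONS = {
    "Fp": "frontal pole", "F": "frontal", "C": "central",
    "T": "temporal", "P": "parietal", "O": "occipital",
}

# Letters (shortest possible prefix) followed by 'z' or a number.
NAME_PATTERN = re.compile(r"([A-Z][A-Za-z]*?)(z|\d+)")


def describe_electrode(name: str) -> tuple:
    """Return (region, side) for a label such as 'Fp1', 'Cz', or 'PO4'."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized electrode name: {name!r}")
    letters, suffix = match.groups()
    if letters in REGIONS:
        region = REGIONS[letters]
    elif len(letters) == 2 and all(ch in REGIONS for ch in letters):
        # Two-letter combinations mark intermediate locations, e.g. FC, PO.
        region = f"between {REGIONS[letters[0]]} and {REGIONS[letters[1]]}"
    else:
        raise ValueError(f"unknown region code: {letters!r}")
    if suffix == "z":
        side = "midline"
    else:
        # Odd numbers are on the left, even numbers on the right.
        side = "left" if int(suffix) % 2 == 1 else "right"
    return region, side
```

For example, `describe_electrode("Fp1")` gives `("frontal pole", "left")` and `describe_electrode("PO4")` gives `("between parietal and occipital", "right")`.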
Figure 2.2: The original figure illustrating the international 10-20 system. Note
that the electrodes are erroneously located inside the skull on the surface of the
cortex [2].
2.1.6 Consumer-grade low-density EEG headsets
High-density EEG uses a dense array of EEG channels, in which the number of
electrodes can vary from 32 to 256 or more [47-49]. However, there is no fixed
number of channels that defines a low-density EEG headset. The 21 channels from
the 10-20 international system are considered to be low-density and, in some studies,
the authors considered low-density EEG to consist of arrays with 25 channels [50],
whereas others used arrays of 32, 16, or 8 channels [51]. In this context, EEG
can be considered low-density when fewer than 32 channels are used.
There is currently a wide range of consumer-grade EEG headsets available
that follow the 10-20, 10-10, or 10-5 system [52, 53]. A review published in 2015
provides information about the headsets Emotiv, NeuroSky, interaXon (Muse), and
OpenBCI, which are mainly used for cognitive studies, BCI research, education,
and gaming [52]. Interestingly, Emotiv products are popular for cognitive studies
and gaming, NeuroSky dominates the educational field, and published BCI research
has only used Emotiv and OpenBCI headsets. In [54] there is a review of various
BCI applications and cognitive neuroscience research using Emotiv up to 2019,
showing that most of the research has come from the United States, India, China,
Poland, and Pakistan. Fig. 2.3 presents a timeline of the evolution of EEG systems
since the time of Hans Berger and several relevant consumer-grade EEG headsets.
20 Fundamentals of Electroencephalography, evolution, and open challenges
Figure 2.3: Timeline of the evolution of EEG systems and relevant consumer-grade
wearable EEG headsets.
Fig. 2.3 starts with the first recording of human EEG signals, which Hans Berger
performed in 1924 using two white needle-shaped electrodes and reported in 1929.
High-density EEG was the starting point for analysis in certain
applications and prompted the publication of international standards, beginning with
the international 10-20 system and followed by subsequent standards that place
electrodes in the middle of and around this first system.
Fig. 2.3 also presents the sets of channels found in this thesis, which are
described later in Chapters 4 and 5. As explained in Chapter 1, the thesis focused on two
main applications, epileptic seizure classification and EEG-based biometric
systems, finding that a set of 1-3 EEG channels can be used for epileptic seizure
classification, and 1-4 EEG channels for creating EEG-based biometric systems.
Various consumer-grade wearable EEG headsets using dry or wet electrodes
have gradually emerged, featuring different channel configurations or even flexible
solutions, such as the OpenBCI. Indeed, there is evidence that it is possible
to obtain results similar to those of medical-grade equipment using the OpenBCI
with dry electrodes [55]. However, work is still needed to improve the recording
quality and increase the sample rate, which for the OpenBCI is limited to 250 Hz
for a maximum of eight channels, or 125 Hz if more are used.
There are various areas of application for which the creation of new EEG
headsets could be interesting, but the comparison of static versus
movable EEG electrodes in a single headset for different applications needs
further exploration, as discussed in [56–58]. Recently, a research project entitled
FlexEEG was presented, which aims to achieve real-time BCI with brain mapping
capabilities [58]. The FlexEEG concept differs from standard high-density
EEG in that it involves dynamically scanning the human scalp to obtain the
minimum required recordings, rather than having electrodes attached to the scalp,
as illustrated in Fig. 2.4. The work in this thesis can contribute to the realization
of such a low-density EEG array by providing the software that can identify the
minimum EEG channel count required for a given neuro-paradigm.
2.1.7 Using brain signals for control purposes
Technological progress has allowed the analysis of EEG to move from pure
visual inspection of amplitude and frequency modulation to a more rigorous
and automatic exploration of the temporal and spatial features of the recorded
signals. As a result, EEG is accepted as a powerful tool to capture brain function
and has been shown to be valuable in clinical diagnosis, e.g., the identification of
epilepsy and sleep and mental disorders, the evaluation of various dysfunctions,
etcetera [17, 44].
Figure 2.4: FlexEEG concept. FlexEEG moves from X1 to X2 to capture sources
S1 and S2 [58].
Since the rst proposal to use EEG signals to control external devices (i.e.,
prosthetic arms) [
59
], eorts to improve the interpretation of brain signals through
EEG signals, and thus establish more robust control over external devices, have
rapidly increased [60,61].
The assumption that invasive methods can provide better performance has not
been completely supported by the results of several studies [62–66], which have
shown that the control of movement obtained with scalp-recorded sensorimotor
rhythms falls in the same range in terms of speed and precision as the control
obtained with invasive methods [63].
Recently, several approaches using invasive methods have been presented that
allow subjects to control a prosthetic limb with 10 degrees of freedom (three-dimensional
(3D) translation, 3D orientation, four-dimensional hand shaping) [67]. However,
this required two 96-channel intracortical electrode arrays implanted in the
subject’s left motor cortex.
The processes followed for invasive and non-invasive methods, the assumptions made,
and the results obtained in each case are too different to allow a good comparison of
invasive and non-invasive methods. For example, current non-invasive studies
suggest that a spelling protocol that uses a goal-selection approach (such as the
P300-speller) may be faster and more reliable than a spelling protocol that uses a
process-control approach [60, 61, 68].
The most appropriate protocol and paradigm need to be selected following
careful analysis, according to the purpose of the BCI. Numerous
paradigms are available, such as motor imagery paradigms,
external stimulation paradigms (e.g., P300), error-related potentials, etcetera [69].
Once a paradigm is selected, it is necessary to create a training set, which
can be task-dependent or task-independent (during the resting-state), and to collect
the EEG data for creating the models using mathematical methods. New EEG
data are then collected while the subject performs the same task (or during the
resting-state), the created model is used to predict the task, and the predicted task
is used for BCI control.
2.2 EEG paradigms
Paradigm selection is important and must be associated with the purpose of
the EEG-based control application or EEG-based controller or BCI. Below, one
important paradigm and several relevant aspects about the resting-state, which
are referred to throughout the thesis, are described.
2.2.1 Event-related potentials and P300
ERPs are very small voltages that appear on the scalp as a response of the human
brain to specific events or stimuli and that are time- and phase-locked to them. They have
been used to evaluate brain function and the response to stimuli. The recorded signals
include both spontaneous electrical activity of the cerebral network and the cortical
response to external or internal events.
Figure 2.5: Schematic representation of certain ERP components after the onset of
a visual stimulus [72].
ERPs produce several well-known patterns (see Fig. 2.5). One of the most
extensively studied and used for BCIs is the P300 peak, also known as P3 [69–71].
The P300 component is elicited in response to infrequent events using what is
known as an oddball paradigm. It consists of a positive peak in the ERP ranging
from 5 to 10 µV in amplitude, with a latency between 220 and 500 ms after the onset
of the stimulus, and is most significant at central-parietal scalp and midline skull
locations, i.e., Pz, Cz, and Fz in the 10-20 international system. Normally, hundreds
of ERPs are generated, collected, and averaged to visually distinguish the P300
peak from the background activity, thus cancelling the influence of noise.
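The effect of averaging can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration only: the 256 Hz sampling rate, the Gaussian-shaped 7.5 µV bump centred at 300 ms standing in for the P300, the noise level, and the number of epochs. It is not the procedure used in the thesis.

```python
import math
import random

FS = 256.0          # assumed sampling rate in Hz
N_SAMPLES = 256     # one-second epoch starting at stimulus onset

def simulate_epoch(rng, noise_sd=10.0):
    """Synthetic epoch: a 7.5 uV bump centred at 300 ms plus Gaussian noise."""
    epoch = []
    for i in range(N_SAMPLES):
        t = i / FS
        p300 = 7.5 * math.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))
        epoch.append(p300 + rng.gauss(0.0, noise_sd))
    return epoch

def average_epochs(epochs):
    """Sample-wise mean over epochs that are time-locked to the stimulus."""
    n = len(epochs)
    return [sum(e[i] for e in epochs) / n for i in range(N_SAMPLES)]

def spread(values):
    """Standard deviation, used here to measure the residual noise level."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

rng = random.Random(42)
epochs = [simulate_epoch(rng) for _ in range(400)]
erp = average_epochs(epochs)

peak_index = max(range(N_SAMPLES), key=lambda i: erp[i])
peak_latency_ms = 1000.0 * peak_index / FS
baseline_single = spread(epochs[0][:30])   # first ~117 ms of one epoch
baseline_average = spread(erp[:30])        # same window after averaging
```

Because the noise is independent across epochs while the evoked bump is time-locked, the noise level of the average falls roughly with the square root of the number of epochs, which is why the peak becomes visible only after averaging.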
The P300-speller paradigm was developed with the initial aim of restoring
communication to patients in a locked-in state [73] and normally consists of an
N × N matrix of characters that is presented to the subject in random sequences of
intensified (flashed) columns and rows, thus constituting an oddball paradigm
[70, 73].
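As a rough illustration of why row and column flashing forms an oddball paradigm, the sketch below generates one repetition of flashes for a hypothetical 6 × 6 matrix. The matrix size, the target position, and the function names are assumptions for this example and are not taken from [70, 73].

```python
import random

def flash_sequence(n, rng):
    """One repetition: every row and every column is intensified once, in random order."""
    flashes = [("row", i) for i in range(n)] + [("col", j) for j in range(n)]
    rng.shuffle(flashes)
    return flashes

def is_target_flash(flash, target_row, target_col):
    """A flash is a target event if it contains the attended character."""
    kind, index = flash
    return index == (target_row if kind == "row" else target_col)

rng = random.Random(0)
sequence = flash_sequence(6, rng)
targets = [f for f in sequence if is_target_flash(f, target_row=2, target_col=4)]
target_rate = len(targets) / len(sequence)   # 2 of the 2N flashes, i.e. 1/N
```

Only the row and the column containing the attended character are target events, so targets are infrequent (1 in N flashes), which is the condition under which the P300 is elicited.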
An important advantage of P300 for a BCI is that most subjects can use it with
very high accuracy and it can be calibrated in a few minutes, which means that
subjects can use BCI systems to control devices quickly. However, disadvantages
of this paradigm are that it may produce fatigue and that subjects with visual
impairment are not able to use BCIs based on this paradigm [73–76].
2.2.2 Resting-state
The resting-state, also called resting-state activity, is typically used to analyze
problems related to the subject’s internal state of mind. A stable resting-state does
not necessarily exist, because spontaneous changes in regional neuronal firing
occur even when the organism is apparently in the resting-state [77].
In addition, spontaneous activation can change local blood flow and cause
low-frequency blood oxygenation level-dependent signal fluctuations [78]. In
other words, the brain is never truly at rest [79], and the term only refers to the
absence of goal-directed neuronal action that integrates information from
the external environment and the subject’s internal state, that is, to periods in which the
subject is not actively engaged in sensory or cognitive processing.
Brain activity can be studied in the resting-state in children or patients who
would otherwise be unable to complete long experiments or perform complex
cognitive tasks and the simplicity of the procedure for collecting EEG signals has
also facilitated the replication of experiments and comparison of results.
The resting-state is typically used to analyze clinical or psychological problems
[80–82] and in most real-time implementations of BCI approaches, as it
is necessary to differentiate between the tasks associated with the paradigm and
the resting-state [83]. The resting-state can also be used for various EEG-based
systems [83–87].
Most resting-state features from EEG consist of ongoing amplitude-modulated
oscillations in the approximate frequency range of 0.5-70 Hz [88]. There is evidence
that the alpha frequency band of multi-channel resting-state EEG signals
can be parsed into a set of discrete states, called microstates, which are defined
by topographies of electrical potentials and remain stable for 80–120 ms before
rapidly transitioning to a different microstate [89, 90].
Resting-state EEG microstates reect neural activity in a task-negative state,
which is considered to be primarily involved in involuntary actions. Brain regions
exhibiting functional connectivity are organized into discrete networks associated
with distinct functions. Among them are a host of so-called resting-state networks
(RSNs), which represent functionally connected areas that are active in the task-
negative state [
90
]. One such network is the
default-mode network, which is
active in the task-negative state
but becomes deactivated in a wide array of
cognitive tasks [91].
Interestingly, only four predominant topographies occur during the
resting-state, and all can be reliably identified in healthy individuals throughout their
life span and explain most of the global topographical variance [92, 93], as shown in
Fig. 2.6. However, several studies have been published that show more than four
microstates [94]. This can all influence the selection of the most relevant channels
for extracting information in BCI applications.
Figure 2.6: Topography of four microstate maps from [92]. Map areas of opposite
polarity are coded in red and blue using a linear color scale. The left ear is to the
left and the nose is at the top.
Fig. 2.6 presents the eyes-closed resting-state EEG microstates from [92], which
consist of four classes of microstates: class A, with a left occipital to right frontal
orientation; class B, with a right occipital to left frontal orientation; class C, with
a symmetrical occipital to prefrontal orientation; and class D, also symmetrical,
but with a fronto-central to occipital axis. The resting-state microstates are shown
to move around the sensorimotor areas of the brain, as a way of sensing the brain
through the most important senses of the human body.
A review compared the four microstate maps determined in various
independent studies using a varying number of electrodes, participants, filter
settings, etcetera [95]. The four microstate maps presented were distinct in the
studies but highly reproducible, with the similarities of class A and class B being
clearer.
As will be shown in Chapter 5, the channel distribution found by the
optimization process was similar to the four
topographies of the resting-state microstates presented in Fig. 2.6.
2.3 Current and future trends in EEG
There is a growing interest in the use of EEG in medical ambulatory,
non-medical, and wearable applications, such as entertainment, day-to-day mobile EEG,
sports, neuro-assisted learning, and brain-computer interfaces. This will require
the implementation of miniaturized, user-centric, wireless EEG acquisition systems
with ultra-low power dissipation that are robust to motion artifacts. However,
currently available mobile EEG systems are still quite bulky and use structures
with a large number of fixed electrodes, which are not comfortable for day-to-day
mobile EEG monitoring.
There are many fronts on which these requirements can be addressed. Two
central research points in terms of EEG electrodes are the creation of newer
electrode technologies and lower-power electronics. To increase
the battery lifetime of wearable EEG devices, research is also being carried out
on data reduction approaches. For example, in the diagnosis of epilepsy, data
reduction techniques have been used to extend the battery life of wearable EEG
devices through intelligent selection and transmission of only the EEG data relevant
for diagnosis [96].
There is a trend towards applying combined sets of features, which can produce
better classification performance than using features independently [97].
Future directions should combine machine learning and traditional approaches
for effective automatic artifact removal [98]. One of the main concerns regarding
EEG and BCIs is that almost all published experiments have been performed in a
controlled laboratory, whereas what is needed is improved artifact removal in
daily-life EEG-BCI. This is also important for the use of dry electrodes, for which
more research is clearly needed [99, 100]. When designing new EEG headsets, it is
important to thoroughly examine the basic criteria of the system, environmental
aspects, situation, and target users/applications [98, 101].
For certain applications and environments, the trend is towards higher sample
rates and more recording channels. However, for low-power, easy-to-use portable
systems, the channel count needs to be minimized without affecting the accuracy
of manual/visual inspection and machine-learning-based applications [99].
The integration of brain monitoring based on EEG into everyday life has
been hindered by the limited portability and long setup time of current wearable
systems, as well as the invasiveness of implanted systems. There is a current
trend towards exploring the potential of recording EEG in the ear canal for brain
monitoring, which is known as in-the-ear EEG (Ear-EEG) [102, 103]. Ear-EEG has
been presented as a system that promises a number of advantages, including fixed
electrode position, user comfort, robustness to electromagnetic interference, and
ease of use, and that can be used for long-term monitoring [102].
Research eorts are ongoing to make EEG devices smaller, more portable, and
easier to use. The so-called wearable EEG is based on the creation of low-power
28 Fundamentals of Electroencephalography, evolution, and open challenges
wireless collection electronics and dry electrodes that do not require a conductive
gel for use [
104
,
105
]. Wearable EEG aims to provide small EEG devices that are
present only on the head and can record for days, weeks, or months, as promised
by ear-EEG [100,102].
In general, wearable EEG is envisioned as the evolution of ambulatory EEG
units from the bulky, limited-life devices available today to small devices. Such
miniaturized devices will enable long-term monitoring of diseases, such as epilepsy
and various mental disorders, as well as improve end-user acceptance of BCI
systems [100,102,105].
Future wearable EEG systems should be unobtrusive, lightweight, discreet,
and durable, which can be achieved by eliminating the large ambulatory EEG
recording units and the wires that attach them to the electrodes. These will be
replaced by microchips containing the necessary amplifiers, quantizers, and
wireless transmitters, which are mounted on top of the electrodes. EEG data
will then be transmitted wirelessly to a suitable mobile phone or similar device,
which people often keep a short distance from themselves [104, 105].
In some cases, such as epilepsy diagnosis, wireless transmission of EEG data is
not strictly necessary, as data analysis is normally performed after data collection,
but wireless transmission will be necessary for future applications in predicting
epileptic seizures and their automatic treatment. Even wireless connections
between electrodes are desirable to enable miniaturization [100, 104, 105].
Chapter 3
Materials and Methods
This chapter introduces the concepts that provide the basis for the thesis
contributions and a summary of the datasets used, as well as a flowchart describing
the proposed methods for feature extraction and classification. The proposed methods
for channel-count optimization used in the case studies are presented.
As introduced in Chapter 1, this chapter presents a comprehensive view of the methods and
tools used to achieve the objectives of the thesis. Fig. 3.1 presents the stages
followed, which include the EEG datasets (a), pre-processing and feature extraction
(b), the classifiers used (c), and the various methods for channel reduction and
selection (d). Each necessary step is presented and explained below for the datasets
used, which are presented in Section 3.6.
3.1 Improving the signal-to-noise ratio
As introduced in Section 2.1.2.2, EEG signals can be contaminated by various
sources of artifacts or noise produced by body movement, EMG, ECG, eye
movements, sweating, power lines, impedance fluctuations, cable movements,
etcetera [106]. Therefore, an important step before analyzing EEG signals is to
enhance the signal-to-noise ratio, for which there are several spatial filtering
techniques [38, 107–109]. Among the simplest and most used methods are the
Common Average Reference (CAR) and Laplacian Filter (LAP) [110–112].
Figure 3.1: Stages of the methodology followed in the thesis.
In this thesis, the signal-to-noise ratio of the EEG signal was improved using
the CAR method, which removes the information that is common to all
simultaneously recorded electrodes. The CAR-referenced potential $V_i^{CAR}$ of an EEG
channel, where $i$ is the number of the channel, is computed as follows:
$$V_i^{CAR} = V_i^{ER} - \frac{1}{n} \sum_{j=1}^{n} V_j^{ER} \qquad (3.1)$$
where $V_i^{ER}$ is the potential between the $i$th electrode and the reference, and $n$ is
the number of electrodes.
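Eq. 3.1 amounts to subtracting, at every time sample, the mean over all channels from each channel. The following minimal sketch shows this on a small made-up recording; the function name and the sample values are hypothetical and serve only to illustrate the formula.

```python
def common_average_reference(eeg):
    """Apply Eq. 3.1: subtract the across-channel mean from every channel.

    eeg: list of channels, each a list of samples of equal length
         (potentials measured against the recording reference).
    """
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    # Mean over channels at each time sample: (1/n) * sum_j V_j^ER
    common = [sum(ch[t] for ch in eeg) / n_channels for t in range(n_samples)]
    return [[ch[t] - common[t] for t in range(n_samples)] for ch in eeg]


# Hypothetical 3-channel, 4-sample recording (values in microvolts).
recording = [
    [10.0, 12.0, 11.0, 9.0],
    [14.0, 13.0, 15.0, 12.0],
    [6.0, 8.0, 7.0, 6.0],
]
car = common_average_reference(recording)
```

After re-referencing, the channels sum to zero at every time sample, which is a quick sanity check for any CAR implementation.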
After removing the noise from the EEG signals, they can be processed using data
transformation techniques, such as EMD or DWT, to decompose the signals into
different frequency bands and thus extract relevant features from each sub-band,
as explained below.
3.2 Data analysis
Data analysis helps to reveal information hidden in the data. It refers to the
process of manipulating and transforming data from one format,
structure, or domain to another. For example, data analysis techniques can be used
to convert a signal from the time-amplitude to the time-frequency or
amplitude-frequency domain, and vice versa. This process can increase the value and
efficiency of analytical or feature extraction procedures. When working with noisy
raw data, the extraction of a handful of fundamental features (mean, variance,
slope, etc.) is not generally sufficient, but valuable information can be extracted by
manipulating or transforming the data. When working with EEG signals, feature
extraction techniques can be time-based, frequency-based, or
time-frequency-based. Time-frequency-based features are used more frequently, as they can
simultaneously provide information about the time and frequency of the EEG
signals. EMD and DWT are among the most popular and useful feature extraction
techniques [113–115].
3.2.1 Empirical Mode Decomposition
EMD is an adaptive data analysis method used for decomposing non-linear and
non-stationary signals, which may be mono-component or multi-component, into
a finite number of amplitude- and frequency-modulated zero-mean signals without
leaving the time domain. These signals are called Intrinsic Mode Functions (IMFs)
and satisfy two conditions [116]:
1. The number of extrema and the number of zero crossings must be either
equal or differ at most by one.
2. At any point, the mean value of the envelope defined by the local maxima
and the envelope defined by the local minima is zero.
The method decomposes a signal into oscillatory components by applying a
process called sifting, making EMD a data-driven method that does not depend
on any a priori defined system. This process removes riding waves and makes
the wave-profile more symmetrical [116, 117]. EMD decomposes a time-series
$x(t)$ into $n$ IMFs $x_i(t)$ and a residue, such that the signal can be represented and
reconstructed as shown in Eq. 3.2. The procedure is summarized in Algorithm 1.
$$x(t) = \sum_{i=1}^{n} x_i(t) + \mathrm{residue} \qquad (3.2)$$
An important aspect of Algorithm 1 is how to decide whether a given sample is
an upper or lower extremum, since the decision must be based on the relationship
of the sample with its left and right neighbours. The envelopes will
differ depending on the accuracy of the method for finding these upper and
lower extrema, as the sifting process is implemented by connecting all of
the local minima or maxima by a cubic spline to extract the IMFs. Additionally,
the spline used for the interpolation may lead to minor deviations from the true
mean envelope, producing different IMFs. According to [118],
the natural spline is the most reasonable one to select.
During the interpolation process, at least one extremum on each side must be free,
unless the first and last points are simultaneously considered as the maximum
and minimum. This is known as the end effect and can be solved by using mirror
continuation [119–122]. However, this approach requires that the
mirror be placed at an extremum; if it cannot be determined whether
the endpoint is an extremum, part of the data is discarded to place the
mirror at an extremum. The authors in [122] proposed a combination of
support vector machine (SVM) and EMD mirror extension methods to predict
the extrema near the ends of the signal and thus solve the EMD end-effect
problem. Briefly, an SVM model is used to extend the two ends of the original data
to obtain local extrema, then the image in the mirror is mapped to a ring
signal with no endpoints by mirror extension. The stopping criterion is another
important part of EMD, as it determines the number of sifting steps used to produce
an IMF; the sifting process has to be repeated as many times as necessary to
eliminate all riding waves. The choice of stopping criterion is therefore critical to the
successful implementation of EMD.
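The sifting idea of Algorithm 1 can be sketched in a few lines. This is a deliberately simplified illustration and not the implementation used in the thesis: the envelopes are piecewise-linear rather than cubic splines, the ends are held flat instead of using mirror continuation, and the stopping criterion is an assumed fixed number of sifting passes. All function names are hypothetical.

```python
import math
import random

def local_extrema(x):
    """Indices of local maxima and minima, judged from left/right neighbours."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, knots):
    """Piecewise-linear envelope through the knots, held flat at both ends."""
    env = [0.0] * len(x)
    for i in range(knots[0]):
        env[i] = x[knots[0]]
    for i in range(knots[-1], len(x)):
        env[i] = x[knots[-1]]
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b):
            w = (i - a) / (b - a)
            env[i] = (1.0 - w) * x[a] + w * x[b]
    return env

def sift(x, n_sifts=10):
    """Repeatedly subtract the mean envelope (fixed number of passes)."""
    h = list(x)
    for _ in range(n_sifts):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper, lower = envelope(h, maxima), envelope(h, minima)
        h = [v - (u + l) / 2.0 for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_imfs=6):
    """Extract IMFs one by one; what is left over is the residue."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:   # too few extrema to continue
            break
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue

# Two-second synthetic trial at 512 Hz: two tones plus seeded Gaussian noise.
rng = random.Random(0)
fs = 512
signal = [math.sin(3 * math.pi * i / fs) + math.sin(math.pi * i / fs)
          + rng.gauss(0.0, 0.1) for i in range(2 * fs)]
imfs, residue = emd(signal)
rebuilt = [sum(imf[i] for imf in imfs) + residue[i] for i in range(len(signal))]
```

Whatever envelope and stopping rule are chosen, the IMFs plus the residue add back to the input by construction, as stated in Eq. 3.2; the choices only change how the signal is split among the components.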
Mode mixing is another well-known problem encountered during the sifting
process and happens when EMD tries to extract mono-components from a
multi-component signal. In such cases, the sifting process only identifies modes that
clearly contribute their own maxima and minima. Otherwise, EMD will not be able
to separate the mode into a single IMF, and the mode will remain mixed in another
IMF or split between several IMFs [123, 124]. Data affected by the presence of
intermittence and noise can also produce the mode-mixing problem.

Algorithm 1: The sifting process for a signal x(t)
Data: signal = x(t)
Result: IMFs
sifting = True
while sifting = True do
    Identify all upper extrema in x(t)
    Interpolate the local maxima to form an upper envelope u(t)
    Identify all lower extrema of x(t)
    Interpolate the local minima to form a lower envelope l(t)
    Calculate the mean envelope: m(t) = (u(t) + l(t)) / 2
    Extract the mean from the signal: h(t) = x(t) − m(t)
    if h(t) satisfies the two IMF conditions then
        h(t) is an IMF { Add h(t) to IMFs }
        sifting = False { Stop sifting }
    else
        x(t) = h(t)
        sifting = True { Keep sifting }
    end if
    if x(t) is not monotonic then
        Continue
    else
        Break
    end if
end while
There are EMD-based methods for noise removal and for solving the end-effect and
mode-mixing problems. For example, Ensemble EMD (EEMD) defines the true IMFs as
the mean of an ensemble of trials [124]. However, EEMD is not recommended for
real-time applications due to its computational cost [125].
3.2.1.1 IMF selection
Depending on the parameters selected for the EMD method (the spline used for the
interpolation, the method for solving the end-effect problem, etc.) and because
the numerical procedure is susceptible to errors, some IMFs that contain limited
information may appear in the decomposition [126].
There are several approaches for selecting the IMFs that contain the most
relevant information about the signal, e.g., using energy-based techniques or
using a threshold or distance [127–129]. For illustrative purposes, an example
employing the Minkowski (Euclidean) distance ($d_{mink}$) is presented, which is
defined as follows:
$$d_{mink} = \left( \sum_{i=1}^{n} |x_i - y_i|^2 \right)^{1/2} \qquad (3.3)$$
where $x_i$ and $y_i$ are the $i$-th samples of the observed signal and the
extracted IMF, respectively. According to [128], the redundant IMFs have a shape and frequency
content different from those of the original signal, which means that when an IMF
is not appropriate, $d_{mink}$ presents a maximum value.
Fig. 3.2 presents an example using a synthetic signal generated by
$x(t) = \sin(3\pi t) + \sin(\pi t) + white\_noise$, which can be compared to the IMF selection
methods presented in [127, 129]. The example uses
a trial of two seconds with a sample rate of 512 Hz and, for illustrative purposes,
only the three most relevant IMFs according to the Minkowski distance were
selected (the closest three IMFs). However, this number may vary depending on
the nature of the data, sample rate, trial duration, and other factors.
Fig. 3.2 shows that the original signal can be reconstructed by using all the
obtained IMFs, and that it is also well approximated when only the three closest
IMFs and the residue are used. This means that EMD can decompose a signal into
different components and also capture the most relevant information in different
IMFs. This may be important for certain applications, depending on the nature of
the signal, as the use of a large dataset can increase the computational cost.
Therefore, using only the most relevant IMFs, it is possible to extract the main
components (relevant information) from the signal and analyze it further.
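The distance-based selection can be sketched as follows. The three components below are hypothetical stand-ins for IMFs (two tones and a small high-frequency ripple), not the IMFs of Fig. 3.2, and the function names are assumptions made for this example.

```python
import math

def minkowski_distance(x, y, order=2):
    """Eq. 3.3 for order = 2 (the Euclidean distance between two series)."""
    return sum(abs(a - b) ** order for a, b in zip(x, y)) ** (1.0 / order)

def closest_components(signal, components, k=3):
    """Indices of the k components with the smallest distance to the signal."""
    ranked = sorted(range(len(components)),
                    key=lambda i: minkowski_distance(signal, components[i]))
    return sorted(ranked[:k])

# Hypothetical components over a two-second trial sampled at 512 Hz.
fs = 512
t = [i / fs for i in range(2 * fs)]
ripple = [0.05 * math.sin(2 * math.pi * 20 * v) for v in t]   # low-amplitude, 20 Hz
tone_a = [math.sin(3 * math.pi * v) for v in t]
tone_b = [math.sin(math.pi * v) for v in t]
components = [ripple, tone_a, tone_b]
signal = [r + a + b for r, a, b in zip(ripple, tone_a, tone_b)]

selected = closest_components(signal, components, k=2)
```

In this toy case the two tones are closer to the observed signal than the ripple, so they are the ones retained; with real IMFs the number of components to keep remains a choice that depends on the data.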
3.2.2 Discrete Wavelet Transform
A wavelet is a brief, rapidly decaying wave-like oscillation with an amplitude that
begins at zero, increases, and decreases back to zero, and has a finite duration. The
wavelet transform (WT) replaces the sine and cosine functions of the Fourier transform
(FT) by translations and dilations of a wavelet. It is basically a mathematical
technique in which a particular signal is analyzed in the time domain using
different translated and dilated versions of a basis function called a mother wavelet.
WT is suitable for analyzing irregular data patterns, such as non-stationary signals,
and it provides well-defined frequency and time resolution for both low and high
frequencies.
Figure 3.2: IMFs plus residue (Sub-fig. 3.2a) obtained from the synthetic signal
presented in Sub-fig. 3.2b, as well as the reconstructed signal using all the IMFs
(Sub-fig. 3.2c) and three IMFs selected using the Minkowski distance plus the
residue (Sub-fig. 3.2d). Panels: (a) IMFs and residue (res.) extracted from the
original signal using EMD; (b) original signal; (c) reconstructed signal using
all IMFs plus residue; (d) reconstructed signal using IMFs 1, 2, 7 and res.
There are two important parameters used in the transformation: scaling and
shifting. A stretched wavelet, which is produced with large scale factors, helps
to capture the slowly varying changes (low frequencies), whereas a compressed
wavelet, produced with small scale factors, helps to capture the abrupt changes
(high frequencies). The wavelet has to be shifted to align with the desired feature.
Shifting a wavelet means delaying or advancing the onset of the wavelet along
the signal. In general, the scaled and shifted wavelet used by WT is given in Eq. 3.4.
$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}} \, \psi\left( \frac{t - b}{a} \right) \qquad (3.4)$$
where $a$ and $b$ are the scaling and shifting parameters, respectively, and
$\psi$ is the mother wavelet. For a given scaling parameter $a$, the wavelet is
translated by varying the parameter $b$.
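The role of the factor $1/\sqrt{|a|}$ in Eq. 3.4 can be checked numerically: it keeps the energy of the wavelet the same for every scale and shift. The sketch below assumes a Mexican-hat (Ricker) function as the mother wavelet purely for illustration; the integration limits and step are also assumptions.

```python
import math

def mexican_hat(t):
    """Mother wavelet assumed for this example: the Mexican-hat (Ricker) function."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def scaled_shifted(t, a, b, mother=mexican_hat):
    """Eq. 3.4: psi_{a,b}(t) = |a|^(-1/2) * psi((t - b) / a)."""
    return mother((t - b) / a) / math.sqrt(abs(a))

def energy(a, b, lo=-50.0, hi=50.0, n=20000):
    """Riemann-sum approximation of the integral of psi_{a,b}(t)^2 over [lo, hi]."""
    dt = (hi - lo) / n
    return sum(scaled_shifted(lo + i * dt, a, b) ** 2 for i in range(n)) * dt

energy_mother = energy(a=1.0, b=0.0)       # unscaled, unshifted wavelet
energy_stretched = energy(a=4.0, b=3.0)    # stretched by 4 and delayed by 3
```

Stretching the wavelet by a factor of 4 makes it four times wider and half as tall, so both versions carry the same energy; only the time span and the frequency content they respond to change.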
Selecting an appropriate mother wavelet is crucial for analyzing the signals, as
it will affect the outcome, and various wavelets applied to the signal may produce
different results. It is common to select a mother wavelet that is similar in shape
to the original raw signal, but it can be selected experimentally.
DWT provides a time-frequency representation of a signal and decomposes a
signal in the time domain into shifted and scaled versions of a mother wavelet.
DWT provides sufficient information about the original signal with a significant
reduction in computation time by passing the signal through a series of low-pass
and high-pass filter pairs. The DWT is presented in Eq. 3.5.
$$DWT_{j,k} = \int_{-\infty}^{\infty} x(t) \, \frac{1}{\sqrt{|2^j|}} \, \psi\left( \frac{t - 2^j k}{2^j} \right) dt \qquad (3.5)$$
where $j$ and $k$ are the scaling and shifting parameters, respectively,
$\psi$ is the mother wavelet, and
$2^j$ and $2^j k$ replace $a$ and $b$ from Eq. 3.4, respectively.
Additionally, it is necessary to pre-define two parameters, the decomposition
level and the mother wavelet. The first filter pair outputs the level 1 high-frequency
part, called detail coefficients (D1), and the level 1 low-frequency part, called
approximation coefficients (A1). Subsequently, the low-pass portion is fed into
a new set of filters and the process is repeated until the signal is decomposed
to the pre-defined level. Briefly, the wavelet decomposition of a signal $x(t)$ at
decomposition level $j$ has the structure $[A_j, D_j, D_{j-1}, \ldots, D_1]$. It should be noted
that at every level, half of the samples can be removed according to the Nyquist
theorem [130].
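The filter-bank structure described above can be sketched with the Haar wavelet, which is used here only because its low-pass and high-pass filters have two taps each; it is a stand-in and not the biorthogonal 1.3 wavelet used in the thesis. The function names and the test signal are assumptions made for this example.

```python
import math

def haar_step(x):
    """One DWT level with the Haar wavelet: filter, then keep every second sample."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]   # low-pass
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]   # high-pass
    return approx, detail

def haar_dwt(x, level):
    """Return [A_j, D_j, D_{j-1}, ..., D_1] for j = level."""
    details = []
    approx = list(x)
    for _ in range(level):
        # Only the low-pass branch is decomposed again at the next level.
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

# Two-second noise-free trial at 512 Hz (1024 samples), four levels.
fs = 512
signal = [math.sin(3 * math.pi * i / fs) + math.sin(math.pi * i / fs)
          for i in range(2 * fs)]
coeffs = haar_dwt(signal, level=4)
lengths = [len(c) for c in coeffs]          # [A4, D4, D3, D2, D1]

energy_in = sum(v * v for v in signal)
energy_out = sum(v * v for band in coeffs for v in band)
```

Each level halves the number of samples in the branch that is decomposed further, so the sub-band lengths follow the $[A_j, D_j, \ldots, D_1]$ structure. With longer filters such as biorthogonal 1.3, library implementations typically pad the signal at the borders, so the exact lengths can differ slightly from this Haar case.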
Fig. 3.3 presents an example using a synthetic signal generated by
$x(t) = \sin(3\pi t) + \sin(\pi t) + white\_noise$, using four levels of decomposition and the
mother wavelet biorthogonal 1.3. As in the example presented in Section 3.2.1, the
signal is a trial of two seconds with a sample rate of 512 Hz.
3.3 Data features
A feature is an individual measurable property or characteristic of a phenomenon
being observed. Features can be mainly divided into two types, fundamental
and complex. Fundamental features, also known as time-domain features, are
explicitly present in the acquired data and can be directly used, e.g., mean, median,
variance, standard deviation, amplitude, kurtosis, skew, etc. Complex features are
generated by manipulation or transformation of the data (using
methods such as EMD or DWT); after the data have been transformed,
it is necessary to extract certain relevant patterns, which also
helps in dimensionality reduction. Choosing informative, discriminating, and
independent features is a crucial step for effective training of algorithms in pattern
recognition, classification, and regression. Below, a set of energy and fractal
features relevant to this thesis is introduced.
3.3.1 Energy distribution
The energy $E_s$ of a discrete signal $x(n)$ is defined as the area under the squared
magnitude of the signal, and is calculated as in Eq. 3.6.
$$E_s = \langle x(n), x(n) \rangle = \sum_{n=-\infty}^{\infty} |x(n)|^2 \qquad (3.6)$$
Figure 3.3: Details and approximation coefficients extracted from the original
signal using DWT with four levels of decomposition and the mother wavelet
biorthogonal 1.3.
There are several approaches for computing the energy distribution, which
has been used for feature extraction in various signal processing applications,
including those for audio and EEG signals [131-133]. In EEG, the features that
represent the energy distribution can be computed to reduce the computational
cost and obtain a better representation of the sub-bands obtained by
transformation using EMD or DWT.
Let $w_j(r)$ denote the coefficient of one of the sub-bands (level of
decomposition or IMF) at position $r$, with $N_j$ as the length of the sub-band.
The instantaneous energy gives the energy distribution in log base 10 of a time
series [133], and can be computed as in Eq. 3.7:

$$f_j = \log_{10}\left(\frac{1}{N_j}\sum_{r=1}^{N_j} \left(w_j(r)\right)^2\right) \tag{3.7}$$
The Teager energy is a robust parameter, as it attenuates auditory noise
[131-133]. This log base 10 energy operator reflects variations in both the
amplitude and frequency of the signal, and is computed as in Eq. 3.8:

$$f_j = \log_{10}\left(\frac{1}{N_j}\sum_{r=1}^{N_j-1} \left(w_j(r)\right)^2 - w_j(r-1)\, w_j(r+1)\right) \tag{3.8}$$
There are more approaches for computing different values of energy features,
but these two parameters have proven to be robust for representing the sub-bands
of EEG signals [87, 132-135].
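Eqs. 3.7 and 3.8 can be sketched for a single sub-band as follows. The function names are hypothetical. Two assumptions are made for the Teager energy: the sum runs only over interior samples, where both neighbours $w_j(r-1)$ and $w_j(r+1)$ exist, and the absolute value of the sum is taken so that the logarithm stays defined. Neither detail is stated in the equations above.

```python
import math


def instantaneous_energy(w):
    """Eq. 3.7: log10 of the mean squared coefficient of one sub-band."""
    return math.log10(sum(v * v for v in w) / len(w))


def teager_energy(w):
    """Eq. 3.8: log10 of the mean Teager operator of one sub-band.

    Assumptions: only interior samples are used, and abs() keeps the
    argument of the logarithm non-negative. A sum of exactly zero still
    raises ValueError, as log10(0) is undefined.
    """
    total = sum(w[r] ** 2 - w[r - 1] * w[r + 1] for r in range(1, len(w) - 1))
    return math.log10(abs(total) / len(w))


if __name__ == "__main__":
    fs = 512
    # Hypothetical sub-band: a 10 Hz sinusoid sampled for one second.
    band = [math.sin(2 * math.pi * 10 * n / fs) for n in range(fs)]
    print(instantaneous_energy(band), teager_energy(band))
```

Each sub-band (IMF or decomposition level) yields one value per feature, so a channel decomposed into four sub-bands contributes four instantaneous and four Teager energies.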
Fig. 3.4 presents the average value and standard deviation of the Teager
and instantaneous energy distribution of the IMFs from EMD and the levels of
decomposition using DWT from Figs. 3.2 and 3.3.
3.3.2 Fractal dimension
A fractal is an irregular geometric object that exhibits similar patterns at
increasingly small scales, a property called self-similarity. A fractal dimension
is a ratio providing a statistical index of complexity, comparing how details in
a pattern change with the scale at which it is measured. It is used to measure
the roughness of a signal, i.e., a mild or wild randomness, and the complexity of
an EEG signal can be directly evaluated by its fractal dimension [136].
There are several self-similarity features from fractal geometry that are useful
in describing the complexity of an EEG signal and they have been shown to be
highly insensitive to noise [137]. Some have been used to directly characterize
EEG signals from raw data or using various methods to extract the information
[87, 136, 138]. In particular, the Higuchi and Petrosian fractal dimensions have
been used to characterize non-linear and non-stationary data [87, 137-141].
The Higuchi fractal dimension algorithm approximates the mean length of the
curve using segments of $k$ samples and estimates the dimension of a
time-varying signal directly in the time domain [142]. Consider a finite set of
observations taken at a regular interval: $X(1), X(2), X(3), \ldots, X(N)$. From
this series, a new one, $X_k^m$, must be constructed:

$$X_k^m : X(m), X(m+k), X(m+2k), \ldots, X\left(m + \left\lfloor \frac{N-m}{k} \right\rfloor k\right) \tag{3.9}$$

Figure 3.4: Teager and Instantaneous energy distribution of EMD and DWT
sub-bands from Figs. 3.2 and 3.3.
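where $m = 1, 2, \ldots, k$; $m$ indicates the initial time and $k$ the interval
time. Then, the length of the curve associated with each time series $X_k^m$ can
be computed as follows:

$$L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{\left\lfloor \frac{N-m}{k} \right\rfloor} \left|X(m+ik) - X(m+(i-1)k)\right|\right)\frac{N-1}{\left\lfloor \frac{N-m}{k} \right\rfloor k}\right] \tag{3.10}$$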
Higuchi takes the mean length of the curve for each $k$ as the average value of
$L_m(k)$, for $m = 1, 2, \ldots, k$ and $k = 1, 2, \ldots, k_{max}$, which is
calculated as:

$$L(k) = \frac{1}{k}\sum_{m=1}^{k} L_m(k) \tag{3.11}$$
The Higuchi fractal dimension depends only on the free parameter $k_{max}$,
which represents the maximum number of scales to explore in the process of
calculation. In this thesis, it was set at $k_{max} = 10$, but different values
have been used when working with brain signals [143-145].

Figure 3.5: Higuchi and Petrosian fractal dimension of EMD and DWT sub-bands
from Figs. 3.2 and 3.3.
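The steps in Eqs. 3.9 to 3.11 can be sketched as below. The final step, taking the slope of $\log L(k)$ against $\log(1/k)$ as the fractal dimension, is the standard closing step of Higuchi's method and is assumed here, since it is not written out above. The function name is hypothetical.

```python
import math


def higuchi_fd(x, kmax=10):
    """Higuchi fractal dimension of a sequence x (requires kmax >= 2 and a
    non-constant signal, otherwise the logarithm or the fit is undefined)."""
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(1, k + 1):            # m is 1-based, as in Eq. 3.9
            n_seg = (n - m) // k             # floor((N - m) / k)
            if n_seg < 1:
                continue
            dist = sum(
                abs(x[m + i * k - 1] - x[m + (i - 1) * k - 1])
                for i in range(1, n_seg + 1)
            )
            # Eq. 3.10: normalised curve length for this (m, k)
            lengths.append(dist * (n - 1) / (n_seg * k) / k)
        mean_len = sum(lengths) / len(lengths)   # Eq. 3.11
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(mean_len))
    # Least-squares slope of log L(k) versus log(1/k)
    mx = sum(log_inv_k) / len(log_inv_k)
    my = sum(log_len) / len(log_len)
    num = sum((a - mx) * (b - my) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mx) ** 2 for a in log_inv_k)
    return num / den


if __name__ == "__main__":
    print(higuchi_fd(list(range(100)), kmax=10))  # a straight line gives about 1
```

A straight line is a useful sanity check because $L(k)$ is then exactly proportional to $1/k$, so the estimated dimension is 1.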
The Petrosian fractal dimension can be used to provide a rapid computation
of the fractal dimension of a signal by translating the series into a binary
sequence [146].

$$FD_{Petrosian} = \frac{\log_{10} n}{\log_{10} n + \log_{10}\left(\frac{n}{n + 0.4N}\right)} \tag{3.12}$$

where $n$ is the length of the sequence and $N$ is the number of sign changes in
the binary sequence.
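A minimal sketch of Eq. 3.12 follows. The text does not specify how the binary sequence is built; this sketch assumes one common choice, the sign of the first difference of the signal (1 when the signal rises, 0 otherwise). The function name is hypothetical.

```python
import math


def petrosian_fd(x):
    """Petrosian fractal dimension (Eq. 3.12).

    Assumption: the binary sequence is 1 where the signal increases between
    consecutive samples and 0 elsewhere; N counts the changes in it.
    """
    n = len(x)
    binary = [1 if b - a > 0 else 0 for a, b in zip(x, x[1:])]
    changes = sum(1 for a, b in zip(binary, binary[1:]) if a != b)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * changes)))


if __name__ == "__main__":
    print(petrosian_fd([1, 2, 3, 4, 5]))   # monotonic signal, no sign changes
    print(petrosian_fd([0, 1, 0, 1, 0]))   # alternating signal
```

A monotonic signal produces no sign changes, so the second logarithm vanishes and the dimension is 1; more frequent changes push the value above 1.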
Fig. 3.5 presents the Higuchi and Petrosian fractal dimension of the IMFs
from EMD and the levels of decomposition using DWT from Figs. 3.2 and 3.3. It
presents the average value and the standard deviation of the fractal dimension
values from all the IMFs or levels of decomposition. Using this process, a visual
comparison between the fractal features of EEG signals from different classes is
easy to interpret, as presented in [141]. However, for the interest of this
thesis, this process will be accomplished using machine learning algorithms, as
explained later.
3.4 Computational intelligence methods for classification
Machine learning is a well-known research area defined as computational methods
using experience to improve performance or to make accurate predictions.
Supervised learning is the task of learning or inferring a function from a set
of labeled training examples [147].
Deep learning algorithms have been shown to be successful in image
processing and other fields, but have not shown convincing or consistent
improvement when using EEG data over the most advanced current methods.
In addition, their performance depends on the use of a large number of instances,
something that is not common when using EEG data [148-151]. Below, a set of
methods that have been shown to be effective with little training data is
described [148, 152-155].
3.4.1 Multi-class classification
Machine learning gives computers the ability to learn from experience by using
supervised or unsupervised learning [156]. Using machine learning, it is possible
to train models for predicting the labels or classes of new inputs. Considering
$X$ as the sample space and $Y$ as the target space, the goal is to construct a
function that predicts $Y$ from $X$. There are several approaches using
supervised learning of interest for this thesis, which are described below:
Support Vector Machine or SVM: This approach uses hyperplanes to
separate classes of data by maximizing the margins, which are the distances
between the nearest training points from different classes. The hyperplane
is defined by vectors called support vectors. SVM has the advantage
of transforming nonlinear data to a higher-dimensional space for easier
separation using the kernel trick and is therefore flexible in representing
complex functions while providing a global solution. There is a linear kernel
and there are nonlinear kernels, such as the radial basis function (RBF),
sigmoid, and polynomial. The classification complexity does not depend on
the dimensionality of the feature space and the sensitivity to the number
of features is relatively low [157], as the necessary time to create a model
is $O(N^3)$, where $N$ is the length of the feature vector, and $O(1) + O(N)$ is
required to predict the class of a new instance using the created model [158].
k-nearest neighbors (KNN): This algorithm does not attempt to construct
a general internal model. Instead, it stores instances of the training data,
so no learning is required. The $k$ data points most similar to a new data
point from the training dataset are localized [159, 160]. A prediction is
then obtained by majority voting applied over the $k$-nearest data points.
The learning is based on the k-nearest neighbors, where $k$ is an integer
value that must be specified and the optimal choice of the $k$ value is highly
data-dependent. A large $k$ suppresses the effect of noise but makes the
classification boundaries less distinct [161].
Random Forest (RF): This is an ensemble learning algorithm, meaning
it generates classifiers and aggregates their results. It consists of several
decision trees (DT), each giving a prediction, and the class with the most votes
becomes the model's prediction. Each node is split using the best subset of
predictors randomly chosen at that node. RF has been shown to outperform
SVM and KNN and is robust against over-fitting [162]. Two parameters
must be defined for RF, the number of trees in the forest and the number of
variables in the random subset at each node, but it is not very sensitive to
such values [163].
Naive Bayes (NB): This is a probabilistic classifier based on Bayes' Theorem.
The simple form of the calculation for Bayes' Theorem is as follows:

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \tag{3.13}$$

where $P(A|B)$ is the probability of interest. Bayes' Theorem assumes that each
input variable depends on all other variables, which causes complexity in the
calculation. Removing the assumption of dependency and considering each
input variable to be independent from each other simplifies the calculation.
An advantage of NB is fast computing when making decisions and it does
not require large amounts of data before learning can begin [164].
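Of the classifiers listed above, the KNN decision rule is the simplest to write out. The following is a minimal sketch, assuming Euclidean distance as the similarity measure and plain majority voting. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the class of `query` by majority vote over its k nearest
    training points (Euclidean distance is an assumption of this sketch)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy two-feature dataset with two well-separated classes.
    features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["A", "A", "A", "B", "B", "B"]
    print(knn_predict(features, labels, (0.5, 0.5), k=3))
    print(knn_predict(features, labels, (5.5, 5.5), k=3))
```

No model is fitted: all the work happens at prediction time, which is why the stored training set and the choice of $k$ fully determine the result.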
3.4.2 One-class classification
A one-class classification (OCC) algorithm consists of identifying objects of a
specific class among all objects by learning from a training set that contains
only the objects of the target class. This task can be more challenging than a
multi-class classification problem, as it is assumed that information for only
one of the classes is available, and the boundary between normal and abnormal
data has to be estimated solely from normal data in such a way that as many
target objects as possible are accepted while minimizing the possibility of
accepting outliers [165].
3.4.2.1 One-class Support Vector Machine
In SVM, the input data is represented in an $N$-dimensional space, where $N$ is
the number of features. The algorithm seeks to find a decision boundary, or
hyperplane, that can separate the data points into classes. The training points
closest to the decision boundary are called support vectors. The algorithm
searches for the decision boundary with the maximum margin, that is, the boundary
that maximizes the distance to these support vectors. In one-class SVM (OCSVM),
which is an unsupervised algorithm, this translates to identifying the smallest
hypersphere (with radius $r$ and center $c$) that contains all data points
belonging to the class. The model infers the properties of the training set, and
from these properties it can predict which trials from a test set are different
from the training set.
OCSVM learns a decision function for outlier detection, classifying new data
as similar to or different from that of the training set. As in SVM, different
kernels can be used and certain important parameters require fitting, including
the nu and gamma parameters. The nu parameter is an upper bound on the fraction
of training errors and a lower bound on the fraction of support vectors, and it
should be in the interval (0, 1]. Gamma defines how much influence a single
training example has: the larger the gamma, the closer other examples must be to
be affected. Gamma must be greater than 0; normally it is 1/no_features.
A grid search can be used to adjust the parameters by cross-validation, which
has been shown to be powerful and able to significantly improve the results.
However, it is a very slow process [166]. These parameters differ depending on
the size of the feature vector and it is necessary to re-compute them each time.
To illustrate this point, Fig. 3.6 presents an example of two different decision
boundaries in OCSVM obtained by using different nu and gamma parameters
with a random dataset of 100 trials for training (two features per trial), 30 new
regular trials, and 30 new abnormal trials. The results obtained clearly show that
OCSVM can be sensitive to these values and they must be fitted correctly to obtain
generalized results. They also show that the learned frontier better fits the
training set when the recommended gamma parameter (1/no_features) is used.

Figure 3.6: Example of two different decision boundaries in OCSVM and a random
dataset with outliers.
3.4.2.2 Local Outlier Factor
Local Outlier Factor (LOF) is a density-based unsupervised outlier detection
algorithm that defines the degree of being an outlier by calculating the local
deviation of a given data point with respect to its surrounding neighborhood.
The score assigned to each data point is called the local outlier factor [167].
It is based on a concept of local density given by the distance to the k-nearest
neighbors. Comparing the local density of a data point with the local densities of
its $k$ neighbors, it is possible to identify regions with similar density and
outliers, which have lower density: the lower the density of a data point, the
more likely it is to be identified as an outlier. A small $k$ has a more local
focus, and a large $k$ can miss local outliers. Brute force, ball tree, or k-d
tree algorithms can be used to compute the nearest neighbors.
The k-distance is the distance of a point to its $k$th neighbor and the
reachability distance is the maximum of the distance between two points (i.e.,
$\mathrm{distance}(a, b)$) and the k-distance of the second point (i.e.,
$\mathrm{k\_distance}(b)$), as presented in Eq. 3.14.

$$\mathrm{reach\_dist}_k(a, b) = \max\{\mathrm{k\_distance}(b),\ \mathrm{distance}(a, b)\} \tag{3.14}$$
The reachability distance of $a$ to all its $k$ nearest neighbors has to be
calculated and then averaged. Thus, the local reachability density (LRD) can be
calculated, which is the inverse of the obtained average, as presented in
Eq. 3.15. The LRD indicates the distance that must be traveled from a point to
reach the next point (or cluster of points): the lower it is, the less dense the
region, and the longer the distance.

$$LRD_k(a) = 1 \Big/ \left(\frac{\sum_{b \in N_k(a)} \mathrm{reach\_dist}_k(a, b)}{|N_k(a)|}\right) \tag{3.15}$$

Figure 3.7: Example of two different decision boundaries using LOF and a random
dataset with outliers.
The LRD of each point is then compared to the LRD of its $k$ neighbors. The
LOF is the average ratio of the LRDs of the $k$ neighbors of $a$ to the LRD of
$a$, as shown in Eq. 3.16.

$$LOF_k(a) := \frac{\sum_{b \in N_k(a)} \frac{LRD_k(b)}{LRD_k(a)}}{|N_k(a)|} \tag{3.16}$$

A ratio $< 1$ indicates a denser region, which means that the point is an
inlier, whereas a ratio $> 1$ indicates that the point is an outlier. Fig. 3.7
presents an example of two different decision boundaries of the LOF obtained by
using different algorithms and numbers of neighbors with a random dataset of 100
trials for training (two features per trial), 30 new regular trials, and 30 new
abnormal trials.
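Eqs. 3.14 to 3.16 can be followed step by step in a small brute-force sketch. It assumes Euclidean distance, takes exactly $k$ neighbors per point (ties are broken by index order), and assumes no duplicate points, since those would give a zero reachability distance. The function name and the toy data are hypothetical.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, computed by brute force."""
    n = len(points)
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        neighbors.append(order[:k])                                # N_k(i)
        k_dist.append(math.dist(points[i], points[order[k - 1]]))  # k-distance

    def reach_dist(a, b):
        # Eq. 3.14
        return max(k_dist[b], math.dist(points[a], points[b]))

    # Eq. 3.15: inverse of the mean reachability distance to the neighbors
    lrd = [
        1.0 / (sum(reach_dist(a, b) for b in neighbors[a]) / k)
        for a in range(n)
    ]
    # Eq. 3.16: mean ratio of the neighbors' LRD to the point's own LRD
    return [sum(lrd[b] / lrd[a] for b in neighbors[a]) / k for a in range(n)]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 3))
```

In the toy data, the five clustered points receive scores close to 1, whereas the isolated point at (10, 10) receives a score far above 1.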
3.4.3 Evaluation of classifier performance
Evaluating a classifier's performance, which is performed during the learning
process, provides information about how good or bad the followed method is,
makes it possible to compare the results with other proposals, and indicates how
well the results generalize [168]. There are several parameters that can be
calculated, depending on the approaches followed, i.e., some for multi-class
classification and others for one-class classification approaches. Relevant
metrics for the validation of the proposals are presented below.
3.4.3.1 K-fold cross-validation
This method splits a dataset into $k$ folds. One is then used as the test set
and the rest as the training set. The number of trials per class must be the same
or similar in each fold. The model is trained using the training set and scored
using the test set. Then, the process is repeated until each unique group has
been used as the test set. Thus, every data point is used $k-1$ times as part of
the training set and one time as part of the test set. Through cross-validation,
an unbiased evaluation of the model can be obtained without reducing the training
dataset.
The choice of $k$ is usually 5 or 10, but the bias is smaller for $k = 10$ than
for $k = 5$. However, there is no general rule. As $k$ gets larger, the
difference in size between the training set and the re-sampling subsets gets
smaller. The most common value used for cross-validation is $k = 10$ [168, 169].
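The fold construction described above, with a similar number of trials per class in each fold, can be sketched as follows. The function name is hypothetical, and the sketch assigns trials to folds in their original order; in practice the trials would normally be shuffled first.

```python
from collections import defaultdict


def stratified_kfold(labels, k):
    """Return k (train_indices, test_indices) pairs with the classes spread
    evenly over the folds (no shuffling, which is a simplification)."""
    folds = [[] for _ in range(k)]
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Deal the trials of each class round-robin over the folds.
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    splits = []
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    y = ["seizure"] * 10 + ["seizure-free"] * 10
    for train, test in stratified_kfold(y, k=5):
        print(len(train), len(test), [y[i] for i in test])
```

Each trial appears in exactly one test fold and in $k-1$ training sets, which matches the counting argument given above.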
3.4.3.2 Evaluation metrics
For evaluation and analysis of the results, a confusion matrix is generally used,
which in a multi-class problem is an $m \times m$ matrix, where $m$ is the number
of classes in the dataset. The columns in the matrix are the true classes and the
rows the predicted classes.
For example, in a two-class classification problem with classes $A$ and $B$, the
following are obtained: 1) true positives (TP), cases in which the classifier
correctly predicted instances from $A$, 2) true negatives (TN), cases in which
the classifier correctly predicted instances from $B$, 3) false positives (FP),
cases in which the classifier erroneously predicted instances from $B$ as $A$,
and 4) false negatives (FN), cases in which the classifier erroneously predicted
instances from $A$ as $B$. With such a confusion matrix, the accuracy,
specificity, and sensitivity can be computed, as presented in Eqs. 3.17, 3.18,
and 3.19.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3.17}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{3.18}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{3.19}$$
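Eqs. 3.17 to 3.19 translate directly into code. The sketch below uses a hypothetical function name and made-up counts purely as a worked example.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, specificity, and sensitivity from the four confusion-matrix
    counts of a two-class problem (Eqs. 3.17 to 3.19)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    return accuracy, specificity, sensitivity


if __name__ == "__main__":
    # Made-up counts: 50 instances of class A and 50 of class B.
    acc, spec, sens = binary_metrics(tp=40, tn=45, fp=5, fn=10)
    print(f"accuracy={acc:.2f} specificity={spec:.2f} sensitivity={sens:.2f}")
```

With these counts, 85 of 100 instances are correct (accuracy 0.85), 45 of 50 instances of $B$ are recognized (specificity 0.90), and 40 of 50 instances of $A$ are recognized (sensitivity 0.80).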
An important aspect to consider when evaluating the models is to verify
whether the models are over-fitted or under-fitted. A high variance error is
obtained when the error on the training set is low but the error when validating
the model with the test set is high. This indicates that the model is over-fitted
and that it has been too highly adjusted to the training set, adopting its
variability. A solution to avoid over-fitting may be to add more training data or
adjust the classifier parameters. Another problem is called bias error, which is
when the error of the model with both the training set and the test set is high,
indicating that the model is not able to adjust to the dataset or is
under-fitted. Depending on the nature of the dataset and the classifier, this
problem can be avoided by considering longer training times, lower learning
rates, more layers, etc. [170].
For one-class problems, there are several metrics that can be computed.
Particularly for biometric systems, the true acceptance rate, or TAR, and true
rejection rate, or TRR, are important and among the most widely used metrics
for evaluating models. The TAR is the percentage of times the system correctly
verifies a true claim of identity and the TRR the percentage of times it
correctly rejects the subjects that are not in the system.
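The two rates can be computed from the accept/reject decisions of a verification system, as in the following sketch. The function name and the decision lists are hypothetical; `True` stands for "claim accepted".

```python
def tar_trr(genuine_decisions, impostor_decisions):
    """True acceptance rate and true rejection rate, both in percent.

    genuine_decisions: decisions for claims made by enrolled subjects.
    impostor_decisions: decisions for claims made by subjects not in the system.
    """
    tar = 100.0 * sum(genuine_decisions) / len(genuine_decisions)
    trr = 100.0 * sum(not d for d in impostor_decisions) / len(impostor_decisions)
    return tar, trr


if __name__ == "__main__":
    genuine = [True] * 9 + [False]        # 9 of 10 true claims accepted
    impostor = [False] * 18 + [True] * 2  # 18 of 20 false claims rejected
    print(tar_trr(genuine, impostor))
```

Both rates are 90% in this made-up example: one genuine claim is wrongly rejected and two impostor claims are wrongly accepted.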
3.5 Channel reduction and selection
While a laboratory setting and research-grade EEG equipment ensure a
controlled environment and high-quality multiple-channel EEG recording, there
are applications, situations, and populations for which this is not suitable.
Conventional EEG is challenged by a high computational cost, high density,
immobility of the equipment, and the use of inconvenient conductive gels.
The main objectives for channel reduction and selection are to 1) reduce the
computational cost for EEG signal processing, 2) reduce the over-fitting that
can occur due to the use of unnecessary channels and improve the classification
accuracy, since a large number of channels can contain redundant or useless
information, 3) identify the brain areas that generate task-dependent activity,
and 4) reduce preparation time. All of these objectives can be achieved by
selecting the most relevant channels and removing task-irrelevant and redundant
channels, thus extracting the most relevant features [171, 172].
An important point is that selection of a low number of channels can result
in a low-power hardware design. This would allow expansion of the range of
applications of EEG signals from clinical diagnosis and research to healthcare, a
better understanding of cognitive processes, learning and education, and currently
hidden/unknown properties behind ordinary human activity and ailments (e.g.,
resting-state, walking, sleeping, complex cognitive activity, chronic pain,
insomnia, etc.) [173].
Various channel reduction and selection methods have been tested for
extracting channel subsets, ranging from algorithms such as filter, wrapper,
embedded, and hybrid methods [171, 172, 174-189] to the use of genetic
algorithms, such as the simple GA, steady-state genetic algorithm, genetic neural
mathematics method (GNMM), artificial bee colony (ABC) algorithm, and NSGA-based
algorithms [87, 138, 190-201]. These methods have been generally tested
in motor imagery, but a unique set of channels for this task has not been found
[172, 174, 176, 179, 188, 196, 198, 199].
In a low-density device, the channel selection approach could be used
to modify the channels' positions, or at least to activate the relevant sensors
in real time and thus increase classification accuracy and reduce processing
time. Two greedy algorithms and one multi-objective optimization algorithm of
interest for this thesis are presented next.
3.5.1 Greedy algorithms
A greedy algorithm makes the locally optimal decision at each stage (local
optimum or local maximum) and generally does not produce an optimal solution, but
this strategy approximates a globally optimal solution in a short period of time
[202]. An easy and rapid way to evaluate the most relevant parameters or features
for obtaining the best results in a problem is the use of greedy algorithms
[202]. The idea of using greedy algorithms for channel selection is to obtain all
combinations that result from removing one channel at a time and to select the
subset with the best results, which represents the local maximum. The procedure
is then repeated using the obtained subset while the length of the subset is
still greater than one channel.
The same process can be applied in the opposite direction, first selecting the
single channel with the best results. The process is then repeated, trying to add
another channel and selecting the subset of two channels with the best results.
The process is repeated, adding additional channels until all the channels have
been added to the subset. This method provides a general idea of the channels
with the most useful information for the classifiers.
These methods are known in combinatorial optimization and artificial
intelligence as backward-elimination and forward-addition algorithms and have
been used in feature subset and channel selection [173, 203-206]. Both methods
provide a locally optimal solution at each step, but neither is able to predict
complex interactions between channels or features that may affect the performance
of the classifier, which is why they are not considered to be a global solution.
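The backward-elimination procedure described above can be sketched as follows. The `score` argument is a placeholder: in this thesis it would be the performance of a classifier trained on the given channel subset, whereas the toy scoring function below is entirely hypothetical and only rewards two "informative" channels while slightly penalizing subset size.

```python
def backward_elimination(channels, score):
    """Greedy backward elimination: at each step, drop the single channel
    whose removal gives the best-scoring subset (the local maximum).

    Returns a dict mapping subset size to (subset, score).
    """
    current = list(channels)
    history = {len(current): (current, score(current))}
    while len(current) > 1:
        candidates = [current[:i] + current[i + 1:] for i in range(len(current))]
        current = max(candidates, key=score)  # ties: first candidate wins
        history[len(current)] = (current, score(current))
    return history


def toy_score(subset):
    """Hypothetical stand-in for classifier accuracy: channels 1 and 3 carry
    all the information, and every extra channel costs a little."""
    return len(set(subset) & {1, 3}) - 0.01 * len(subset)


if __name__ == "__main__":
    for size, (subset, value) in sorted(backward_elimination(range(5), toy_score).items()):
        print(size, subset, round(value, 2))
```

Forward addition follows the same pattern in reverse: start from the best single channel and, at each step, keep the added channel that gives the best-scoring subset.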
3.5.2 Multi-objective optimization methods
An optimization problem consists of maximizing or minimizing a function by
systematically choosing input values from a valid set and computing the value of
the function, which can be limited to one or more restrictions, or it can be
without any restriction. In an optimization problem, the model is feasible if it
satisfies all the restrictions and it is optimal if it also produces the best
value (minimum or maximum) for the objective function.
A multi-objective optimization problem (MOOP) has two or more objective
functions that are to be either minimized or maximized. As in a single-objective
optimization problem, a MOOP may contain a set of constraints, which any feasible
solution must satisfy [207]. Eq. 3.20 presents a MOOP in its general form.
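$$\begin{aligned}
\text{Minimize/Maximize} \quad & f_m(x), & m &= 1, 2, \ldots, M \\
\text{subject to} \quad & g_j(x) \geq 0, & j &= 1, 2, \ldots, J \\
& h_k(x) = 0, & k &= 1, 2, \ldots, K \\
& x_i^{(L)} \leq x_i \leq x_i^{(U)}, & i &= 1, 2, \ldots, n
\end{aligned} \tag{3.20}$$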
As a result of the optimization process, a set of solutions is obtained, where
a solution $x \in \mathbb{R}^n$ is a vector with $n$ decision variables,
$x = [x_1, x_2, \ldots, x_n]$. The objective functions constitute a
multi-dimensional space called the objective space, or
$Z \subset \mathbb{R}^M$. For each solution $x$ in the decision variable space,
there is a point $z \in \mathbb{R}^M$ in the objective space, denoted by
$f(x) = z = [z_1, z_2, \ldots, z_M]$.
3.5.2.1 Non-dominated sorting genetic algorithms (NSGA)
Genetic algorithms (GAs) mimic Darwinian evolution and use biologically inspired
operators. Their population is comprised of a set of candidate solutions, each with
chromosomes that can be mutated and altered. GAs are normally used to solve
complex optimization and search problems [208].
GAs normally consist of 1) population initialization, 2) fitness function
calculation, 3) crossover, 4) mutation, 5) survivor selection, and 6) termination
criteria to return the best solutions. The population consists of a set of
chromosomes that are possible solutions to the problem and each chromosome
can have as many genes as variables in the problem. There are various proposed
methods in the state of the art for each stage [208-211].
For the genetic representation of the solution domain, it is possible to define
chromosomes using genes with binary values, i.e., 0 or 1, as well as those with
integer or decimal values. For example, if the gamma parameter of OCSVM has to
be optimized, it can be defined as a gene with decimal values in the interval
[0, 1].
The non-dominated sorting genetic algorithm, or NSGA [210], uses a
non-dominated sorting ranking selection method to emphasize good candidates and a
niche method to maintain stable sub-populations of good points (Pareto-front),
where a non-dominated solution is a solution that is not dominated by any other
solution. NSGA-II was proposed to solve certain problems of the first version
related to computational complexity, the non-elitist approach, and the need to
specify a sharing parameter to ensure diversity in a population. NSGA-II also
reduced the computational cost from $O(MN^3)$ to $O(MN^2)$, where $M$ is the
number of objectives and $N$ the population size. Additionally, the elitist
approach was introduced by comparing the current population with the previously
found best non-dominated solutions [211].

Figure 3.8: An illustrative example of the NSGA-II procedure [211].
Fig. 3.8 presents the NSGA-II framework, in which parent and child populations
are compared using the fitness function and organized using the non-dominated
sorting algorithm for creating different fronts, from high to low importance.
Then, the individuals in the first front are selected to be used in the next
generation. There are situations in which a front has to be split (in Fig. 3.8,
front 3) because not all individuals are allowed to survive. In this split front,
solutions are selected based on crowding distance [211].
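The two ingredients named above, sorting into fronts and the crowding distance used when a front has to be split, can be sketched as follows. The sketch assumes that all objectives are minimized and follows the usual domination-count formulation of the fast non-dominated sort. The function names and the example objective values are hypothetical.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly
    better in at least one (all objectives minimized, by assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def fast_non_dominated_sort(objs):
    """Return a list of fronts, each a list of indices into objs."""
    n = len(objs)
    dominated = [[] for _ in range(n)]   # solutions that p dominates
    counts = [0] * n                     # number of solutions dominating p
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if dominates(objs[p], objs[q]):
                dominated[p].append(q)
            elif dominates(objs[q], objs[p]):
                counts[p] += 1
        if counts[p] == 0:
            fronts[0].append(p)
    i = 0
    while fronts[i]:
        nxt = []
        for p in fronts[i]:
            for q in dominated[p]:
                counts[q] -= 1
                if counts[q] == 0:
                    nxt.append(q)
        fronts.append(nxt)
        i += 1
    return fronts[:-1]                   # drop the trailing empty front


def crowding_distance(objs, front):
    """Crowding distance of each index in one front (boundaries get inf)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        span = objs[ordered[-1]][m] - objs[ordered[0]][m]
        if span == 0:
            continue
        for j in range(1, len(ordered) - 1):
            dist[ordered[j]] += (objs[ordered[j + 1]][m] - objs[ordered[j - 1]][m]) / span
    return dist


if __name__ == "__main__":
    # Hypothetical objective pairs, e.g. (classification error, channel count).
    population = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
    fronts = fast_non_dominated_sort(population)
    print(fronts)
    print(crowding_distance(population, fronts[0]))
```

When a front does not fit entirely into the next generation, its members with the largest crowding distance would be kept first, which favours solutions in sparsely populated regions of the front.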
NSGA-III has been shown to efficiently solve 2- to 15-objective optimization
problems [212]. NSGA-III follows the NSGA-II framework but uses a set of
predefined reference points that emphasize population members that are
non-dominated, yet close to the supplied set [212, 213]. The predefined set of
reference points is used to ensure diversity in the obtained solutions. When
using NSGA-III, the reference points are generally placed on a normalized
hyper-plane that is equally inclined to all objective axes and has an
intersection with each. For example, in a three-objective optimization problem,
the reference points are created on a triangle with apexes at $(1, 0, 0)$,
$(0, 1, 0)$, and $(0, 0, 1)$ [213, 214], as shown in Fig. 3.9.

Figure 3.9: Reference points of NSGA-III in a three-objective optimization problem.
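One way to place such points is sketched below. It assumes the simplex-lattice (Das and Dennis) construction that is commonly paired with NSGA-III; the text above does not name the construction, so this choice, the function name, and the number of divisions are assumptions.

```python
def reference_points(n_objectives, divisions):
    """All points on the unit simplex whose coordinates are multiples of
    1/divisions (simplex-lattice construction, assumed here)."""
    points = []

    def recurse(prefix, remaining):
        if len(prefix) == n_objectives - 1:
            points.append(prefix + [remaining])  # last coordinate is fixed
            return
        for i in range(remaining + 1):
            recurse(prefix + [i], remaining - i)

    recurse([], divisions)
    return [tuple(v / divisions for v in p) for p in points]


if __name__ == "__main__":
    pts = reference_points(n_objectives=3, divisions=4)
    print(len(pts))
    for p in pts:
        print(p)
```

For three objectives, every generated point lies on the triangle with apexes (1, 0, 0), (0, 1, 0), and (0, 0, 1), and the apexes themselves are always included.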
3.6 Description of datasets used in the thesis
3.6.1 CHB-MIT dataset
Most of the proposed methods for epileptic seizure classification in the state of
the art are tested on datasets from the PhysioNet [215] and EPILEPSIAE [216]
projects and the TUH EEG Corpus [217], in which some of the datasets consist of
private repositories or to which access is limited.
The EEG recordings used were obtained from pediatric patients with
intractable seizures who were monitored for several days at the Boston Children's
Hospital following the withdrawal of anti-seizure medication to characterize their
seizures and assess their candidacy for surgical intervention. The dataset used
comes from the PhysioNet project and is partially described in [215, 218] and
can be found in the CHB-MIT Scalp EEG Database or doi.org/10.13026/C2K01R.
The dataset consists of bipolar EEG signals from 24 patients that were recorded
using 22 channels (FP1-F7, F7-T7, T7-P7, P7-O1, FP1-F3, F3-C3, C3-P3, P3-O1,
FP2-F4, F4-C4, C4-P4, P4-O2, FP2-F8, F8-T8, P8-O2, FZ-CZ, CZ-PZ, P7-T7, T7-FT9,
FT9-FT10, FT10-T8, and T8-P8), with a sampling rate of 256 Hz, using the 10-20
international system. It should be noted that channels FT9 and FT10 are not part
of the 10-20 international system.

Figure 3.10: Example of the raw EEG data of C3-P3, T7-FT9 and C4-P4 channels
from the third instance of Patient 1 of the CHB-MIT dataset.
The EEG data for each epileptic seizure and seizure-free period is six
seconds long and there is an average of 80 instances for each class for each
patient. More details can be found in [135, 215, 218], and in the CHB-MIT Scalp
EEG Database.
Certain important details are shown in Table 3.1, including the duration (in
seconds) of the EEG signal for each epileptic event. However, six-second segments
of the epileptic seizures are also considered to compare the seizures between
subjects with similar components.
Fig. 3.10 presents the raw EEG signal of an epileptic seizure and 30 seconds
before onset (the onset is indicated by a vertical line in black) of the first
instance of subject 1, showing the EEG data corresponding to the C3-P3, T7-FT9
and C4-P4 channels.
3.6.2 EEGMMIDB dataset
This dataset consists of EEG signals of 109 subjects collected from 64 EEG
channels, localized according to the 10-10 international system, with a sample
rate of 160 Hz and recorded using the BCI2000 system. The public motor
movement/imagery dataset (EEGMMIDB) is part of the PhysioNet project [215].
Each subject performed two one-minute resting-state runs, one with the eyes
open and one with the eyes closed. Then, three two-minute runs were carried
out for four different tasks: two motor movement tasks and two imagery tasks
[219]. The four types of motor movement and imagery tasks were performed for
opening and closing the left or right fist, imagining opening and closing the
left or right fist, opening and closing both fists or both feet, and imagining
opening and closing both fists or both feet according to the position of a target
on the screen (left, right, top, or bottom).

Table 3.1: Details of the epileptic-seizure data presented in [218].

                                   Length in seconds
Patient  Gender  Age   Seizures  Average  Max    Min   Segments of 6 s
1        F       11    7         63.1     101    27    74
2        M       11    3         57.3     82     9     29
3        F       14    7         57.4     69     47    67
4        M       22    4         94.5     116    49    63
5        F       7     5         111.6    120    96    93
6        F       1.5   7         15.6     20     12    18
7        F       14.5  3         108.3    143    86    54
8        M       3.5   5         183.8    264    134   153
9        F       10    4         69.0     79     62    46
10       M       3     7         63.9     89     35    74
11       F       12    3         268.7    752    22    134
12       F       2     38        36.9     97     13    234
13       F       3     12        44.6     70     17    89
14       F       9     8         21.1     41     14    28
15       M       16    20        99.6     205    31    332
16       F       7     6         8.8      14     6     9
17       F       12    3         97.7     115    88    49
18       F       18    6         52.8     68     30    53
19       F       19    3         78.7     81     77    39
20       F       6     8         36.8     49     29    49
21       F       13    4         49.8     81     12    33
22       F       9     3         68.0     74     58    34
23       F       6     10        60.6     113    20    101
24                     13        31.9     70     16    69
Sum                    189                             1925
Mean                   7.9       74.2     121.4  41.3
Max                                       752
Min                                              6
For the experiments carried out in this thesis, only the two one-minute baseline
runs were used to create instances of one second, obtaining 60 instances of one
second in the resting-state with the eyes open and 60 instances of one second in
the resting-state with the eyes closed for each subject.
Fig. 3.11 presents the raw EEG signal of resting-state with the eyes open of
the first instance of subject 1, showing the EEG data corresponding to the F5, T8
and T10 channels.

Figure 3.11: Example of the raw EEG data of F5, T8 and T10 channels of the first
instance of subject 1 of the EEGMMIDB dataset.
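The creation of fixed-length instances from a continuous run can be sketched as follows, assuming non-overlapping windows and discarding any incomplete window at the end of the run. The function name is hypothetical, and the signal is a placeholder list rather than real EEG data.

```python
def segment(signal, fs, seconds):
    """Split one channel into consecutive, non-overlapping instances of
    `seconds` seconds each; a trailing partial window is discarded."""
    size = int(fs * seconds)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


if __name__ == "__main__":
    fs = 160                                   # EEGMMIDB sample rate in Hz
    one_minute_run = [0.0] * (60 * fs)         # placeholder for one channel
    instances = segment(one_minute_run, fs, seconds=1)
    print(len(instances), len(instances[0]))   # 60 instances of 160 samples
```

The same helper covers the six-second instances of the CHB-MIT dataset by calling it with a sample rate of 256 Hz and `seconds=6`.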
3.6.3 P300-speller dataset
This dataset consists of EEG signals from 26 subjects (24 right-handed and 2
left-handed), with an average age of 29.2 ± 5.5 years, from 56 passive Ag/AgCl
EEG electrodes that were placed following the extended 10-20 international
system. The EEG signals were all referenced to the nose, the ground electrode was
placed on the shoulder, and the impedance was kept below 10 kΩ. The EEG data was
collected during five sessions and consists of 60 instances per session, with a
sample rate of 600 Hz, down-sampled to 200 Hz [220].
The protocol used to record the EEG signals followed the P300-speller paradigm
introduced in [220] and illustrated in Fig. 3.12. Briefly, the target letter (the
letter to be presented) is indicated by a green circle for one second. Then, letters
and numbers (6 × 6 items, 36 possible items displayed on a matrix) are flashed
in groups of six characters. Next, the display remains blank for a period of 2.5
to 4 s, representing the resting-state. During this random period, the subjects
are requested to remember the letter displayed. Then, the letter chosen by the
implemented P300 classifier is displayed for 1.3 s. If the displayed letter is the one
that was previously presented as the target, the subject sends a positive response;
otherwise, the subject sends a negative response.
Figure 3.12: Protocol design for recording positive or negative feedback-related
responses in the P300-speller dataset [220].
An example of a positive feedback-related response corresponding to the
target letter i is shown in Fig. 3.12. For the experiments carried out, only the
positive-feedback responses were used. Because the number of positive-feedback
trials can differ between subjects and sessions, the minimum number of
positive-feedback-related responses was selected, which was 25 instances per
session per subject. Fig. 3.13 presents the raw EEG signal of the first instance of
subject 1, showing the EEG data corresponding to the P7, P8 and T8 channels.
3.7 Methods proposed in the thesis
This section describes the general flowchart of the proposal presented in
Fig. 3.1, which may differ depending on the dataset used and the application.
Thus, more details are added for each case in the following Chapters.
3.7.1 Pre-processing, feature extraction and classification
The CAR method was applied to the EEG data, and then the EMD or DWT method
was applied to decompose the EEG signals into different sub-bands. After
decomposing the EEG signals, two energy values (Teager and instantaneous
energy) and two fractal dimension features (Higuchi and Petrosian fractal
dimension) were computed for each sub-band.
Figure 3.13: Example of the raw EEG data of P7, P8 and T8 channels of the first
instance of subject 1 of the P300-speller dataset.
EMD was tested using various numbers of IMFs, but only the two closest IMFs,
selected based on the Minkowski/Euclidean distance, were used because they have
been shown to provide the same performance as that of using more. For DWT, the
bi-orthogonal 2.2 mother wavelet, with four levels of decomposition, was used
based on the results obtained from previous studies [86, 87, 135, 138, 173,
221–223]. Extracting four features for each of the two selected IMFs returns eight
features per channel with EMD, whereas DWT returns 20 features per channel
(four features for each of the five sub-bands produced by four levels of
decomposition). The process is repeated for each channel used, and the results
are concatenated to obtain a single vector of features that represents the EEG
signal for each instance. Figs. 3.14 and 3.15 present the flowchart of the process
followed for DWT and EMD, respectively.
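The feature counts above (eight per channel for EMD, 20 per channel for DWT) can be illustrated with a short sketch. It takes already-decomposed sub-bands as input, so the decomposition step itself (EMD or DWT) is left out. The four feature functions use common textbook forms of the Teager energy, instantaneous energy, Higuchi and Petrosian fractal dimensions; their exact normalizations, the use of log10, the value of `kmax`, and all function names are assumptions here and may differ from the definitions used in the thesis.

```python
import math


def instantaneous_energy(x):
    """log10 of the mean squared amplitude (assumed normalization)."""
    return math.log10(sum(v * v for v in x) / len(x))


def teager_energy(x):
    """log10 of the mean absolute Teager-Kaiser operator (assumed normalization).

    Needs at least three samples and a non-zero operator output.
    """
    psi = [abs(x[t] ** 2 - x[t - 1] * x[t + 1]) for t in range(1, len(x) - 1)]
    return math.log10(sum(psi) / len(x))


def petrosian_fd(x):
    """Petrosian fractal dimension from sign changes of the first difference."""
    n = len(x)
    diff = [b - a for a, b in zip(x, x[1:])]
    n_delta = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * n_delta)))


def higuchi_fd(x, kmax=4):
    """Higuchi fractal dimension: least-squares slope of ln L(k) against ln(1/k)."""
    n = len(x)
    xs, ys = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            steps = (n - 1 - m) // k
            dist = sum(abs(x[m + i * k] - x[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            lengths.append(dist * (n - 1) / (steps * k) / k)
        xs.append(math.log(1.0 / k))
        ys.append(math.log(sum(lengths) / k))
    mx, my = sum(xs) / kmax, sum(ys) / kmax
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den


def channel_features(sub_bands):
    """Four features per sub-band, concatenated for one channel."""
    feats = []
    for band in sub_bands:
        feats += [teager_energy(band), instantaneous_energy(band),
                  higuchi_fd(band), petrosian_fd(band)]
    return feats


def instance_features(channels):
    """Concatenate the per-channel features into one vector for an instance."""
    return [f for sub_bands in channels for f in channel_features(sub_bands)]


# Toy sub-band standing in for a real IMF or wavelet band.
toy_band = [math.sin(0.3 * t) + 0.1 * t for t in range(64)]
emd_channel = [toy_band] * 2  # two selected IMFs
dwt_channel = [toy_band] * 5  # four detail bands plus one approximation band
print(len(channel_features(emd_channel)))         # 8
print(len(channel_features(dwt_channel)))         # 20
print(len(instance_features([dwt_channel] * 3)))  # 60
```

With three channels and DWT, the concatenated vector has 3 × 20 = 60 entries, which is the per-instance feature vector passed on to the classifiers.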
Dierent classiers for creating the machine-learning models were tested
using the obtained feature vectors for each instance, depending on the application
and experiment. In general, the process can be summarized as in Fig. 3.16, in
which the training and testing sets were separated after obtaining the features
from the EEG dataset, whenever possible. The training set was used to create the
machine-learning model using 10-fold cross validation and the model validated
using the testing set, which was 20% of the dataset. Using this approach, the
metrics can be obtained for evaluating the performance of the method in each
experiment, consisting of the accuracy and standard deviation from the 10-fold
3.7. Methods proposed in the thesis 59
Figure 3.14: Flowchart summarizing feature extraction using DWT.
Figure 3.15: Flowchart summarizing the feature extraction procedure using EMD.
Figure 3.16: Flowchart of the procedure followed for EEG signal classication.
cross-validation, as well as the accuracy and standard deviation from the testing
set.
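The data split behind this evaluation protocol can be sketched in a few lines. This is only an illustration of a 20% hold-out followed by 10-fold cross-validation on the remainder: the function names are hypothetical, and the shuffling, the fixed seed, and the absence of class stratification are assumptions, since the section does not specify them.

```python
import random


def holdout_and_folds(n_instances, test_fraction=0.2, n_folds=10, seed=0):
    """Return (test_idx, folds): a held-out test set and CV folds over the rest."""
    idx = list(range(n_instances))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n_instances * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    # Fold i takes every n_folds-th training index, so fold sizes differ by at most one.
    folds = [train_idx[i::n_folds] for i in range(n_folds)]
    return test_idx, folds


def cv_splits(folds):
    """Yield (train, validation) index lists, one pair per fold."""
    for i, val in enumerate(folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, val


# Example with 120 instances (e.g. 60 eyes-open plus 60 eyes-closed segments).
test_idx, folds = holdout_and_folds(120)
print(len(test_idx), [len(f) for f in folds])
```

The mean and standard deviation of the accuracy over the ten validation folds, together with the accuracy on `test_idx`, correspond to the metrics listed in the text.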
3.7.2 General overview of the proposed method
The owchart presented in Fig. 3.16 is for a single iteration of the method, but
the purpose of the proposal is to repeat this process several times to reduce
the number of necessary channels while increasing, or at least maintaining, the
60 Materials and Methods
Figure 3.17: Example of chromosome representation and owchart of the
optimization process for parameter optimization and EEG channel selection using
NSGA-III.
performance. Additionally, it is also necessary to optimize certain parameters for
certain classiers.
Fig. 3.17 presents an example of the process for feature extraction and
classification, in which the entire process is handled by an optimization algorithm.
In the example presented, the process is handled by NSGA-III using a chromosome
representation with 64 genes for the EEG channels (1 if the channel will be used
and 0 if not) and two genes to optimize the parameters of the model (indicated as
P1 and P2), one with integer values (which can be, for example, from 0 to 5) and
the other with decimal values (which can be from 0 to 1).
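As a concrete illustration of this representation, the following sketch decodes a 66-gene chromosome into the selected channel indices and the two parameters. The function name `decode_chromosome`, the rounding of the genes, and the example values are hypothetical and only meant to make the layout explicit.

```python
def decode_chromosome(chromosome, n_channels=64):
    """Split a chromosome into selected channel indices and two parameters.

    Genes 0..n_channels-1 are binary channel flags; the next gene is an
    integer-valued parameter (P1) and the last one a decimal-valued
    parameter (P2).
    """
    if len(chromosome) != n_channels + 2:
        raise ValueError("expected %d genes" % (n_channels + 2))
    selected = [i for i, gene in enumerate(chromosome[:n_channels])
                if int(round(gene)) == 1]
    p1 = int(round(chromosome[n_channels]))  # integer gene, e.g. 0..5
    p2 = float(chromosome[n_channels + 1])   # decimal gene, e.g. 0..1
    return selected, p1, p2


chromosome = [0] * 64 + [3, 0.25]
for ch in (4, 41, 63):
    chromosome[ch] = 1
print(decode_chromosome(chromosome))  # ([4, 41, 63], 3, 0.25)
```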
The parameters of the classier can be tuned using simple methods, such as grid
search [
224
], but they need to be tuned to the model under specic circumstances
and for a specic number of channels. In this case, the best parameters for the
models must be found and this can be accomplished by adding a gene for each
parameter to the chromosomes generated by the genetic algorithms.
In the example, the process starts using the raw EEG signals, from which
feature extraction is performed and the results organized and stored for iterative
use. From this point on, the main process is handled by NSGA-III, which starts
creating all possible candidates (chromosomes) for each population. Then, the first
64 genes are used to extract the sub-dataset for the channels represented as 1 in
the chromosome, and the subset is evaluated with the classifiers using genes 65 and
66 to define the classifier’s parameters. The best results obtained and the number
of EEG channels used are returned to NSGA-III to evaluate each chromosome in
the current population. The process is repeated, creating different populations,
until the termination criterion is reached.
The termination criterion for the optimization process is defined by the
objective space tolerance, which is set to 0.0001. This criterion is calculated
every 5th generation. If the criterion is not met, the process stops after a
maximum number of generations. The definition of the problem to optimize,
the number of objectives, the size of each population in each iteration, and the
maximum number of generations are defined for each experimental configuration
in Chapters 4 and 5.
3.8 Hardware and soware tools used in the thesis
Free public EEG datasets, as well as tools and libraries for creating the code in
Python 3 [225], were used. Implementation of the classifiers was based on the
scikit-learn Python library [226] and the NSGA algorithms on pymoo [227].
Other important Python libraries used included Dask (for task distribution
using parallel computing), SciPy, and NumPy [228–230]. For the implementation
of DWT and EMD, the PyWavelets and pyhht libraries were used, respectively
[231,232].
Most of the experiments in which optimization with NSGA was used were
carried out on the NTNU IDUN computing cluster [233]. The cluster has more
than 70 nodes and 90 GPGPUs. Each node contains two Intel Xeon cores and at
least 128 GB of main memory and is connected to an InfiniBand network. Half
of the nodes are equipped with two or more Nvidia Tesla P100 or V100 GPGPUs.
IDUN storage is provided by two storage arrays and a Lustre parallel distributed
file system.
Chapter 4
Case study 1: Channel count
optimization for Epileptic
seizure classication
In this Chapter, the proposed method for feature extraction is implemented
for representing epileptic seizures and seizure-free periods. Different classification
algorithms are tested and compared using the obtained features. The main objective
of this thesis, which is reduction of the number of required EEG channels, is assessed
by implementing various channel-reduction and selection methods using greedy and
multi-objective optimization algorithms.
This Chapter is based on the journal articles [135, 200] and mainly addresses the
1st and 2nd research questions and partially the 3rd.
4.1 Introduction
Epilepsy is a group of neurological disorders, characterized by recurrent epileptic
seizures, that affects approximately 1% of the world’s population of all ages, both
sexes, and all races and ethnic backgrounds [234]. It consists of widespread
electrical discharges of a set of neurons inside the brain [235]. Epileptic seizures
are normally detected by continuous monitoring of EEG signals; the epileptiform
activity can be categorized into ictal, interictal, and postictal periods. The
identification of seizures by visual inspection can be time-consuming and lead to
an incorrect interpretation of EEG signals, which can trigger under/over
medication of patients [236].
Suitable methods and proper detection of epileptic seizures could facilitate the
rapid treatment of patients and improve the diagnosis of epilepsy. Epileptic events
are attributed to localized disturbances in various areas of the brain [237]. The
epileptogenic focus for approximately 33% of epilepsy patients is located in the
temporal lobe and their condition is referred to as temporal-lobe epilepsy (TLE)
[238,239].
4.2 State-of-the-art
Current state-of-the-art efforts attempt to improve the feature extraction stage
for correct representation of the seizure and seizure-free periods using
machine-learning methods. Several relevant studies using the same public dataset
have been published, using various experimental setups. The research and
applications for automatic classification and detection of epileptic seizures based
on EEG, using supervised, semi-supervised, and deep-learning techniques, have
increased during the last few years. However, comparisons between experiments,
even using the same datasets, have shown conflicting results.
In one study [240], the authors used iEEG signals from only five subjects, with
only 20 epileptic seizures for each. Thus, they had data for only 100 epileptic
seizures and EEG signals from the epileptogenic zone during free intervals as
seizure-free periods. They reported an accuracy of 99.6% from only one channel
using a neural network. However, this approach is known to work better when
using a large amount of data during the training process, as neural networks learn
only by weight adjustment and require all the possibilities to be adequately trained.
In another study, the authors used the same dataset and performed five levels of
DWT and fuzzy approximate entropy for feature extraction [241].
The study presented in [242] used relative energy values and normalized
variation coefficients from DWT in the feature extraction stage and then linear
discriminant analysis (LDA) for classification. The method was evaluated on
the data of five subjects of the CHB-MIT dataset, with 23, 24, or 26 channels,
depending on the subject and the available data. In the classification process, they
used approximately 80% of the data for training and the rest for testing, obtaining
an accuracy of 0.91. Later, [243] presented a method for feature extraction with
even features from the intersection sequence of the Poincaré section with phase
space using LDA and naive Bayes classifiers. They used 23 channels from the
CHB-MIT dataset, obtaining accuracies of 0.93 using 25% of the data for training
and 0.94 using 50%.
The signal curve length of the time-domain EEG signal and the mode powers
of dynamic mode decomposition (DMD) were used by [244] for feature extraction
using 18 channels of the CHB-MIT dataset, which were manually selected. They
reported a sensitivity of 0.87 using approximately 50% of the data for training
their models for epileptic-seizure classification.
An approach using EMD to decompose EEG signals into different IMFs and
five features for each chosen IMF was presented in [135]. In the aforementioned
study, the results of an approach based on channel reduction using the
backward-elimination algorithm were presented, obtaining an average
classification accuracy of 0.93 when five channels and 10-fold cross-validation
were used.
The work presented in [245] used a multivariate extension of the empirical
wavelet transform (EWT) to decompose the EEG signal into different oscillatory
levels and compute three features for each level. The accuracies obtained ranged
from 0.95 to 0.99 using five channels and various classifiers. This method selects the
channel with the lowest standard deviation and then the remaining four channels
with the highest mutual information (MI) with the previously chosen channel.
A method based on 24 feature types and SVM classifiers was presented by [246].
The experiments were performed using the 22 available EEG channels of the TUH
EEG Corpus [217] and the accuracy obtained was 0.994.
Several methods have been proposed using various values of entropy for
feature extraction [247], EMD for decomposing the EEG signals [248], features
based on Fourier-Bessel series expansion [249, 250], and the energy from
sub-bands extracted using the Taylor-Fourier filter bank [251]. The proposals used
machine learning classifiers [247–251] and neural networks [252]. However, these
approaches were tested using the Bonn university EEG database, which consists of
a single channel and is based on invasive seizure EEG signals [253].
Based on the previously presented studies, epileptic-seizure classification can
still be improved by representing the seizure and seizure-free periods correctly
to obtain better results using EEG signals. Certain state-of-the-art methods have
been tested on small or single-channel (using iEEG) datasets, showing competitive
accuracies for classifying epileptic seizures; however, the use of EEG signals
has only been assessed in experiments using all available channels or manually
selected channel arrays.
The feature extraction process and classifier design are important for the
classification and detection of epileptic seizures, but the use of only a few EEG
channels (without using iEEG) will provide new areas of research and expand
potential applications in and outside of hospitals and laboratories. This will
require the use of robust EEG channel-selection procedures that will reduce
the current limitations of portability, as well as the computational cost to obtain
faster results, decreasing possible over-fitting that comes from using all available
channels. Recent efforts and improved technology of dry EEG sensors have
opened up new possibilities to develop new types of EEG systems [254, 255].
In this context, future efforts will be focused on low-cost portable devices for
personal use, reducing the necessary number of EEG channels while maintaining
or increasing the accuracy of machine-learning-based algorithms.
In this Chapter, two methods for feature extraction, four classifiers with various
parameters, and two channel-selection methods to classify epileptic-seizure and
seizure-free periods are analyzed. The process of selecting channels was considered
as a multi-objective optimization problem, using the lowest possible number of
EEG electrodes and obtaining the highest possible accuracy. The approach was
tested on a well-known public dataset, described in Section 3.6.1 [215].
4.3 Definition of the problem to optimize
The problem that requires optimization is the selection of the most relevant and
necessary EEG channels for epileptic-seizure classification while increasing or
at least maintaining the accuracy of the classifiers. This requires organizing the
dataset and a representation of the variables in the GA. NSGA-II and NSGA-III
will be used to manage minimization of the objective functions and compare the
results using different feature extraction methods and classifiers.
In general, a GA requires a genetic representation of the solution domain and
a fitness function to evaluate the solution domain. In this case, the genetic
representation was an array representing each channel (see Fig. 4.1), and the
fitness function for the two-objective optimization problem was defined as
[Acc, No], where Acc was the classification accuracy obtained with the
chromosome and No the number of EEG channels used.
4.3. Denition of the problem to optimize 67
Figure 4.1: Complete process for EEG channel selection using NSGA-II or NSGA-III
for epileptic-seizure classification.
Fig. 4.1 shows a binary representation for creation of the chromosomes, with
each gene representing a channel: 1 if the channel is used for the classification
process and 0 if not. All possible channels that can be used are colored, representing
the search space, which comprises 22 channels, as already mentioned in the
description of the dataset in Section 3.6.1. It should be noted that channels FP1-F7,
FP1-F3, T7-P7, T7-FT9, P7-T7, P7-O1, FP2-F4, and FP2-F8 were considered to be
different, as the references for the channels are different and the dataset provides
the EEG signals for each one separately.
All the best solutions found in the optimization process for epileptic-seizure
classification were analyzed. There are certain applications that use EEG signals in
which the automatic selection of the best solution may be important, especially for
cross-subject analysis. Here, however, it was important to analyze all the results
for each patient individually. With this assumption, the designer of a potential
low-cost EEG headset can consider whether it is better to sacrifice accuracy or
the number of EEG channels, depending on how easy or difficult it is to detect
epileptic seizures for a given individual.
The problem to be optimized is defined by two unconstrained objectives:
first, to maximize accuracy and second, to decrease the number of channels used
for epileptic seizure classification. The termination criterion for the optimization
process is defined by the objective space tolerance, which is set to 0.0001. This
criterion is calculated every 5th generation and, if not achieved, the process stops
after a maximum of 500 generations. Fig. 4.1 shows the complete process, which
consists of three main stages: feature extraction, classification, and optimization.
Classication experiments were performed using the characterized EEG signals
for each patient separately, while reducing or selecting the EEG channels for
creating models to detect epileptic seizures. For each patient, a carefully balanced
dataset was created using epileptic-seizure and seizure-free segments of six-
seconds (as explained in Section 3.6.1).
The process starts by using the raw EEG signals of one patient at a time,
from which feature extraction is performed and the results organized and stored
for iterative use (see Fig. 4.1). From this point on, the main process is handled
by the NSGA, which starts creating all possible candidates (chromosomes) for
each population, obtaining the corresponding subset of features for the channels
represented as 1 in the chromosome and evaluating the subset with four different
classifiers, with different parameters for each. The best accuracy obtained and
the number of EEG channels used are returned to the NSGA to evaluate each
chromosome in the current population. The process is repeated, creating different
populations, until the termination criterion is reached.
In summary, the chromosome has 22 genes, each representing an EEG channel.
The population size in each iteration was set to 20, which was selected
experimentally. Four classifiers were tested for each possible solution, but only
the highest accuracy was retained, and the corresponding classifier was stored for
analytical purposes.
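The evaluation of a single chromosome can be sketched as follows. This is a simplified illustration, not the thesis implementation: the function name, the data layout, and the stand-in classifiers are hypothetical. Two further assumptions are made explicit in the code: accuracy is turned into a minimization objective as 1 - Acc (the text only states that the objectives are minimized), and a chromosome with no channels selected is given the worst possible accuracy.

```python
def evaluate_chromosome(chromosome, features, labels, classifiers):
    """Return ([1 - best_accuracy, n_channels], name of the best classifier).

    chromosome: list of 0/1 genes, one per EEG channel.
    features: features[i][c] is the feature list of channel c for instance i.
    classifiers: dict name -> callable(X, y) returning an accuracy in [0, 1]
                 (hypothetical stand-ins for cross-validated models).
    """
    selected = [c for c, gene in enumerate(chromosome) if gene == 1]
    if not selected:
        # Assumption: an empty channel set is scored as the worst case.
        return [1.0, 0], None
    # Keep only the features of the selected channels, concatenated per instance.
    X = [[f for c in selected for f in inst[c]] for inst in features]
    scores = {name: clf(X, labels) for name, clf in classifiers.items()}
    best = max(scores, key=scores.get)
    # Assumption: accuracy is minimized as 1 - Acc alongside the channel count.
    return [1.0 - scores[best], len(selected)], best


# Fixed scores used only to make the demo deterministic.
classifiers = {
    "SVM": lambda X, y: 0.90,
    "RF": lambda X, y: 0.95,
    "KNN": lambda X, y: 0.85,
    "NB": lambda X, y: 0.70,
}
# Four instances, three channels, two features per channel (toy values).
features = [[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]] for _ in range(4)]
labels = [0, 1, 0, 1]
objectives, best = evaluate_chromosome([1, 0, 1], features, labels, classifiers)
print(objectives, best)
```

Only the best of the four accuracies enters the objective vector, while the name of the winning classifier is kept separately, which mirrors the bookkeeping described in the summary above.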
4.4 Channel selection for Epileptic-seizure classification with EMD-based features
For this experiment, EMD-based feature extraction was used, followed by the
greedy algorithm for channel reduction, and both NSGA-II and NSGA-III for
channel selection. The process described in Fig. 4.1 was repeated for each patient
using the above techniques.
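The greedy channel reduction used as a baseline can be sketched in its generic backward-elimination form: starting from all channels, the channel whose removal leaves the highest accuracy is dropped at each step. The exact procedure used in the thesis is not spelled out in this section, so the function below, its name, and the toy scoring function are assumptions for illustration only.

```python
def backward_elimination(channels, score):
    """Greedy channel reduction.

    channels: list of channel names to start from.
    score: callable(list_of_channels) -> accuracy (hypothetical evaluator).
    Returns a dict mapping subset size -> (accuracy, channels kept).
    """
    current = list(channels)
    history = {len(current): (score(current), list(current))}
    while len(current) > 1:
        # Try removing each channel and keep the removal that hurts least.
        candidates = [[c for c in current if c != drop] for drop in current]
        current = max(candidates, key=score)
        history[len(current)] = (score(current), list(current))
    return history


# Toy evaluator: each channel contributes a fixed amount to the score.
weights = {"A": 0.40, "B": 0.30, "C": 0.05, "D": 0.20}


def toy_score(subset):
    return sum(weights[c] for c in subset)


history = backward_elimination(["A", "B", "C", "D"], toy_score)
for size in sorted(history):
    print(size, history[size][1])
```

Because each step only removes one channel from the previous subset, the subsets are nested. This differs from the NSGA-based search, where the subsets found for different channel counts do not have to overlap.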
For illustrative purposes, Fig. 4.2 presents the results obtained using NSGA-II
for epileptic-seizure classification of patient 1.
Fig. 4.2 clearly shows that NSGA-II managed to cope with both objectives.
Although the backward-elimination algorithm sometimes showed higher accuracy
when using a high number of channels, the opposite was true when using a lower
number of channels.
Figure 4.2: EEG channel selection for epileptic-seizure classification of patient
1 using EMD-based features. Comparison between NSGA-II and the
backward-elimination algorithm.
In this case, the best results obtained using NSGA-II consisted of four subsets of
channels, which did not necessarily overlap. This is because each chromosome was
almost independent and may have come from different parents. The illustrative
example presented in Fig. 4.3 shows the subsets of channels used for obtaining
the highest accuracy.
Channel Cz was selected in the first four subsets shown using the NSGA-II
method, but not when backward-elimination was used. The accuracy obtained by
backward-elimination was notably lower than when NSGA-II was used, i.e., 0.964
and 0.993, respectively (see Fig. 4.2), which shows the feasibility of the method, as
well as the importance of a robust method for channel selection.
Tables 4.1 and 4.2 show the accuracy obtained using each of the methods
on data from all of the patients. Most of the best results were obtained when
using between one and 10 channels (see Fig. 4.2). The tables show only the
results for 1 to 10 channels for all patients, but the experiment was carried
out with all channels. As an automatic termination criterion was used, the
number of generations for each patient was different and is shown in the tables.
Figure 4.3: Four EEG channel subsets selected by NSGA-II (a) and
backward-elimination (b) for epileptic-seizure classification in patient 1.
Supplementary material in [200] provides data on the accuracy, specificity, and
sensitivity for the first four EEG channels of Tables 4.1 and 4.2.
The results highlighted in gray are those for which the accuracy obtained was
higher than when using backward-elimination. The average number of generations
was 39±12 for NSGA-II and 47±13 for NSGA-III.
Patient 13 appears to be a possible special case, as similar accuracy was
obtained with all methods. NSGA-II showed the highest accuracy when using
three channels and NSGA-III when using five, reaching 0.813. The addition of
more channels to detect epileptic seizures resulted in fluctuations in the accuracy
but it did not increase.
Table 4.2 shows a number of empty cells when using NSGA-II and NSGA-III,
meaning that the accuracy obtained was not part of the best solutions. This is
best illustrated for the results obtained for patient 19 using the NSGA-III method
(see Fig. 4.4). This case shows a clear example of how the method works, as the
accuracy obtained using two channels was 0.975 but the addition of more channels
only decreased the accuracy, except for the use of six channels. This is related to
the small amount of information provided by the added channels.
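The empty cells follow directly from how non-dominated solutions are selected, which the sketch below illustrates. A channel count only appears among the best solutions if no smaller (or equal) subset reaches at least the same accuracy. The function is a generic Pareto filter, not the NSGA-III selection itself. In the example data, the values 0.913, 0.975 and 1.000 appear in the NSGA-III row for patient 19 in Table 4.2, and the text quotes 0.975 for two channels; the channel counts assigned to 0.913 and 1.000 and all other accuracies are made up to mimic the described behavior.

```python
def pareto_front(solutions):
    """Keep (n_channels, accuracy) pairs not dominated by any other pair.

    A pair dominates another if it uses no more channels and is at least as
    accurate, and is strictly better in at least one of the two.
    """
    front = []
    for n, acc in solutions:
        dominated = any(
            (m <= n and a >= acc) and (m < n or a > acc)
            for m, a in solutions
        )
        if not dominated:
            front.append((n, acc))
    return sorted(front)


# Illustrative values only (see the note above on which ones are invented).
candidates = [(1, 0.913), (2, 0.975), (3, 0.960), (4, 0.950),
              (5, 0.970), (6, 1.000), (7, 0.990)]
print(pareto_front(candidates))  # [(1, 0.913), (2, 0.975), (6, 1.0)]
```

In this toy case, the subsets with three to five channels are dominated by the two-channel solution, so their cells would be left empty in a table of best solutions.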
As mentioned previously, the classifier used each time is that resulting in the
highest accuracy using the subsets of EEG channels. The NSGA-based algorithms
Table 4.1: Accuracy obtained using EMD for feature extraction with NSGA-II and
NSGA-III for EEG channel selection (subjects 1-12).
Id Method No. channels
1 2 3 4 5 6 7 8 9 10
1
B-E 0.943 0.964 0.986 0.964 0.971 0.979 0.986 0.993 0.993 0.993
NSGA-II 0.979 0.979 0.986 0.993
NSGA-III 0.964 0.979 1.000
2
B-E 0.815 0.899 0.921 0.921 0.961 0.976 0.969 0.985 0.985 0.985
NSGA-II 0.866 0.921
NSGA-III 0.866
3
B-E 0.796 0.888 0.912 0.920 0.960 0.976 0.969 0.985 0.985 0.985
NSGA-II 0.911 0.943 0.958 0.975 0.976 0.975
NSGA-III 0.876 0.927 0.951 0.975 0.976
4
B-E 0.832 0.940 0.948 0.977 0.976 0.985 0.977 0.986 0.986 0.986
NSGA-II 0.914 0.946 0.955 0.977 0.992
NSGA-III 0.897 0.955 0.963 1.000
5
B-E 0.972 0.978 0.995 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 0.974 0.995 1.000
NSGA-III 0.970 0.995
6
B-E 0.975 1.000 0.975 1.000 1.000 0.975 1.000 1.000 1.000 1.000
NSGA-II 1.000 1.000
NSGA-III 1.000 1.000
7
B-E 0.962 0.962 0.963 0.992 0.992 0.992 0.992 0.992 0.992 0.992
NSGA-II 0.962 0.972 0.982 1.000
NSGA-III 0.962 0.972 1.000
8
B-E 0.884 0.884 0.877 0.877 0.874 0.877 0.865 0.884 0.874 0.890
NSGA-II 0.884 0.890 0.890 0.890
NSGA-III 0.884 0.884
9
B-E 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 1.000
NSGA-III 1.000
10
B-E 0.993 0.993 0.993 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 0.993 1.000
NSGA-III 0.993 1.000
11
B-E 0.996 0.996 0.996 0.992 0.996 0.992 0.992 0.992 0.992 0.996
NSGA-II 0.996 0.996
NSGA-III 0.996 0.996
12
B-E 0.899 0.892 0.918 0.911 0.921 0.925 0.925 0.929 0.922 0.925
NSGA-II 0.899 0.908 0.919 0.928 0.932 0.941
NSGA-III 0.899 0.912 0.942
were clearly able to handle the complete process and the classifiers most used
to obtain the highest accuracy are presented in Fig. 4.5. The results show the
percentage of use of each classifier for each patient. For example, in the case of
Table 4.2: Accuracy obtained using EMD for feature extraction with NSGA-II and
NSGA-III for EEG channel selection (subjects 13-24).
Id Method No. channels
1 2 3 4 5 6 7 8 9 10
13
B-E 0.775 0.777 0.775 0.806 0.788 0.726 0.749 0.782 0.782 0.733
NSGA-II 0.775 0.777 0.798 0.806 0.813
NSGA-III 0.775 0.777 0.813
14
B-E 0.925 0.933 0.942 0.942 0.942 0.967 0.967 0.983 0.983 0.983
NSGA-II 0.933 0.967 0.983 0.983
NSGA-III 0.933 0.942 0.983
15
B-E 0.971 0.969 0.978 0.981 0.985 0.986 0.986 0.988 0.988 0.988
NSGA-II 0.981 0.981 0.988 0.988
NSGA-III 0.981 0.985 0.988
16
B-E 0.900 0.900 0.900 0.900 0.900 0.900 0.900 0.900 0.900 0.800
NSGA-II 0.900 0.900
NSGA-III 0.900 0.900
17
B-E 0.940 0.980 0.980 0.990 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 0.980 0.990 1.000
NSGA-III 0.980 1.000
18
B-E 0.790 0.852 0.832 0.862 0.853 0.882 0.892 0.910 0.900 0.900
NSGA-II 0.803 0.852 0.870 0.900 0.910 0.920
NSGA-III 0.783 0.852 0.862 0.880 0.890 0.892
19
B-E 0.913 0.908 0.925 0.925 0.950 0.963 0.975 0.975 0.988 0.988
NSGA-II 0.921 0.946 0.950 0.963 0.975 0.988 1.000
NSGA-III 0.913 0.975 1.000
20
B-E 0.948 0.970 0.957 0.957 0.970 0.980 0.990 0.990 0.968 0.980
NSGA-II 0.980 0.990
NSGA-III 0.980 0.990
21
B-E 0.879 0.933 0.888 0.888 0.908 0.938 0.904 0.942 0.933 0.908
NSGA-II 0.888 0.950 0.954 0.967 0.970 0.983
NSGA-III 0.888 0.942 0.954 0.983
22
B-E 0.971 0.971 0.983 0.983 0.983 0.983 0.983 0.983 0.983 0.983
NSGA-II 0.983 0.983
NSGA-III 0.983
23
B-E 0.938 0.940 0.938 0.955 0.962 0.955 0.962 0.962 0.962 0.962
NSGA-II 0.938 0.948 0.962
NSGA-III 0.938 0.946 0.970
24
B-E 0.975 0.975 0.992 0.992 0.992 0.992 0.992 0.992 0.992 0.992
NSGA-II 0.975 0.992 0.992 1.000
NSGA-III 0.992 1.000
NSGA-II for patient 1, the most highly used classifier was RF, which was used
54.59% of the time, then SVM with 33.72%, KNN with 7.35%, and NB with 4.34%.
SVM and RF were the most highly used classifiers to obtain the highest accuracy
in all iterations of NSGA-II and NSGA-III (see Fig. 4.5). On the other hand, NB was
used in all iterations but only returned the highest accuracy a few times. In general,
for NSGA-II, RF was used 32.8% ± 24.2 of the time for all patients, SVM
47.0% ± 27.9, NB 3.1% ± 4.2, and KNN 17.1% ± 20.5. For NSGA-III, the RF
classifier was used 32.0% ± 25.1 of the time, SVM 48.8% ± 28.6, NB 2.8% ± 3.6,
and KNN 16.4% ± 21.7.
Figure 4.4: EEG channel selection for epileptic-seizure classification of patient
19 using EMD-based features. Comparison between NSGA-III and the
backward-elimination algorithm.
Figure 4.5: Comparison of the most used classifiers by NSGA-II (left) and NSGA-III
(right) for the 24 patients using EMD-based feature extraction.
The analysis of the most highly used classifier in all generations and each
chromosome is important because it allows some classifiers to be discarded to
decrease the computational cost and also because it shows that the classifier
necessary to obtain the highest accuracy may differ, depending on the patient and
the EEG channel subsets used.
4.5 Channel selection for Epileptic-seizure classification with DWT-based features
The experiment was repeated but now using DWT to extract the sub-bands
and then compute the four features per sub-band, as described above. The
experiments were repeated using NSGA-II and NSGA-III for the 24 patients.
Additionally, the accuracies obtained were compared to those obtained using
the backward-elimination algorithm. The results are summarized in Tables 4.3
and 4.4. Supplementary material in [200] provides the accuracy, specificity, and
sensitivity for the first four EEG channels.
The results in Tables 4.3 and 4.4 show that an average of 36 ± 7 generations was
required for NSGA-II and 41 ± 11 for NSGA-III. In general, the use of DWT for
feature extraction resulted in more rapid EEG channel selection and better
accuracy.
In the case of patient 13, the use of DWT instead of EMD considerably improved
epileptic-seizure classification, i.e., an improvement from 0.775 to 0.820 using
one EEG channel and from 0.777 to 0.849 using two. In general, both methods
showed high accuracy when the EEG channels were selected using NSGA-based
methods. The most-used classifiers when DWT was used for feature extraction
were SVM and KNN for both NSGA-II and NSGA-III, as shown in a mesh plot of
the most-used classifier for each patient (see Fig. 4.6). Specifically, for NSGA-II, RF
was used an average of 20.5% ± 16.5 of the time for all patients, SVM
46.1% ± 23.5, NB 3.6% ± 3.8, and KNN 29.8% ± 23.1. When selecting the EEG
channels using NSGA-III, the RF classifier was used an average of 22.1% ± 19.0 of
the time, SVM 47.3% ± 24.5, NB 1.0% ± 1.4, and KNN 29.5% ± 23.3.
SVM was the most highly used classifier in general, but RF and KNN were
also highly used (see Fig. 4.6). These data also show that KNN was more highly
used with DWT-based features than with EMD-based features (see Fig. 4.5). NB
Table 4.3: Accuracy obtained using DWT for feature extraction with NSGA-II and
NSGA-III for EEG channel selection (subjects 1-12).
Id Method No. channels
1 2 3 4 5 6 7 8 9 10
1
B-E 0.950 0.993 0.993 0.993 1.000 0.993 0.993 0.993 1.000 1.000
NSGA-II 0.986 1.000
NSGA-III 0.986 1.000
2
B-E 0.983 0.992 0.992 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 0.992 0.992 1.000
NSGA-III 0.992 0.992 1.000
3
B-E 0.983 0.985 0.992 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 0.983 0.992 1.000
NSGA-III 0.983 1.000
4
B-E 0.952 0.966 0.975 0.983 0.976 0.983 0.983 0.983 0.976 0.983
NSGA-II 1.00
NSGA-III 1.00
5
B-E 0.995 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 1.000
NSGA-III 1.000
6
B-E 0.975 0.950 0.950 0.950 0.950 0.950 0.950 0.950 0.900 1.000
NSGA-II 0.975 0.975 0.975
NSGA-III 0.975 0.975 1.000
7
B-E 0.962 0.972 0.980 0.980 0.980 0.980 0.980 0.980 0.980 0.980
NSGA-II 0.980 0.982 1.000
NSGA-III 0.980 1.000
8
B-E 0.914 0.903 0.917 0.904 0.894 0.884 0.894 0.890 0.890 0.894
NSGA-II 0.917 0.917
NSGA-III 0.971 0.917
9
B-E 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 1.000 1.000
NSGA-III 1.000
10
B-E 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 1.000
NSGA-III 1.000 1.000
11
B-E 1.000 1.000 1.000 1.000 0.996 0.996 0.996 1.000 0.996 1.000
NSGA-II 1.000
NSGA-III 1.000
12
B-E 0.899 0.932 0.942 0.942 0.949 0.935 0.942 0.945 0.952 0.945
NSGA-II 0.911 0.948 0.948 0.952
NSGA-III 0.911 0.952
was the classifier with the lowest percentage of use for both approaches.
Table 4.4: Accuracy obtained using DWT for feature extraction with NSGA-II and
NSGA-III for EEG channel selection (subjects 13-24).
Id Method No. channels
1 2 3 4 5 6 7 8 9 10
13
B-E 0.822 0.827 0.793 0.827 0.795 0.798 0.776 0.798 0.776 0.827
NSGA-II 0.820 0.849 0.855 0.864
NSGA-III 0.820 0.850
14
B-E 0.950 0.967 0.983 0.983 0.983 1.000 1.000 1.000 1.000 1.000
NSGA-II 0.967 0.983 0.995
NSGA-III 0.967 0.983 1.000
15
B-E 0.978 0.985 0.981 0.986 0.986 0.988 0.994 0.995 0.998 0.997
NSGA-II 0.978 0.994 1.000
NSGA-III 0.978 0.994 0.998 1.000
16
B-E 0.800 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 1.000
NSGA-III 1.000
17
B-E 0.930 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 1.000
NSGA-III 1.000
18
B-E 0.862 0.862 0.912 0.922 0.922 0.922 0.940 0.952 0.932 0.952
NSGA-II 0.890 0.913 0.950 0.952
NSGA-III 0.862 0.913 0.952
19
B-E 0.987 1.000 0.987 1.000 1.000 1.000 1.000 1.000 1.000 1.000
NSGA-II 0.988 1.000
NSGA-III 0.988 1.000
20
B-E 1.000 1.000 1.000 1.000 1.000 0.990 0.990 0.990 1.000 0.990
NSGA-II 1.000
NSGA-III 1.000
21
B-E 0.921 0.950 0.938 0.967 0.983 0.966 0.966 0.966 0.966 0.966
NSGA-II 0.925 0.950 0.971 0.983
NSGA-III 0.933 0.950 0.983
22
B-E 0.983 0.983 0.983 0.983 0.983 0.983 0.983 0.983 0.983 0.983
NSGA-II 0.995 0.998 1.000
NSGA-III 0.995 0.995
23
B-E 0.938 0.946 0.953 0.961 0.961 0.962 0.955 0.962 0.969 0.969
NSGA-II 0.939 0.961 0.969 0.970 0.970 0.977
NSGA-III 0.939 0.961 0.977
24
B-E 0.975 0.975 0.975 0.975 0.975 0.983 0.975 0.983 0.975 0.983
NSGA-II 0.985 0.992 1.000
NSGA-III 0.985 0.988 1.000
4.6 Discussion
The EEG channel selection method for epileptic-seizure classification proved to
be robust. For example, the accuracy for patient 1 with DWT-based features was
0.97 using all EEG channels. The accuracy was even higher when using the EEG
channels selected by NSGA-II or NSGA-III (1 or 2 channels): 0.98 for EMD and
1.00 for DWT.
Figure 4.6: Comparison of the most-used classifiers by NSGA-II (left) and NSGA-III
(right) for the 24 patients using DWT-based feature extraction.
For example, the results obtained with the data of patient 12 showed the highest
accuracy using EMD to be 0.942 using six EEG channels selected by NSGA-III. The
highest accuracy obtained using DWT-based features was 0.952 using four
EEG channels. An important feature of the classification of the epileptic seizures
of this patient is that most of the highest accuracy values were obtained using
the KNN classifier (see Figs. 4.5 and 4.6), i.e., an average of 73% and 84% using
EMD-based features and an average of 96% and 98% using DWT-based features,
for NSGA-II and NSGA-III, respectively.
Examination of the number of epileptic seizures described in the database
[215] showed this patient to have had 38 epileptic seizures and, after segmentation
(six-second segments), 234 instances of epileptic seizures and 234 seizure-free
periods were obtained. This amount of data was one of the highest of the patients
used for this study. However, for patient 15, for whom there was a similar amount
of data, the highest accuracy values were obtained using SVM. Thus, it is not
possible to argue that this is due to the amount of data. Therefore, future work will
also analyze more parameters related to the classifier (e.g., the number of neighbors
for KNN and the kernel and kernel parameters for SVM) and how accuracy is
affected by the number of seizure periods/trials. A possible relationship
between the feature extraction method, the classifier and its parameters,
and further factors (sample rate, wet or dry electrodes, EEG device, etc.) that can
affect a solid conclusion will then be examined.
As shown in Figs. 4.5 and 4.6, SVM was generally the most highly used
classifier but KNN was also highly used, independently of the feature extraction
method and whether NSGA-II or NSGA-III was used for channel selection. These
data also show that KNN was more highly used with DWT-based features than
EMD-based features. NB was the classifier with the lowest percentage of use for
both approaches. For future steps, these findings will be considered and used
for testing other important parameters related to each classifier to reduce the
computation cost, instead of testing NB again.
In general, the results presented in this Section show that this approach is
able to classify epileptic seizure and seizure-free periods with an average accuracy
of up to 0.97 ± 0.05 using only one EEG electrode. This result was obtained using
DWT-based features. The use of two or more channels can increase the accuracy
to 0.98 and 0.99, especially when the EEG channels are selected by NSGA-III (see
Table 4.5).
In the state-of-the-art, there are several relevant studies in which the authors
present various methods for feature extraction and classification using the same
dataset under different experiment setups. Table 4.5 presents a general overview
of such studies for analysis and comparison.
Table 4.5 shows the state-of-the-art and classification accuracy of approaches
using EMD-based or DWT-based features, as well as NSGA-II or NSGA-III. It
should be noted that the results are not directly comparable to those from previous
studies, as a lower number of EEG channels were used, found by NSGA-based
algorithms, and the experiments were based on 24 subjects and used different
experimental setups. It should also be noted that the average values presented in
the results were obtained from Tables 4.1, 4.2, 4.3, and 4.4, which correspond to the
results obtained in the Pareto-front for each subject in the dataset. In addition, the
average accuracy was affected for some subjects when using two or three channels,
for whom the highest accuracy values were not obtained with this number of
EEG channels (see Tables 4.1, 4.2, 4.3, and 4.4), i.e., using EMD-based features, the
accuracy for the Pareto-front for NSGA-III was 0.992 with one channel and 1.00
using four EEG channels, but there was no information for the combination with
two or three channels for obtaining the accuracy in the Pareto-front.
Table 4.5: Comparison of relevant existing methods for epileptic-seizure
classification using the CHB-MIT Scalp EEG dataset presented in [218].
Ref. | Method | Subjects, channels | Evaluation
[256] | Energy and coefficient of variation extracted from DWT, interquartile range, median absolute deviation from raw signal. | 23, 23 | accuracy of 0.80 using 80% for training.
[242] | Relative values of energy and normalized coefficients of variation from DWT. | 5, (23, 24 or 26) | accuracy of 0.91 using ~80% for training.
[243] | Seven features from the intersection sequence of Poincaré section with phase space. | 23, 23 | accuracy values of 0.93 and 0.94 using 25% and 50% for training, respectively.
[245] | Three features extracted from different oscillatory levels using multivariate extension of EWT. The channel with the lowest standard deviation was selected and the four channels with higher mutual information then added. | 23, 5 | accuracy of 0.99 using 10-fold cross-validation.
[244] | Signal curve length of the time-domain EEG signal and the mode powers of the dynamic mode decomposition. | 12, 18 | sensitivity of 0.87 using 50% for training.
[135] | Teager and instantaneous energy, Higuchi and Petrosian fractal dimension, and DFA from 2 IMFs based on the EMD. Channels selected using the backward-elimination algorithm. | 24, 5 | average accuracy of 0.93 using 10-fold cross-validation.
Proposed method using EMD-based features | Teager and instantaneous energy, and Higuchi and Petrosian fractal dimension from 2 IMFs based on EMD. | 24, 1-3 | average accuracy values of 0.93±0.06, 0.95±0.06, and 0.95±0.05 using 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-II.
 | | 24, 1-3 | average accuracy values of 0.93±0.06, 0.94±0.06, and 0.96±0.04 using 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-III.
Proposed method using DWT-based features | Teager and instantaneous energy and Higuchi and Petrosian fractal dimension from 4 decomposition levels of the DWT. | 24, 1-3 | average accuracy values of 0.97±0.05, 0.97±0.04, and 0.98±0.02 using 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-II.
 | | 24, 1-3 | average accuracy values of 0.97±0.05, 0.98±0.03, and 0.99±0.01 using 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-III.
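Because a Pareto-front does not necessarily contain a solution for every number
of channels, per-channel averages depend on how such gaps are treated. The
sketch below shows one possible treatment, in which subjects without an entry for
a given channel count are simply skipped; the text does not state how gaps were
handled, so this rule, the function name, and the numbers are assumptions for
illustration only.

```python
import numpy as np

def mean_std_by_channel_count(pareto, max_channels=3):
    """Average accuracy per channel count, skipping subjects without an entry.

    `pareto` maps subject id -> {number of channels: accuracy}.
    Returns {k: (mean, standard deviation, number of subjects used)}.
    """
    summary = {}
    for k in range(1, max_channels + 1):
        values = [front[k] for front in pareto.values() if k in front]
        if values:
            summary[k] = (float(np.mean(values)), float(np.std(values)), len(values))
    return summary

# Illustrative values only (not results from this thesis)
pareto = {
    "A": {1: 0.90, 2: 1.00},
    "B": {1: 0.80, 4: 1.00},  # no two- or three-channel solution in the front
    "C": {1: 1.00, 2: 0.90, 3: 1.00},
}
summary = mean_std_by_channel_count(pareto)
for k, (mean, std, n) in summary.items():
    print(f"{k} channel(s): {mean:.2f} ± {std:.2f} over {n} subject(s)")
```

With this rule, the average for two or three channels is computed over fewer
subjects than the average for one channel, which is one reason the values are not
directly comparable across channel counts.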
Table 4.6: Comparison of several relevant existing methods for epileptic-seizure
classification using different datasets.
Ref. | Method | Subjects, channels | Evaluation
[257] | Features based on approximate entropy and classification using Elman and probabilistic neural networks. | 5, 1 | accuracy of 1.000.
[258] | Five levels of decomposition by DWT and features using PCA, independent component analysis (ICA), and LDA. The classification used SVM. | 5, 1 | accuracy values of 0.987, 0.995, and 1.000 using features based on PCA, ICA, and LDA, respectively.
[247] | Entropy-Fuzzy Classifier with three classes, normal vs. pre-ictal vs. epileptic. | 5, 1 | accuracy of 0.981.
[248] | Features based on two-dimensional (2D) and 3D phase space representation (PSRs) of IMFs from EMD, and least-square SVM (LS-SVM) classifier. | 5, 1 | accuracy of 0.986.
[246] | Using the TUH EEG corpus, they used 10-second segments with a sample rate of 250 Hz and computed 24 features per channel. Six different classifiers were compared: SVM, NB, KNN, RF, gradient boosting, and logistic regression. | 43, 22 | accuracy of 0.994 using SVM.
[249] | Features based on Fourier-Bessel series expansion and classified using LS-SVM. | 5, 1 | accuracy of 0.990 in the best case.
[252] | Third-order cumulant (ToC) and neural network with softmax classifier. | 5, 1 | accuracy of 1.000.
[251] | Energy features from sub-bands extracted using the Taylor-Fourier filter bank and LS-SVM. | 5, 1 | accuracy of 0.948.
[185] | Wavelet coefficients from sub-bands obtained using DWT with 7 levels of decomposition using iEEG from 10 patients of the Flint Hills Scientific dataset. | 10, 3 | sensitivity of 0.96.
It is important to mention that in the work presented in [246-249, 251, 252, 257,
258], no methods of channel selection were used, as the dataset used consisted
of only one or two EEG channels, and the study [185] used methods based on
variance or entropy to select the channels before the classification process.
Most of the studies presented in Table 4.6 were based on invasive EEG, which
provides better signal quality [253]. Therefore, their performance should be
re-tested on non-invasive EEG signals for continuous monitoring. Note that
in the presented work, the SVM classifier was the most widely used and
provided the highest accuracy values relative to the other classifiers and
neural networks, consistent with the results obtained in this thesis.
According to the results in this thesis, NSGA-III is able to find the most relevant
EEG channel combinations using DWT-based features to achieve an average
accuracy of up to 0.99 using only three channels. Looking towards improving
the general performance of this approach and testing it using additional public
epileptic-seizure datasets, new experiments will be performed considering more
than two objective functions in the problem to verify whether NSGA-III is still
the best method for solving this problem [212, 213].
Results have shown that the best accuracy can be reached using one to three
channels for certain subjects and four or more for others. Thus, testing different
methods in an attempt to improve the channel-selection process and decrease the
complexity is proposed for future studies. This can be achieved by testing and
comparing methods such as that presented by [245], which selects a channel with
the lowest SD and then four channels with the highest MI with the previously
chosen channel, as well as other optimization approaches [87, 138, 190-201].
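The selection rule attributed to [245] above (the lowest-SD channel first, then the
channels with the highest MI with it) can be sketched as follows. The
histogram-based MI estimate, the number of bins, the function names, and the
synthetic data are assumptions made for illustration; the estimator used in [245]
may differ.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based estimate of the mutual information (in nats)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))

def select_channels(eeg, n_extra=4, bins=16):
    """Pick the lowest-SD channel, then the n_extra channels with highest MI to it."""
    eeg = np.asarray(eeg)
    reference = int(np.argmin(eeg.std(axis=1)))
    scores = {
        ch: mutual_information(eeg[reference], eeg[ch], bins)
        for ch in range(eeg.shape[0]) if ch != reference
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [reference] + ranked[:n_extra]

# Synthetic example: channel 0 has the lowest SD; channels 3 and 5 depend on it
rng = np.random.default_rng(0)
base = rng.standard_normal(2000)
eeg = 2.0 * rng.standard_normal((8, 2000))
eeg[0] = 0.5 * base
eeg[3] = 2.0 * base + 0.1 * rng.standard_normal(2000)
eeg[5] = -3.0 * base + 0.1 * rng.standard_normal(2000)
selected = select_channels(eeg, n_extra=2)
print(selected)
```

Unlike the NSGA-based search, this rule needs no classifier in the loop, which is
why it is a candidate for reducing the complexity of the selection process.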
Epileptic-seizure classification using EEG signals is important for evaluating
the state of the brain. Following the evolution of the signals through continuous
monitoring will enable prediction with a low number of EEG channels, making it
easier to use and thus allowing long-term monitoring using a possibly personalized
portable EEG device [259, 260]. However, there are several challenges that need
to be addressed before implementation in real life.
Because epilepsy can cause a variety of other neurological disorders (e.g.,
depression and anxiety), such confounders should be additionally studied to
better distinguish between an epileptic seizure and seizure-free periods. Thus,
future efforts will also include the study of epilepsy-related disorders and how they
can be recognized on EEG signals. A possible portable low-density EEG device
will facilitate monitoring in daily life, which will allow healthcare professionals
more confident management of seizures, not only in the hospital or laboratory
but also in conjunction with the recent progress in telehealth and telemedicine
[261-264].
From the results presented in this Chapter, it is clear that EMD-based or
DWT-based features can be useful for epileptic-seizure classification. Using these
approaches, a possible subject-tailored method can consider the addition of another
gene in the chromosome for the optimization process and thus select the most
useful method for detecting epileptic seizures for that subject. This will be tested
in future studies based on the findings here, as well as different chromosome
representations for solving all possible problems related to parameter optimization
at the same time.
The computational complexity of the method used for channel selection is
$O(MN^2)$. However, the study of the most relevant channels is important and it
must be performed for analysis and, as presented here, to verify whether epileptic
seizures can be detected using a few non-invasive EEG channels. The limitations
of the methods used for feature extraction are related to the well-known problems
of EMD, such as the selection of the best spline, the end effect, and the mode
mixing problem [116, 126, 128].
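The quadratic term in $O(MN^2)$ is commonly attributed to the pairwise dominance
comparisons of non-dominated sorting, where $N$ is the population size and $M$
the number of objectives. The following is a minimal sketch of that sorting step,
written for illustration only and not the implementation used in the experiments;
the example objective pairs (number of channels, classification error) are
hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(objectives):
    """Return fronts (lists of indices) using N(N-1)/2 comparisons of cost O(M)."""
    n = len(objectives)
    dominated_by = [[] for _ in range(n)]   # indices of solutions that i dominates
    counts = [0] * n                        # number of solutions dominating i
    for i in range(n):
        for j in range(i + 1, n):
            if dominates(objectives[i], objectives[j]):
                dominated_by[i].append(j)
                counts[j] += 1
            elif dominates(objectives[j], objectives[i]):
                dominated_by[j].append(i)
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        next_front = []
        for i in fronts[-1]:
            for j in dominated_by[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
    return fronts[:-1]

# Hypothetical (number of channels, classification error) pairs
candidates = [(1, 0.10), (2, 0.05), (2, 0.10), (3, 0.05), (1, 0.20)]
fronts = non_dominated_sort(candidates)
print(fronts[0])  # [0, 1]
```

The first front is the Pareto-front of the current population; each candidate in it
still requires a full classification run, which dominates the cost in practice.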
For DWT, the main problems are related to parameter selection, such as
the number of levels of decomposition and the mother wavelet. Some of these
limitations have already been considered in the literature or can be solved by
using recent progress in code optimization [227, 228, 265]. Future efforts for
classification will focus on testing and comparing shallow convolutional neural
networks and Riemannian classifiers, as they have been shown to provide high
accuracy values for EEG-signal classification [148, 266, 267].
Future efforts will concentrate on testing the methods used for epileptic-seizure
classification, the epileptic seizure prediction problem, testing methods
for feature extraction and classification, and testing whether the methods for
channel selection can find the most relevant subsets for this task and seizure onset
detection [171, 175, 184, 185].
Chapter 5
Case study 2: Channel count optimization for EEG-based biometric systems
This Chapter presents two approaches for creating EEG-based biometric systems
using various methods for channel selection and implementing them for feature
extraction and classification. This is tested in experiments using multi-class
classification, as well as one-class classification.
This Chapter is based on the journal articles [87, 138, 223] and addresses the 1st,
2nd, and 3rd Research Questions.
5.1 Introduction
Security systems are used by organizations to protect places or information for
which privileges are needed or require access authorization, as well as to deny
unauthorized access to facilities, equipment, or resources and protect against
espionage, theft, or even terrorist attacks. Various safety measures have long been
proposed, ranging from the use of generic systems (security guards, closed-circuit
television, smart cards, proximity readers, and RFID) to that of biometric identifiers
(fingerprints, palmprints, retinal scans, etc.) [268, 269].
Biometric recognition refers to the automatic recognition of individuals based
on their physiological and/or behavioral features [268]. A biometric system is a
pattern recognition system that operates by acquiring biometric data from subjects,
extracting a set of features, and comparing this set of features against a template
set in the database. Biometric systems have advantages over generic systems, as it
is more difficult to steal, compromise, or duplicate the key. However, biometric
systems are vulnerable to a variety of attacks aimed at undermining the integrity
of the authentication process [269]. For example, an intruder may fraudulently
obtain the latent fingerprints of a user and later use them to construct a digital or
physical artifact of the user’s finger [270]. This is possible because authentication
systems cannot discriminate between an intruder who fraudulently obtains access
privileges and authorized users.
Due to the increasing threat of bypassing the authentication and authorization
process of current traditional/biometric security systems [269], there is a growing
interest in exploring new biometric measures. In this context, the use of brain
signals to create biometric markers using various neuro-paradigms has emerged
as a robust alternative to the above-mentioned vulnerabilities.
Brain signals can be used as a basis for the design of biometric markers, as any
human physiological and/or behavioral characteristic can be used as a biometric
feature, as long as it satisfies the following requirements: universality, permanence,
collectability, performance, acceptability, and circumvention [268]. Brain signals
are highly reliable and secure because biometric markers obtained from EEG
recordings of human brain activity are almost impossible to duplicate, as the brain
is highly individual [271].
An authentication system may include a stage in which the data is used in a
multi-class model with all the subjects in the dataset to identify a specific subject.
It may also include a verification step to compare the data from the claimed subject
with that of the true subject, alone in the dataset, to detect whether the subject is
an intruder or not. The order of these stages may differ depending on the approach.
The number of EEG-based biometric systems has been steadily growing using
various approaches to solve problems related to the authentication and verification
stages.
A research-grade EEG device guarantees a controlled environment and high-quality
multi-channel EEG recording, but this is offset by the high computational
cost, non-portability of the equipment, and use of inconvenient conductive
gels. The development of dry EEG sensors has created new possibilities for the
development of new types of portable EEG systems. An important step towards
this goal is a reduction in the number of required EEG channels while increasing,
or at least maintaining, the same performance as high-density EEG.
5.2 State-of-the-art
Depending on whether the paradigm is task-dependent or task-independent,
certain EEG channels provide only redundant or sub-optimal information. Several
techniques have been studied with the aim of developing low-density EEG-based
systems with high performance, i.e., pre-processing and feature extraction, channel
selection, and paradigms to stimulate brain signals. For EEG-based biometric
systems, several approaches have been presented using various paradigms to
stimulate and record the EEG signals, i.e., imagined speech [222, 223, 272],
resting-state [85, 173, 273-277], and ERPs [138, 206].
In general, resting-state potentials and ERPs have been shown to be good
candidates for a new biometric system for which there are several different
state-of-the-art approaches [206, 273, 276-278], with the localization of the relevant
channels differing, depending on the paradigm.
An important element is dimensionality reduction, which can be tackled
through channel selection and feature extraction. Several approaches can be used
to accomplish this task, including those based on methods such as PCA, DWT,
EMD, and even approaches using raw data as input for different configurations of
neural networks (NN) [138, 206, 222, 223, 279-283].
Several approaches have been proposed for the creation of biometric systems
following various experiment configurations with various paradigms and methods
for feature extraction and classification using the EEGMMIDB dataset (see Section
3.6.2), using various configurations of neural networks [280, 284-286], other
supervised and unsupervised techniques [274, 278, 287-296], and methods for
EEG channel selection [201, 275, 297].
One approach used a subset of eight pre-selected channels [297] and EEG
data from a task for training and then that from another task for testing. The
selection of the channels was justified based on their stability across various
mental tasks, and the results presented were evaluated using the half total error
rate (HTER), which was 14.69%. Another approach used various tasks from the
EEGMMIDB and channel selection, using the binary flower pollination algorithm
(BFPA), and reported accuracy values of up to 0.87 using supervised learning and
approximately 32 EEG channels [201]. However, the analysis considered only
non-intruders when using multi-class classification, and therefore the addition of
more stages for detecting the intruders is necessary.
Other approaches use instances of different length with the same dataset,
such as instances of 10 or 12 seconds [274, 290]. Resting-state instances of 10
seconds have been validated with the leave-one-out framework, consisting of five
instances of 10 seconds for training and one instance for validating the model
[290], resulting in a correct recognition rate (CRR) of 0.997 for the resting-state
with the eyes-open and 0.986 with the eyes-closed, all using 64 EEG channels.
An approach with one-second EEG signals from the FP1 and FP2 channels
and a 256-Hz sample rate during the resting state has been proposed for a
biometric system, extracting features directly from the raw data and using Fisher’s
discriminant analysis [276], obtaining a TAR of up to 0.966 and a false acceptance
rate (FAR) of 0.034. Another approach used two-second EEG signals from the
FP1 and FP2 channels, with a 2048-Hz sample rate, and the authors used a set of
classifiers to perform multi-class classification [273]. They obtained an accuracy
of 0.93 and a false positive identification rate of 0.165. Another approach presented
the results of a study using the Cz EEG channel, which was manually selected, on
20 subjects during the resting-state [277], obtaining a TAR of 1.0 and TRR of over
0.8. None of these studies attempted to systematically select the minimal number
of optimal channels to perform the task.
Deep-learning algorithms have shown success in image processing and other
fields but have not shown convincing and consistent improvement over the
most advanced current methods for EEG data [148, 282]. However, several new
approaches have been recently presented that show high accuracy. For example,
an approach using convolutional neural network (CNN) gated recurrent units
(CNN-GRU) was presented in [281], and the authors evaluated the proposed
method in a public dataset called DEAP, which consists of EEG signals from 32
subjects recorded from 32 channels using different emotions as a paradigm [298].
Their experiments were performed using 10-second segments of EEG signals and
they reported a mean CRR of up to 0.999 with 32 channels using CNN-GRU and
0.991 with five channels that were selected using one-way repeated measures
ANOVA with Bonferroni pairwise comparison (post-hoc). The findings of this
work are interesting and the accuracy values obtained are high. However,
deep-learning approaches require a large amount of data and the length of the signal
segments and the paradigm followed are not standard. Furthermore, for a real-time
application, the collection of a large number of instances and instances during
long periods can be exhausting, making such an approach noncompetitive with
current biometric systems in the industry (e.g., fingerprints and face recognition).
The amount of data and time required for training NN are the main concerns
for effective deployment and adoption of EEG-based biometric systems in real-life
scenarios. In the literature, researchers have reported results using structures
ranging from simple NNs (i.e., a single hidden layer) to more complex networks
(recurrent and CNN), but this requires the improvement of computational power,
with faster CPUs and the use of GPUs [148, 278, 281, 294-296]. The large amount
of data required by deep-learning approaches can be overcome using an approach
based on simple data augmentation techniques by creating overlapped time windows
[284].
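The idea of augmenting data with overlapped time windows can be illustrated with
a minimal sketch. The window length, the step, the 160-Hz sampling rate, the 64
channels, and the function name are assumptions for this example and do not
reproduce the configuration used in [284].

```python
import numpy as np

def overlapping_windows(signal, window, step):
    """Cut a (channels, samples) array into windows; they overlap when step < window.

    Returns an array of shape (n_windows, channels, window). The signal must
    contain at least `window` samples.
    """
    signal = np.asarray(signal)
    starts = range(0, signal.shape[1] - window + 1, step)
    return np.stack([signal[:, s : s + window] for s in starts])

# Hypothetical 10-second, 64-channel recording sampled at 160 Hz
fs = 160
recording = np.random.default_rng(1).standard_normal((64, 10 * fs))

plain = overlapping_windows(recording, window=fs, step=fs)          # no overlap
augmented = overlapping_windows(recording, window=fs, step=fs // 2)  # 50% overlap
print(plain.shape, augmented.shape)  # (10, 64, 160) (19, 64, 160)
```

With a 50% overlap, the same recording yields 19 one-second instances instead of
10, although consecutive instances are no longer independent and should be kept
in the same fold during cross-validation.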
Other related proposals using neural networks have been presented and
compared to the state-of-the-art [278, 294-296], amongst which some of the
most relevant studies used approximately 100 subjects and mostly 64 channels for
testing their approaches [279, 280, 284, 299]. However, there is no defined method
for channel selection, since the process for selecting the most relevant channels
requires repetition of the classification process several times and it is well known
that deep-learning approaches are computationally costly [148, 296].
5.3 First approach using a two-stage classification process
In this approach, the P300-speller dataset described in Section 3.6.3 and a two-stage
approach for the entire process, illustrated in Fig. 5.1, were used. An OCSVM
model was trained to recognize subjects that are already in the system and to
reject those who are not (intruders). In the first part of this experiment, the model
was trained using subjects with IDs 1-13 (non-intruders) and only EEG signals
from session one, using 30 instances and all EEG channels (56 channels). Then the
EEG signals from all the subjects of session two were used, considering subjects
14-26 as intruders, to validate the model (see Fig. 5.1). The results were evaluated
using the TAR, TRR, and accuracy of multi-class classification (see Table 5.1).
Figure 5.1: Flowchart of the first approach for intruder detection and subject
identification.
Table 5.1: TAR, TRR, and accuracy for subject identification and authentication
with EEG data from all channels using different nu and gamma values for
one-class SVM.
Subjects | nu | gamma | TAR | TRR | Accuracy
Non-intruders 1-13 | 0.01 | 0.01 | 0.923 | - | 0.98 ± 0.02
Intruders 14-26 | | | - | 0.083 | -
Non-intruders 1-13 | 0.10 | 0.10 | 0.545 | - |
Intruders 14-26 | | | - | 0.449 | -
Non-intruders 14-26 | 0.01 | 0.01 | 0.951 | - | 1.00 ± 0.0
Intruders 1-13 | | | - | 0.212 | -
Non-intruders 14-26 | 0.10 | 0.10 | 0.495 | - |
Intruders 1-13 | | | - | 0.551 | -
Table 5.1 presents an example of the results using subjects 1-13 as non-intruders
and subjects 14-26 as intruders. The results show that approximately 92% of
the subjects were correctly accepted but also that only approximately 8% of
the intruders were correctly rejected. However, changing the nu and gamma
parameters of the RBF-kernel SVM from 0.01 to 0.10 changed the TAR and TRR
to approximately 50% in both cases.
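For reference, the rates discussed above can be computed from per-sample
decisions as in the following sketch. It assumes the usual definitions (TAR as the
fraction of genuine samples accepted, TRR as the fraction of intruder samples
rejected, FAR = 1 - TRR, FRR = 1 - TAR, and HTER as the mean of FAR and FRR);
the function name and the example decisions are hypothetical.

```python
def acceptance_rates(accepted, is_genuine):
    """Compute TAR, TRR, FAR, FRR, and HTER from boolean decisions and ground truth.

    Both groups (genuine and intruder) must contain at least one sample.
    """
    pairs = list(zip(accepted, is_genuine))
    true_accepts = sum(1 for a, g in pairs if a and g)           # genuine, accepted
    true_rejects = sum(1 for a, g in pairs if not a and not g)   # intruder, rejected
    n_genuine = sum(1 for _, g in pairs if g)
    n_intruder = len(pairs) - n_genuine
    tar = true_accepts / n_genuine
    trr = true_rejects / n_intruder
    far, frr = 1.0 - trr, 1.0 - tar
    return {"TAR": tar, "TRR": trr, "FAR": far, "FRR": frr, "HTER": (far + frr) / 2}

# Hypothetical decisions: four genuine samples followed by four intruder samples
accepted = [True, True, True, False, True, True, True, False]
is_genuine = [True, True, True, True, False, False, False, False]
rates = acceptance_rates(accepted, is_genuine)
print(rates)
```

A gate that accepts almost everything, as in this example, reaches a high TAR only
at the cost of a low TRR, which is the trade-off visible in Table 5.1.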
Given that all subjects with access (subjects 1-13) passed the first layer, a
multi-class classifier was created for subject identification. An SVM with a linear
kernel was defined and used because of the results obtained in previous studies
and also because it was found experimentally to be the best solution. The flowchart
of the complete method is presented in Fig. 5.1. The accuracy obtained following
10-fold cross-validation was 0.98, with a standard deviation of 0.02 (see Table 5.1).
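The two layers described above can be sketched with scikit-learn as a one-class
SVM with an RBF kernel acting as the intruder gate, followed by a linear SVM for
subject identification. This is a minimal sketch under stated assumptions: the
library choice, the function names, the synthetic features, and the use of -1 as
the rejection label are illustrative and are not taken from the experiments.

```python
import numpy as np
from sklearn.svm import SVC, OneClassSVM

def fit_two_stage(X_known, y_known, nu=0.01, gamma=0.01):
    """Fit the intruder gate (one-class SVM) and the subject identifier (linear SVM)."""
    gate = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X_known)
    identifier = SVC(kernel="linear").fit(X_known, y_known)
    return gate, identifier

def authenticate(gate, identifier, X):
    """Return the predicted subject id, or -1 where the gate rejects the sample."""
    X = np.asarray(X)
    accepted = gate.predict(X) == 1          # OneClassSVM returns +1 for inliers
    labels = np.full(len(X), -1)
    if accepted.any():
        labels[accepted] = identifier.predict(X[accepted])
    return labels

# Synthetic features for two enrolled subjects (hypothetical data)
rng = np.random.default_rng(0)
X_known = np.vstack([rng.normal(0, 1, (30, 8)), rng.normal(5, 1, (30, 8))])
y_known = np.array([1] * 30 + [2] * 30)
gate, identifier = fit_two_stage(X_known, y_known)

far_away = np.full((5, 8), 1000.0)           # samples unlike any enrolled subject
print(authenticate(gate, identifier, far_away))
print(authenticate(gate, identifier, X_known)[:5])
```

In this arrangement, the nu and gamma values only affect the gate, whereas the
identification accuracy depends on the linear SVM alone.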
This approach was used because the aim was to find the best configuration
for the entire process. Creating a model using only the subjects with correct
permission who passed the first layer would have affected the results and therefore
would not have been valid.
5.3.1 Defining the problem to optimize
Once the non-intruder and intruder subsets were defined, the signals were
pre-processed and the features extracted. They can be used as input for the
authentication system, which can be distributed as presented in Fig. 5.1. However,
the use of a more complex system is required to fit certain important parameters
and select the most relevant EEG channels, which in this case was analyzed as an
optimization problem.
The problem to be optimized is defined by four unconstrained objectives:
1) reduce the number of EEG channels, 2) maximize the accuracy of the multi-class
classification, 3) maximize the number of accepted subjects with access, and 4)
maximize the number of intruders rejected. The population size in each iteration is
set to 30, which was selected experimentally. The termination criterion for the
optimization process is defined by the objective space tolerance, which is set
to 0.0001. This criterion is calculated every 10th generation. If it is not met, the
process stops after a maximum of 500 generations.
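The stopping logic described above can be illustrated with a simplified sketch.
Optimization libraries typically measure the change in objective space with more
elaborate indicators; here, as an assumption made purely for illustration, the
change is taken as the largest difference in the best value of each objective
between two checks. The function name and the synthetic history are hypothetical.

```python
import numpy as np

def should_stop(history, generation, tol=1e-4, every=10, max_gen=500):
    """Decide whether to stop, comparing fronts that are `every` generations apart.

    `history` holds one (n_solutions, n_objectives) array per generation,
    with all objectives expressed as values to be minimized.
    """
    if generation >= max_gen:
        return True                      # hard limit on the number of generations
    if generation < every or generation % every != 0:
        return False                     # the criterion is only checked periodically
    current = np.asarray(history[generation])
    previous = np.asarray(history[generation - every])
    # simplified measure: change of the best value found for each objective
    delta = np.abs(current.min(axis=0) - previous.min(axis=0))
    return bool(delta.max() < tol)

# Synthetic run: the error objective improves until generation 10, then stagnates
history = [np.array([[2.0, max(0.5, 1.0 - 0.05 * g)]]) for g in range(41)]
stopped_at = next(g for g in range(len(history)) if should_stop(history, g))
print(stopped_at)  # 20
```

Checking only every 10th generation avoids stopping on a short plateau, at the
cost of running up to nine generations longer than strictly necessary.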
The chromosome created to represent the search space in the scalp for this
first approach is presented in Fig. 5.2, in which genes 1-56 represent the EEG
channels and the nu parameter is calculated using genes 57-60 and the gamma
parameter calculated using genes 61-64. When calculating the nu and gamma
parameters, the binary representation is converted into a decimal value, which
represents the position in a vector with the possible values for the parameter.
The possible values were defined experimentally, which in a key-value array are
{0: 0.000001, 1: 0.0001, 2: 0.0005, 3: 0.001, 4: 0.005, 5: 0.01, 6: 0.1, 7: 0.2,
8: 0.3, 9: 0.4, 10: 0.5, 11: 0.6, 12: 0.7, 13: 0.8, 14: 0.9, 15: 1.0}, for both nu and
gamma. The complete process is illustrated in Fig. 5.2.
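The decoding of such a chromosome can be sketched as follows. The lookup values
are those listed above; the bit order (first gene as the most significant bit), the
zero-based channel indices, and the function names are assumptions made for
illustration.

```python
# Possible values for nu and gamma, indexed by the decoded 4-bit position
PARAM_VALUES = [1e-6, 1e-4, 5e-4, 1e-3, 5e-3, 0.01, 0.1, 0.2,
                0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

def bits_to_index(bits):
    """Interpret a sequence of 0/1 genes as an unsigned integer (first gene = MSB)."""
    index = 0
    for bit in bits:
        index = index * 2 + int(bit)
    return index

def decode_chromosome(genes, n_channels=56):
    """Split a binary chromosome into selected channels, nu, and gamma."""
    if len(genes) != n_channels + 8:
        raise ValueError("expected %d genes" % (n_channels + 8))
    channels = [i for i, g in enumerate(genes[:n_channels]) if g == 1]
    nu = PARAM_VALUES[bits_to_index(genes[n_channels:n_channels + 4])]
    gamma = PARAM_VALUES[bits_to_index(genes[n_channels + 4:])]
    return channels, nu, gamma

# Example: two active channel genes, nu bits 0101 (position 5), gamma bits 0110 (position 6)
genes = [0] * 56
genes[2] = genes[16] = 1
genes += [0, 1, 0, 1] + [0, 1, 1, 0]
decoded = decode_chromosome(genes)
print(decoded)  # ([2, 16], 0.01, 0.1)
```

Using four bits per parameter gives exactly 16 positions, so every bit pattern maps
to a valid entry and no repair step is needed after crossover or mutation.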
Eight features per EEG channel were extracted for all subjects and each
instance following the previously explained method and that shown in the
flowchart presented in Fig. 3.15, in which the results are organized and stored
for iterative use, as shown in Fig. 5.2. The entire process is then handled by
NSGA-II or NSGA-III, which starts by creating the candidates using a binary
chromosome representation. For each candidate, the corresponding subset of
features is obtained for the channels marked as 1 in genes 1-56 of the chromosome,
the nu parameter is calculated using genes 57-60, and the gamma parameter is
calculated using genes 61-64.

Figure 5.2: Example of the complete process for EEG channel selection using
NSGA-II, including the chromosome representation using 56 genes for the EEG
channels and eight for the nu and gamma parameters.
Then, the obtained classification accuracy, number of accepted subjects with
access, number of rejected subjects, and number of EEG channels used are returned
to NSGA-II or NSGA-III to evaluate each chromosome in the current population.
The process is repeated, with the NSGA creating different populations, until the
termination criterion is reached.
5.3.2 Solving the four-objective optimization problem using
NSGA-II with subjects 1-13 as non-intruders and 14-26 as
intruders.
This Section presents experiments that simultaneously considered all the objectives
to investigate whether there is a particular combination that can solve the
optimization problem defined in the Methods Section using NSGA-II.
The experiment consisted of finding the best nu and gamma for the SVM with
the RBF kernel to increase the TAR, TRR, and accuracy of subject identification or
maintain them as high as in previous configurations, while using the
lowest number of EEG channels. Briefly, NSGA-II was used for channel selection
using the first 56 genes in a chromosome to represent the EEG channels and then
four genes each to select the best nu and gamma parameters, thus obtaining a
chromosome of 64 genes.
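For reference, the TAR and TRR used as objectives can be computed from accept/reject decisions as in the sketch below. It assumes the usual definitions (TAR: fraction of instances from non-intruders that are accepted; TRR: fraction of instances from intruders that are rejected); the function name and the boolean input format are hypothetical.

```python
def tar_trr(decisions, is_genuine):
    """Compute (TAR, TRR) from per-instance decisions.

    decisions:  list of bool, True if the system accepted the instance.
    is_genuine: list of bool, True if the instance comes from a non-intruder.
    """
    if len(decisions) != len(is_genuine):
        raise ValueError("inputs must have the same length")
    n_genuine = sum(1 for g in is_genuine if g)
    n_intruder = len(is_genuine) - n_genuine
    if n_genuine == 0 or n_intruder == 0:
        raise ValueError("need at least one genuine and one intruder instance")
    accepted_genuine = sum(1 for d, g in zip(decisions, is_genuine) if g and d)
    rejected_intruder = sum(1 for d, g in zip(decisions, is_genuine) if not g and not d)
    return accepted_genuine / n_genuine, rejected_intruder / n_intruder


# Hypothetical decisions for three genuine and three intruder instances.
tar, trr = tar_trr([True, True, False, False, False, True],
                   [True, True, True, False, False, False])
print(round(tar, 3), round(trr, 3))
```

A model that rejects everything reaches a TRR of 1.0 with a TAR of 0.0, which is why the two rates are treated as separate objectives.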
Several plots of the results obtained considering the four objectives are
presented in Fig. 5.3 to illustrate the importance of the optimization process
(see Sub-figs. 5.3a, 5.3b, 5.3c, and 5.3d), as only 11.11% of the possible channel
combinations resulted in a TAR and TRR between 0.9 and 1.0 (see Sub-fig. 5.3e).
The classification accuracy according to the number of channels used and in
relation to the Pareto-front is shown in Sub-figs. 5.3d and 5.3f.
The results for the Pareto-front for all objectives are presented in Table 5.2.
NSGA-II found a two-channel combination for which a TAR of 0.91, a TRR of 0.88,
and an accuracy of 0.78 for subject identification were obtained. NSGA-II also
found a 12-channel combination for which the accuracy of subject identification
was 0.93, the TAR 0.93, and the TRR 0.95. This result shows that it is possible to
reduce the number of channels from 23, 24, or more (which gave similar accuracy
values) by almost half using this approach.
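The notion of a Pareto-front over the four objectives can be illustrated with a small non-dominated filter. This is only the dominance relation, not NSGA-II itself; the function names are hypothetical. The first three candidates are rows of Table 5.2, and the fourth is a made-up point added solely to show a dominated solution.

```python
def to_minimization(n_channels, accuracy, tar, trr):
    """Express all four objectives as values to minimize."""
    return (n_channels, -accuracy, -tar, -trr)


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [
    to_minimization(2, 0.78, 0.91, 0.88),   # Table 5.2, 2 channels
    to_minimization(8, 0.89, 0.79, 0.85),   # Table 5.2, 8 channels
    to_minimization(12, 0.93, 0.93, 0.95),  # Table 5.2, 12 channels
    to_minimization(13, 0.90, 0.90, 0.90),  # hypothetical, dominated by the 12-channel row
]
front = pareto_front(candidates)
print(len(front))  # 3
```

The two-channel row stays on the front despite its lower accuracy because no other candidate uses as few channels, which is why Table 5.2 lists one solution per channel count.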
5.3.3 Solving the four-objective optimization problem using
NSGA-II with subjects 14-26 as non-intruders and subjects
1-13 as intruders.
With the aim of searching for more global results, the previous experiment was
repeated using the same configuration but now considering subjects 14-26 as
non-intruders and subjects 1-13 as intruders. The results obtained for the four
objectives are presented in Table 5.3.
As in the previous experiment, an accuracy of up to 0.83 for subject
identification was obtained, with both a TAR and TRR of 1.00, using just a
three-channel combination (see Table 5.3). Increasing the classification accuracy for
subject identification, while maintaining the same TAR and TRR, required 16 EEG
channels, in contrast to the previous experiment for which the optimal number of
EEG channels was 12.
Figure 5.3: Four different views of the results obtained with NSGA-II using subjects
1-13 as non-intruders and 14-26 as intruders. (a) First view of the candidates and
the Pareto-front. (b) Second view of the candidates and the Pareto-front. (c) Aerial
view. (d) Points in the Pareto-front. (e) Distribution of the results obtained.
(f) Classification accuracy for the combinations in the Pareto-front.

Table 5.2: TAR, TRR, and accuracy values obtained for the Pareto-front for four
objectives solved with NSGA-II using subjects 1-13 as non-intruders.
No. channels  Accuracy  TAR   TRR   nu      gamma
1             0.55      0.90  0.90
2             0.78      0.91  0.88  0.0001  0.9
3             0.79      0.34  0.42
4             0.86      0.31  0.35
5             0.85      0.50  0.58
6             0.91      0.56  0.74
7             0.89      0.51  0.60
8             0.89      0.79  0.85  0.0010  0.9
9             0.87      0.82  0.92  0.0001  0.2
10            0.94      0.53  0.66
11            0.97      0.43  0.47
12            0.93      0.93  0.95  0.0001  0.9
13            0.97      0.43  0.54
14            0.98      0.51  0.64
16            0.94      0.76  0.77
17            0.99      0.37  0.44
20            0.98      0.61  0.75
21            0.97      0.76  0.80
22            0.95      0.25  0.30
23            0.97      0.92  0.94
24            0.98      0.96  0.96
25            0.98      1.00  1.00
26            0.98      0.94  0.98
27            0.98      0.96  1.00
29            0.97      0.93  0.96
30            0.99      0.83  1.00

Table 5.3 presents the results obtained in the Pareto-front for the first 30 EEG
channels, indicating the accuracy values obtained and the TAR and TRR, as well as
the nu and gamma values used for creating the one-class classifiers to obtain the
TAR and TRR results. The most relevant accuracy values, TAR, and TRR and the
corresponding number of channels used are marked in gray; the nu and gamma
values used to obtain these results were also added to determine whether there
are similarities between these cases.
The channel combinations for this and the previous experiment were
independent. Venn diagrams were generated to compare the channels used in
the Pareto-front between this and the previous experiment to detect a possible
pattern or a more relevant area (see Fig. 5.4). The EEG channels used to obtain the
results marked in gray in Table 5.2 are presented in Sub-fig. 5.4a and the channel
localization in Sub-fig. 5.4c. The results marked in gray in Table 5.3 are shown
in Sub-fig. 5.4b and the EEG channel localization in Sub-fig. 5.4d.
Fig. 5.4 shows certain channels within a black circle if they intersected with
one or more subsets. For example, Sub-fig. 5.4c shows the CPZ channel in a black
circle, which means that it was used in one or more subsets, as shown in Sub-fig.
5.4a. It is important to highlight these channels for the discussion of the results
and for the purpose of comparison with the following experiments in the thesis.

Table 5.3: TAR, TRR, and accuracy values obtained for the first 30 EEG channels
in the Pareto-front for four objectives solved with NSGA-II using subjects 14-26
as non-intruders.
No. channels  Accuracy  TAR   TRR   nu       gamma
1             0.53      0.70  0.70
2             0.62      0.31  0.31
3             0.83      1.00  1.00  0.00001  0.6
4             0.87      0.41  0.37
5             0.88      0.49  0.49
6             0.96      0.81  0.73
7             0.96      0.74  0.78
8             0.91      0.88  0.89  0.3000   0.8
9             0.97      0.52  0.54
10            0.97      0.90  0.91  0.0005   0.6
11            0.96      0.83  0.88
12            0.97      0.55  0.56
13            0.98      0.40  0.52
14            0.98      0.80  0.84
15            0.98      0.50  0.56
16            1.00      1.00  1.00  0.00001  0.6
17            0.99      0.73  0.65
18            0.98      0.93  0.93
19            0.99      0.38  0.59
20            0.99      0.47  0.57
21            0.98      0.74  0.71
22            0.99      0.99  0.99
23            0.98      0.76  0.72
24            1.00      0.74  0.64
25            1.00      0.99  0.99
26            1.00      1.00  0.99
27            1.00      1.00  1.00
28            1.00      0.96  0.96
29            1.00      0.95  0.97
30            1.00      1.00  1.00
Figure 5.4: Relevant EEG channel subsets in the Pareto-front for four objectives
using NSGA-II, considering subjects 14-26 as intruders in the previous experiment
and subjects 1-13 as intruders in the current experiment. (a) Venn diagram of the
subsets for 2, 8, 9, and 12 channels in the previous experiment presented in
Table 5.2. (b) Venn diagram of the subsets for 3, 8, 10, and 16 channels in the
current experiment presented in Table 5.3. (c) Channel subsets from Sub-fig. 5.4a.
(d) Channel subsets from Sub-fig. 5.4b.
5.3.4 NSGA-III for solving the four-objective optimization
problem.
The previous two experiments were repeated to solve the four-objective
optimization problem with the same configuration, but now using NSGA-III.
A comparison between the results obtained in the Pareto-front in the two
experiments, using subjects 1-13 for training (subjects 1-13 as non-intruders and
14-26 as intruders) and subjects 14-26 for training (subjects 14-26 as non-intruders
and 1-13 as intruders), is shown in Table 5.4.

Table 5.4: TAR, TRR, and accuracy values obtained in the Pareto-front when using
7-15 EEG channels with four objectives solved with NSGA-III using subjects 1-13
as non-intruders and 14-26 as intruders and vice versa.
                    No. channels
Subjects  Eval.     7       8       9       10      11      12      13      14      15
1-13      Accuracy  0.96    0.96    0.98    0.98    0.98    0.99    0.99    0.99    0.98
          TAR       0.41    0.41    0.94    0.94    0.61    0.70    0.60    1.00    0.29
          TRR       0.47    0.48    0.94    0.94    0.84    0.85    0.60    1.00    0.37
          nu                        0.0005  0.0001                          0.0005
          gamma                     0.1     0.1                             0.1
14-26     Accuracy  0.98    0.97    0.98    0.97    0.99    0.98    1.00    1.00    0.99
          TAR       0.95    0.93    0.90    0.93    0.95    0.94    0.93    0.94    0.72
          TRR       0.93    0.93    0.91    0.94    0.95    0.92    0.93    0.95    0.83
          nu        0.0100                          0.0001                  0.0001
          gamma     0.7                             0.9                     0.9
In this experiment, subsets with 9, 10, and 14 optimal EEG channels were
found using subjects 1-13 as non-intruders and subsets with 7, 11, and 14 EEG
channels using subjects 14-26 as non-intruders. As in the previous experiments, a
comparison of several relevant subsets from Table 5.4 is presented in Fig.
5.5 for both cases, either using subjects 1-13 as non-intruders (see Sub-figs. 5.5a
and 5.5c) or 14-26 as non-intruders (see Sub-figs. 5.5b and 5.5d).
Fig. 5.5 presents a comparison between different subsets found by NSGA-III
when using subjects 1-13 as non-intruders and when using them as intruders. This
figure shows a lower number of channels in the intersections, but it also shows
that most of the EEG channels used for obtaining the best results presented in
Table 5.4 were located around the parietal and occipital areas,
which is consistent with the paradigm used for collecting the EEG signals [300].
5.3.5 Testing the proposal in 10 random subdivisions of subjects
using NSGA-II and NSGA-III.
In the previous experiments, the results obtained were presented using different
subsets manually selected with 50% of the subjects as non-intruders and 50%
as intruders (i.e., subjects 1-13 as non-intruders and 14-26 as intruders, and
vice versa). The differences found when using NSGA-II or NSGA-III were also
presented. However, to provide a more general validation of the proposal, random
subsets with 50% of the subjects as non-intruders and 50% as intruders were
created and the optimization problem was then solved by simultaneously considering
the four objectives. This process was repeated 10 times, thus obtaining 10-fold
cross-validation of the proposed method. The experiment was repeated using both
algorithms, NSGA-II and NSGA-III. The mean results and standard deviations are
presented in Table 5.5.

Figure 5.5: Relevant EEG channel subsets in the Pareto-front for four objectives
using NSGA-III, considering subjects 14-26 as intruders in the previous experiment
and subjects 1-13 as intruders in the current experiment. (a) Venn diagram for the
subsets for 9, 10, and 14 channels using subjects 1-13 as non-intruders in the
current experiment presented in Table 5.4. (b) Venn diagram for the subsets for
7, 11, and 14 channels using subjects 14-26 as non-intruders in the current
experiment presented in Table 5.4. (c) Channel subsets from Sub-fig. 5.5a.
(d) Channel subsets from Sub-fig. 5.5b.

Table 5.5: Mean TAR, TRR, and accuracy values obtained in the Pareto-front when
using 7-15 EEG channels validated in 10 random subdivisions of all the subjects,
using 50% as intruders and 50% as non-intruders.
                 No. channels
Method    Eval.  7          8          9          10         11         12         13         14         15
NSGA-II   Acc.   0.96±0.02  0.96±0.01  0.97±0.02  0.98±0.02  1.00±0.00  0.99±0.01  1.00±0.00  1.00±0.00  0.99±0.01
          TAR    0.74±0.18  0.81±0.18  0.59±0.07  0.74±0.05  0.81±0.08  0.61±0.25  0.81±0.17  0.86±0.13  0.90±0.10
          TRR    0.85±0.14  0.79±0.10  0.68±0.16  0.87±0.13  0.69±0.18  0.89±0.10  0.88±0.12  0.90±0.09  0.94±0.06
NSGA-III  Acc.   0.97±0.03  0.97±0.01  0.97±0.02  0.98±0.02  1.00±0.00  1.00±0.00  1.00±0.00  1.00±0.00  1.00±0.00
          TAR    0.72±0.14  0.81±0.12  0.64±0.14  0.79±0.07  0.86±0.08  0.78±0.15  0.82±0.17  0.86±0.13  0.92±0.08
          TRR    0.74±0.12  0.85±0.10  0.65±0.21  0.85±0.13  0.80±0.13  0.89±0.10  0.89±0.10  0.89±0.09  0.94±0.02
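The repeated random 50/50 subdivision of subjects and the mean and standard deviation reported in Table 5.5 can be sketched as below. The function names, the fixed seed, and the use of the sample standard deviation are assumptions; the thesis does not state which estimator was used, and the `evaluate` callback stands in for the full optimization run.

```python
import random
import statistics


def random_subdivision(subject_ids, rng):
    """Split the subjects at random into two halves: (non-intruders, intruders)."""
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


def repeat_validation(subject_ids, evaluate, n_repeats=10, seed=0):
    """Run a hypothetical evaluate(non_intruders, intruders) -> score on each split."""
    rng = random.Random(seed)
    return [evaluate(*random_subdivision(subject_ids, rng)) for _ in range(n_repeats)]


def summarize(scores):
    """Return (mean, sample standard deviation) of the scores."""
    return statistics.mean(scores), statistics.stdev(scores)


subjects = list(range(1, 27))  # subjects 1-26, as in the experiments above
non_intruders, intruders = random_subdivision(subjects, random.Random(0))
print(len(non_intruders), len(intruders))  # 13 13
```

Because each repetition draws a fresh random split rather than partitioning the subjects into ten disjoint folds, the spread across repetitions reflects how sensitive the result is to which subjects act as intruders.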
The results presented in Table 5.5 show that the mean accuracy decreased
in both cases when using NSGA-II or NSGA-III when considering 10 random
partitions of the subjects as non-intruders or intruders. In addition, the standard
deviation was >10% in most cases when using fewer than 10 channels. This is
because the number of channels for the best arrays, as well as the best channels,
were not the same in each randomly created partition. For example, in the previous
experiment presented in Table 5.4, the best results were clearly obtained using
subjects 1-13 as non-intruders with nine EEG channels (i.e., an accuracy of 0.98,
a TAR of 0.94, and a TRR of 0.94). However, when considering subjects 14-26
as non-intruders, the best results were obtained using seven channels (i.e., an
accuracy of 0.98, a TAR of 0.95, and a TRR of 0.93).
For example, Table 5.5 shows that the accuracy values, TAR, and TRR were
similar in both cases for both NSGA-II and NSGA-III when using eight EEG
channels. However, the standard deviation was >10% for the TAR and TRR,
which means that the best results were not obtained with eight channels for
certain subsets of subjects, i.e., sometimes with seven and sometimes with nine
channels, as in the previous experiments. In summary, this new experiment shows
the accuracy for subject identification to be consistently high (i.e., 0.96 or
higher in all cases, as in the previous experiments presented), but the TAR and
TRR can vary widely depending on which subset of subjects is used as intruders or
non-intruders.
5.4 Discussion
EEG-based biometric systems have been presented as good candidates for use
in authentication systems. In previous studies, various paradigms, e.g.,
resting-state potentials and ERPs, have been studied and compared using various types
of electrodes, various numbers of channels, and varying channel localization
[173, 206, 222, 223]. Several parameters are yet to be optimized. Thus, no
industrial-level EEG-based biometric systems are currently available.
In the context of designing a portable EEG headset, applications for multi-task
purposes and scenarios are being widely studied. NSGA-based algorithms were
proposed for the optimization process, with the final objective of reducing the
necessary number of EEG channels for subject identification. These algorithms
depend upon several parameters that influence the performance and results.
In addition, machine-learning algorithms also require the definition of several
parameters, which were defined using eight genes of a created chromosome.
The new scheme introduced for subject identification and authentication shows
that it can identify subjects by their EEG brain signals and distinguish
subjects who were part of the training dataset from those that are intruders. Using
NSGA-II in the first experiments, channel subset combinations consisting of only
two EEG channels were found, with which an accuracy of 0.78, a TAR of 0.91, and a
TRR of 0.88 were obtained. However, 8, 9, or 12 channels were required to increase
the value of the results for the objectives when they were simultaneously applied.
NSGA-III found subsets with 7, 9, 10, or 11 EEG channels with an accuracy of up
to 0.99 and both a TAR and TRR of 1.00.
Initially, the aim was to create a new fixed headset with a limited number
of EEG channels, but as the results of this work show, it is not possible to
argue that a certain "good" subset works better than others, as various factors
are critical when choosing whether it is better to use a lower number of EEG
channels or propose improvements at the classification stage. The proposed
method shows that different channel subsets can provide high accuracy, TAR, and
TRR values. However, deeper analysis and further experiments are required on a
larger population.
P300 components from ERPs have been shown to be good candidates, but they are not
the gold standard for this application, as there is not yet sufficient research
evidence to support this. They were proposed in this work as candidates because it
has been shown that they exhibit strong signatures that are unique to the subject
and the process does not require any training, which will be essential in a
real-life application. In a real-life scenario, the biometric system can display
something on a screen (an image, a weak flashlight beam aimed directly at the
eyes, etc.), record the brain activity corresponding to the response to the
presentation, and use it for the identification and authentication process.
The internal state of the subject, such as the resting state, could also be used as
an alternative to obtain specific information on the subject, as previously discussed
[173]. The EEG channel selection process is in itself informative because it can
provide information about the most relevant areas in the brain for a certain neural
task for a certain subject or group of subjects. This can be analyzed using
a priori information related to the paradigm, which can limit the search space and
therefore the results.
The results presented in the first experiments show that most of the common
channels in the subsets providing the highest accuracy, TAR, and TRR come
from the occipital and parietal areas, but certain channels in the frontal area
were also important (FC2, FC3, FC6, FC8, F6, AF7, AF8, and Fp1). A final
conclusion about the minimum number of necessary EEG channels for
subject identification, taking into account the classification accuracy,
TAR, and TRR, cannot be proposed solely based on the results of this work,
as the minimum number of necessary channels will be different depending
on various factors (e.g., the number of subjects, trials, sessions, feature
extraction method, channel selection approach and their parameters, etc.).
In addition, channel localization for the subsets differed between subjects and
depending on whether NSGA-II or NSGA-III was used, as clearly presented in Figs.
5.4 and 5.5. When 10 random subdivisions of the subjects were tested, the mean
TAR and TRR decreased and the standard deviation increased. In addition, the nu
and gamma values used were different in each subdivision, but the classification
accuracy was maintained, similar to that of the first experiments presented.
The analysis can be made as complex as required. In the first
experiments, a model was trained with EEG signals from session 1 and the
authentication and verification process was constructed using EEG signals from
session 2. However, due to the plasticity of the brain, an analysis of sessions from
different days/weeks/months is also necessary before a proof of concept, as well
as an analysis of how this can affect the biometric approach. Another important
aspect that requires further study is the scalability; it will be necessary to verify
the number of subjects that can be added to this system while maintaining similar
performance to that when using a small number of subjects.
Here, a first layer was created using the EEG data from all the subjects to search
for a method to increase the TAR and TRR. Future studies will focus
on all these relevant aspects, involving the optimization of multiple parameters
related to the feature extraction and machine-learning methods by using discrete
values to represent the chromosomes and not only a binary sequence. Another
important aspect to be further investigated is the use of larger datasets with
k-fold validation to verify whether a possible modification to the proposed
approach can allow identification of a single optimal array of EEG channels for
different randomly created subdivisions of subjects while consistently fulfilling
all of the defined objectives and necessary parameters by optimization, as in the
experiments presented and discussed in this thesis.
5.5 Second approach, using a one-stage one-class algorithm
In this Section, resting-state EEG signals recorded with the eyes open from 109
subjects were used, comprising 64 channels and 60 instances of one second with a
sample rate of 160 Hz, as described in Section 3.6.2.
EMD- or DWT-based features were used and the results evaluated using the TAR
and TRR.
To ensure 10-fold cross-validation, the experiments were performed 10 times,
randomly selecting 80% of the instances for training and 20% for testing, thus
ensuring that the method can be generalized and that the results can be obtained
even when using another subset of instances for training and testing. The models
were created using OCSVM or LOF. It should be noted that the channels
and parameters were optimized for all the subjects at the same time but a single
machine-learning model was created for each subject. In general, the results
presented in Table 5.6 were obtained by creating a model for each of the 109
subjects in which the model of the subject was used to recognize the subject and
reject the remaining 108 subjects who were not part of the model.

Table 5.6: Average TARs and TRRs for subject detection with EEG data from 64
channels and 109 subjects using different parameters for OCSVM and LOF, with
EMD- and DWT-based features.
                                  EMD-based features        DWT-based features
Method  Algorithm  No. neighbors  TAR          TRR          TAR           TRR
OCSVM                             0.502±0.004  0.993±0.001  0.499±0.002   0.998±0.000
LOF     ball tree  1              1.000±0.000  0.923±0.005  1.000±0.000   0.979±0.002
LOF     ball tree  10             0.926±0.002  0.963±0.007  0.968±0.0038  0.989±0.012
LOF     kd tree    1              1.000±0.000  0.989±0.005  1.000±0.000   0.998±0.001
LOF     kd tree    10             0.926±0.001  0.955±0.006  0.923±0.001   0.988±0.002
LOF     brute      1              1.000±0.000  0.926±0.004  1.000±0.000   0.979±0.004
LOF     brute      10             0.927±0.001  0.939±0.007  0.924±0.003   0.989±0.002
The results obtained with OCSVM showed the lowest TAR (see Table 5.6),
meaning that the models created with OCSVM did not learn from the training set
and thus rejected an average of approximately 50% of the genuine instances, which
also explains why the TRR was high when using OCSVM. The results obtained with LOF,
using three different algorithms and one or ten neighbors, are also shown in Table 5.6
for illustrative purposes. LOF using the k-d tree algorithm and one neighbor resulted
in the highest TAR and TRR, meaning that it was possible to identify each subject
and reject almost all the rest that did not correspond to the models.
The results above show that the algorithm and number of neighbors used
are important for increasing the TAR and TRR. The experiments were repeated
using DWT-based features considering only LOF with the k-d tree and 1 to 10
neighbors to provide more information about this behavior. The average results
obtained using 10-fold cross-validation are presented in Fig. 5.6.
The use of a higher number of neighbors resulted in a decrease in the TAR
from 1.000 to 0.923, while the TRR increased or remained higher than 0.988
(see Fig. 5.6), meaning that the models were unable to learn about the features of
each subject using a higher number of neighbors. This is relevant, as it shows
the importance of selecting not only the best feature extraction method but
also the LOF algorithm and the best number of neighbors.

Figure 5.6: TARs and TRRs obtained using various numbers of neighbors with the
LOF k-d tree algorithm and DWT-based features.
5.5.1 Defining the problem to optimize
After the pre-processing and feature extraction stages, a set of features was
obtained for each EEG channel. These features can be used to create a model for
each subject that can recognize that subject and reject the rest of the subjects.
The approach is to create a model for each subject with 80% of the instances and use
20% for testing, as this dataset consists of only EEG data from one session, as
described in Section 3.6.2. This requires that certain important parameters be
fitted and that the most relevant EEG channels be selected.
Thus, the problem is defined as an optimization problem with three
unconstrained objectives: 1) minimize the number of necessary EEG channels,
2) maximize the TAR, and 3) maximize the TRR. The size of each population in each
iteration is defined as 20, and the termination criterion for the optimization
process is defined by the objective space tolerance, which is set to 0.0001. This
criterion is calculated every 10th generation. If the criterion is not met, the
process stops after a maximum of 300 generations.
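The stopping rule described above (a tolerance check every 10th generation, with a hard cap on the number of generations) can be sketched as follows. This is a simplified stand-in: the measure of change used here (largest absolute difference in the best objective values between consecutive checks) is an assumption, and the optimization library used in the thesis may compute its objective space tolerance differently. The `step` callback is hypothetical.

```python
def run_until_converged(step, max_generations=300, period=10, tolerance=1e-4):
    """Run a generational loop and return the generation at which it stopped.

    step(generation) is a hypothetical callback that advances the optimizer by
    one generation and returns a tuple with the current best objective values.
    """
    previous = None
    for generation in range(1, max_generations + 1):
        current = step(generation)
        if generation % period == 0:
            if previous is not None:
                # Assumed measure: largest change in any objective since the last check.
                change = max(abs(c - p) for c, p in zip(current, previous))
                if change < tolerance:
                    return generation
            previous = current
    return max_generations


# Hypothetical optimizer whose best objectives never change: stops at the second check.
print(run_until_converged(lambda generation: (5.0, -0.9, -0.95)))  # 20
```

Checking only every 10th generation keeps the cost of the test low and avoids stopping on a single generation that happens to bring no improvement.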
Sixty-four binary genes in a chromosome were created to represent the 64
EEG channels, as well as one gene with integer values for the algorithm (1: ball
tree, 2: k-d tree, 3: brute force) and another with integer values for the number of
neighbors (from 1 to 10, which were proposed experimentally), thus obtaining a
chromosome of 66 genes. When using OCSVM in the optimization process, the
same 64 genes were used for representing the EEG channels, as well as two genes
with decimal values for the nu and gamma parameters, similarly to the approach
presented in Section 5.3. The chromosome created to represent the candidate
channels in the search space and the flowchart of the complete optimization
process using LOF models are illustrated in Fig. 5.7.

Figure 5.7: Chromosome representation and flowchart of the optimization process
for EEG channel selection using NSGA-III and LOF.
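A 66-gene chromosome for the LOF case could be decoded along the following lines. The function name, the position of the two integer genes (algorithm first, then number of neighbors), and the string labels are assumptions made for illustration.

```python
# Integer code -> algorithm label, following the coding given in the text.
ALGORITHMS = {1: "ball_tree", 2: "kd_tree", 3: "brute"}

N_CHANNELS = 64


def decode_lof_chromosome(chromosome):
    """Return (selected channel numbers, algorithm label, number of neighbors)."""
    if len(chromosome) != N_CHANNELS + 2:
        raise ValueError("expected 66 genes")
    channel_genes = chromosome[:N_CHANNELS]
    if any(gene not in (0, 1) for gene in channel_genes):
        raise ValueError("channel genes must be binary")
    algorithm_code = chromosome[N_CHANNELS]   # assumed to be gene 65
    n_neighbors = chromosome[N_CHANNELS + 1]  # assumed to be gene 66
    if algorithm_code not in ALGORITHMS:
        raise ValueError("algorithm gene must be 1, 2, or 3")
    if not 1 <= n_neighbors <= 10:
        raise ValueError("number of neighbors must be between 1 and 10")
    channels = [i + 1 for i, gene in enumerate(channel_genes) if gene == 1]
    return channels, ALGORITHMS[algorithm_code], n_neighbors


# Hypothetical chromosome: channels 1 and 64, k-d tree, one neighbor.
example = [0] * 64 + [2, 1]
example[0] = 1
example[63] = 1
print(decode_lof_chromosome(example))  # ([1, 64], 'kd_tree', 1)
```

Mixing binary and integer genes in one chromosome lets the channel subset and the model parameters be searched jointly rather than tuned in separate passes.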
As explained in the feature extraction method, eight features were extracted
per channel when using EMD, and 16 when using DWT. The features were
organized and stored for iterative use, depending on the channels marked as
1 in the chromosomes. For example, using EMD-based features, the classification
process would be performed with only eight features from the channel indicated in
the chromosome if only one gene in the chromosome is set to 1. The entire process
was then performed by NSGA-III, as shown in Fig. 5.7, which starts by creating 20
possible candidates for each generation.
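Selecting the stored features for the channels marked as 1 can be sketched as below. The flat, channel-by-channel layout of the feature vector and the function name are assumptions; the text only states that eight (EMD) or 16 (DWT) features are stored per channel.

```python
def select_features(instance_features, channel_genes, features_per_channel=8):
    """Keep the features of the channels whose gene is 1.

    instance_features: flat list laid out channel by channel (assumed layout),
                       i.e. channel 1 features, then channel 2 features, ...
    channel_genes:     list of 0/1 genes, one per channel.
    """
    expected = len(channel_genes) * features_per_channel
    if len(instance_features) != expected:
        raise ValueError("feature vector length does not match the channel count")
    selected = []
    for channel, gene in enumerate(channel_genes):
        if gene == 1:
            start = channel * features_per_channel
            selected.extend(instance_features[start:start + features_per_channel])
    return selected


# Hypothetical instance with 64 channels and 8 EMD-based features per channel.
features = list(range(64 * 8))
genes = [0] * 64
genes[4] = 1  # only channel 5 is selected
print(len(select_features(features, genes)))  # 8
```

With a single gene set to 1, the classifier receives eight values for EMD-based features, matching the example in the text; with DWT-based features, `features_per_channel` would be 16.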
The output for each chromosome for each generation is the number of channels
used and the obtained TAR and TRR for the subset of channels in the chromosome.
The results are returned to NSGA-III to evaluate each chromosome in the current
population and the new generation of chromosomes is created based on the best
candidates found. This process is repeated until the termination criterion or the
maximum number of generations is reached.
5.5.2 Channel selection using NSGA-III and OCSVM for EEG
signals for the resting-state with the eyes open
It was previously shown that the TAR and TRR of the models created using
OCSVM can be improved by finding the best nu and gamma parameters [138]. The
optimization process defined in the Methods Section was performed to provide
more information about the behavior of the OCSVM models using a larger dataset,
attempting to improve the TAR and TRR while reducing the necessary number of
EEG channels for subject identification.

Table 5.7: TARs and TRRs obtained for the first five EEG channels in the
Pareto-front for three objectives solved with NSGA-III using EMD- and DWT-based
features with OCSVM.
              EMD-based features        DWT-based features
No. channels  TAR          TRR          TAR          TRR
1             0.776±0.138  0.851±0.055  0.801±0.063  0.905±0.042
2             0.776±0.092  0.911±0.043  0.774±0.066  0.958±0.023
3             0.763±0.150  0.969±0.020  0.629±0.180  0.959±0.022
4             0.779±0.144  0.966±0.033  0.720±0.069  0.980±0.020
5             0.822±0.028  0.969±0.022  0.822±0.028  0.981±0.017
For this experiment, resting-state EEG signals of the 109 subjects, with
their eyes open, were used, with 80% of the instances for training and 20% for
testing. NSGA-III was used for channel selection, using 64 binary genes
in a chromosome to represent the EEG channels (1 if the channel is used, 0 if not)
and two genes with decimal values (both from 0 to 1) to select the best nu and
gamma parameters, thus obtaining a chromosome of 66 genes.
The distribution of the results of one run obtained using EMD- and DWT-based
features is shown in Fig. 5.8, as an example. The average and standard deviation
of the results obtained using 10-fold cross-validation are presented in Table 5.7.
As mentioned previously, the optimization was performed 10 times for
cross-validation. For certain runs, the Pareto-front contained only channel
combinations with one to five channels and for others with one to seven. The channels
in common and other subsets can be further analyzed using these identified subsets.
Thus, it may be possible to recommend a set of channels for a new possible headset
(considering the best subset found and those that are the most appropriate for a
new design). However, it is first necessary to perform the analysis to choose the
best paradigm or sub-task (e.g., resting-state with the eyes open or closed) for EEG
data collection. For comparative purposes, the average TAR and TRR obtained
using channel combinations of one to five channels in the Pareto-front of the 10
runs are presented.

Figure 5.8: Frontal and aerial view of the TARs and TRRs obtained in the
channel-selection process using EMD-based features (a) and DWT-based features (b)
with OCSVM.
A TAR of 0.822±0.028 and a TRR of 0.969±0.022 were obtained with only
five channels using EMD-based features (see Table 5.7). The TAR and TRR were
0.822±0.028 and 0.981±0.017, respectively, using DWT-based features and five
channels with the optimization process.
As presented in Fig. 5.8, the candidates generated using EMD- or DWT-based
features and OCSVM showed a clear tendency to reject all the subjects (which
increased the TRR, since the models correctly rejected the intruders), even the
subject on which each model was trained (which decreased the TAR), meaning that
the models created for each subject did not learn from the provided features. The
TAR increased only if the correct nu and gamma parameters and channels were
selected, which also varied in each run, as reflected by the standard deviations.
A set of channels used during the optimization process in the 10 runs is
presented in Fig. 5.9. The set of channels identified when using EMD-based
features is presented in a) and that when using DWT-based features in b). Each
set of channels, from left to right, corresponds to the use of one to five channels,
and, as mentioned earlier, the channels found by NSGA-III differed between
certain runs. The figure presents one set. Using EMD-based features, the
channels found when using one to five channels differed, but those around T10
and T8 were consistent across most sets. When using DWT-based features, channel
IZ clearly appeared in all sets, and channels C4 and T10 appeared in most.
5.5.3 Channel selection using NSGA-III and LOF for EEG signals
for the resting-state with the eyes open
The optimization process was performed using the 109 subjects in the dataset,
but now considering LOF for creating the models of each subject. NSGA-III was
used for channel selection, using 64 binary genes in a chromosome to
represent the EEG channels and two genes with integer values for the algorithm
(1: ball tree, 2: k-d tree, 3: brute force) and the number of neighbors (from 1 to 10,
which were proposed experimentally) to be used, thus obtaining a chromosome of
66 genes. The experiment was repeated 10 times for validation, each time using
80% of the instances of each subject for training and 20% for testing.
The results of the first run are presented in Fig. 5.10 as an example of the
distribution of the TARs and TRRs during the optimization process, and Table 5.8
presents the average results for both methods of feature extraction, EMD and
DWT.
Using DWT-based features, it was possible to obtain an average TAR of up to
0.993 ± 0.001 and an average TRR of 0.941 ± 0.002 using only three EEG channels
(see Table 5.8). The distribution of the results was very distinct and clear (see
Fig. 5.10), indicating that similar TARs and TRRs can be obtained with different
channel combinations using LOF and EMD- or DWT-based features.
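For readers less familiar with the two metrics, the following sketch shows how a TAR (true acceptance rate) and a TRR (true rejection rate) could be computed for one subject's model. It assumes the usual definitions (accepted genuine instances over all genuine instances, and rejected intruder instances over all intruder instances); the function name and the example decisions are hypothetical and are not taken from the experiments.

```python
from typing import Sequence, Tuple


def tar_trr(is_genuine: Sequence[bool], accepted: Sequence[bool]) -> Tuple[float, float]:
    """Return (TAR, TRR) given ground truth and the model's accept decisions."""
    if len(is_genuine) != len(accepted):
        raise ValueError("both sequences must have the same length")
    genuine_total = sum(1 for g in is_genuine if g)
    intruder_total = len(is_genuine) - genuine_total
    if genuine_total == 0 or intruder_total == 0:
        raise ValueError("need at least one genuine and one intruder instance")
    # Genuine instances that the model accepted.
    true_accepts = sum(1 for g, a in zip(is_genuine, accepted) if g and a)
    # Intruder instances that the model rejected.
    true_rejects = sum(1 for g, a in zip(is_genuine, accepted) if not g and not a)
    return true_accepts / genuine_total, true_rejects / intruder_total


# Hypothetical test set: 4 genuine instances followed by 5 intruder instances.
labels = [True] * 4 + [False] * 5
decisions = [True, True, True, False, False, False, False, False, True]
tar, trr = tar_trr(labels, decisions)
```

In this toy case three of four genuine instances are accepted and four of five intruder instances are rejected, so the two rates are 0.75 and 0.80.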
The average distribution of the parameters used in the complete optimization
process (for all generations and all chromosomes) is presented in Fig. 5.11, showing
that the algorithm most often used by LOF was ball tree with three neighbors
when using EMD-based features. The ball tree and k-d tree algorithms were used
equally, with three neighbors, when DWT-based features were used. Analysis
of only the parameters used for the results in the Pareto-front in the 10-fold
cross-validation (for obtaining the results presented in Table 5.8) confirmed that
the ball tree algorithm with three to four neighbors was the most often used for
EMD-based features and the ball tree and k-d tree algorithms were used with only
two neighbors for DWT-based features, as shown in Fig. 5.12.
Figure 5.9: Set of one to five channels found during the optimization process for creating the biometric system with OCSVM
using EMD-based features (a)) or DWT-based features (b)) and the resting-state with the eyes open.
Figure 5.10: Frontal and aerial view of the TARs and TRRs obtained in the
channel-selection process using EMD-based features (a)) and DWT-based features (b))
with LOF.
Fig. 5.13 presents the set of channels of the 10 runs used to obtain the results
presented in Table 5.8, which correspond to the use of one to seven channels using
EMD-based features (a) in the figure) and DWT-based features (b) in the figure). In
this case, the channels were almost the same using both methods and they did not
differ much when using one or three channels. Another important point is that
channels IZ, T8, and T10 were used in most cases for both EMD- and DWT-based
features. The most relevant area was clearly centered around channels C6, T8, T10,
and F5.
Table 5.8: TARs and TRRs obtained for the first seven EEG channels in the
Pareto-front for three objectives solved with NSGA-III using EMD-based and DWT-based
features and LOF.
                EMD-based features              DWT-based features
No. channels    TAR             TRR             TAR             TRR
1               0.930 ± 0.005   0.904 ± 0.006   0.979 ± 0.001   0.888 ± 0.003
2               0.949 ± 0.002   0.909 ± 0.005   0.991 ± 0.001   0.922 ± 0.002
3               0.960 ± 0.003   0.909 ± 0.005   0.993 ± 0.001   0.941 ± 0.002
4               0.964 ± 0.005   0.918 ± 0.028   0.995 ± 0.011   0.949 ± 0.004
5               0.969 ± 0.008   0.926 ± 0.011   0.996 ± 0.006   0.952 ± 0.004
6               0.980 ± 0.003   0.938 ± 0.011   0.997 ± 0.006   0.957 ± 0.009
7               0.980 ± 0.004   0.940 ± 0.005   0.997 ± 0.001   0.957 ± 0.005
Figure 5.11: Average distribution of the algorithms and number of neighbors used
in the optimization process with EMD-based features (a)) and DWT-based features
(b)).
Figure 5.12: Average distribution of the algorithms and number of neighbors used
for the results in the Pareto-front of the optimization process with EMD-based
features (a)) and DWT-based features (b)).
5.5.4 Channel selection using NSGA-III and LOF for EEG signals
for the resting-state with the eyes closed
Previous experiments using LOF resulted in higher TARs and TRRs with a lower
number of EEG channels than when using OCSVM. The optimization process was
repeated with EEG data from the 109 subjects but considering the resting-state
with the eyes closed to provide additional information about the performance of
LOF with EMD- and DWT-based features.
The chromosome representation was as in the previous experiment: 64 genes
to represent the EEG channels and two additional genes with integer values for the
different algorithms and number of neighbors. Each experiment was performed
10 times, randomly selecting 80% of the instances for training and 20% for testing,
thus ensuring 10-fold cross-validation. The results obtained for runs using either
EMD- or DWT-based features are presented in Fig. 5.14 for visualization and
understanding of the behavior during the optimization process.
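The repeated random 80%/20% split per subject can be sketched as below. This is an illustrative assumption about the mechanics only: the function name `split_per_subject`, the subject labels, the use of a per-run seed, and the toy data (10 instances per subject) are all hypothetical.

```python
import random
from typing import Dict, List, Tuple


def split_per_subject(
    instances: Dict[str, List[int]], train_fraction: float = 0.8, seed: int = 0
) -> Dict[str, Tuple[List[int], List[int]]]:
    """Randomly split each subject's instances into (train, test) lists."""
    rng = random.Random(seed)
    split = {}
    for subject, items in instances.items():
        shuffled = list(items)  # copy so the caller's data is left untouched
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))
        split[subject] = (shuffled[:cut], shuffled[cut:])
    return split


# Hypothetical data: three subjects with 10 instance identifiers each.
data = {f"S{n:03d}": list(range(10)) for n in range(1, 4)}
# One independent random split per run, 10 runs in total.
runs = [split_per_subject(data, seed=run) for run in range(10)]
```

Splitting within each subject, rather than over the pooled instances, keeps the training share the same for every subject's model.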
The average TAR and TRR in the Pareto-front for the first seven channels
using EMD or DWT for feature extraction are presented in Table 5.9. The results
show that subject identification was possible using the resting-state with the eyes
closed. The TAR and TRR were similar to those presented in Table 5.8 for the eyes
open. The results were maintained throughout the 10 runs, especially when using
DWT for feature extraction, as the standard deviation was 0.011 for the TAR and
0.009 for the TRR.
Figure 5.13: Set of one to seven channels found during the optimization process for creating the biometric system with LOF
and EMD-based features (a)) or DWT-based features (b)) for the resting-state with the eyes open.
Figure 5.14: Frontal and aerial view of the TARs and TRRs obtained in the
channel-selection process using EMD- (a)) and DWT-based features (b)) for the resting-state
with the eyes closed, using LOF.
The average distribution of the parameters used during the entire optimization
process is shown in Fig. 5.15. The k-d tree algorithm was the most used in both
cases (using EMD or DWT) and the number of neighbors ranged from one to four,
with a clear advantage of using two neighbors. The average parameters used for
obtaining the results in the Pareto-front are presented in Fig. 5.16, confirming that
the k-d tree algorithm was the most used and the number of neighbors still ranged
from one to four, with preferential use of only two neighbors.
Table 5.9: TARs and TRRs obtained with LOF for the first seven EEG channels in the
Pareto-front for three objectives solved with NSGA-III using EMD- or DWT-based
features and the resting-state with the eyes closed.
                EMD-based features              DWT-based features
No. channels    TAR             TRR             TAR             TRR
1               0.945 ± 0.005   0.888 ± 0.008   0.979 ± 0.001   0.881 ± 0.004
2               0.945 ± 0.005   0.918 ± 0.007   0.995 ± 0.001   0.935 ± 0.005
3               0.955 ± 0.005   0.918 ± 0.007   0.997 ± 0.002   0.950 ± 0.005
4               0.969 ± 0.003   0.926 ± 0.006   0.997 ± 0.002   0.950 ± 0.003
5               0.971 ± 0.002   0.933 ± 0.002   0.997 ± 0.002   0.951 ± 0.003
6               0.975 ± 0.001   0.945 ± 0.002   0.998 ± 0.000   0.953 ± 0.002
7               0.979 ± 0.002   0.955 ± 0.005   0.998 ± 0.000   0.955 ± 0.002
Figure 5.15: Average distribution of the algorithms and number of neighbors used
in the optimization process with EMD-based features (a)) and DWT-based features
(b)) using EEG signals for the resting-state with the eyes closed.
As for the previous experiment using the resting-state with eyes open, Fig.
5.17 presents the set of channels found by the optimization process of the 10 runs
used to create the models for the biometric system using the resting-state with
the eyes closed and EMD-based features (a) in the figure), as well as DWT-based
features (b) in the figure). The results presented in Figs. 5.13 and 5.17 differed
little, even between methods and the sets of different numbers of channels (in the
sets created in the 10 runs with 1 to 7 channels). The most relevant area was still
centered around channels C6, T8, T10, and IZ.
Figure 5.16: Average distribution of the algorithms and number of neighbors used
for the results in the Pareto-front of the optimization process with EMD-based
features (a)) and DWT-based features (b)) using EEG signals for the resting-state
with the eyes closed.
5.6 Discussion
This Chapter presented the application of EEG channel selection for biometric
systems focused on the study and comparison of various task-dependent and
task-independent paradigms, i.e., resting-state and ERPs, using various types of
electrodes and various numbers of channels [173, 206, 222, 223]. The resting-state
has been used in the state-of-the-art for this purpose as it does not require any
training process for the subject. There are several approaches based on
multi-class classification using machine-/deep-learning and one-class classification.
Although most of the approaches can discriminate between the subjects
in a database when using multi-class classification, they do not consider
possible intruders. In the best case, one study presented a set of eight EEG
channels selected beforehand [297]. Another used deep learning with a set of five
EEG channels, also selected beforehand, but they did not use the resting-state
[281].
Figure 5.17: Set of one to seven channels found during the optimization process for creating the biometric system with LOF
using EMD-based features (a)) or DWT-based features (b)) and the resting-state with the eyes closed.
A method for channel selection was presented in Section 5.3 using a two-stage
method tested on a dataset with 26 subjects for detecting intruders and then using
multi-class classification to detect the name of the subject [138]. The stage for
intruder detection was created using OCSVM with nu and gamma parameters
determined by a genetic algorithm that also selected the most relevant channels for
the task. However, OCSVM was very sensitive to the nu and gamma parameters.
Later, a new approach for an EEG-based biometric system was presented using
brain signals recorded during the resting-state with the eyes open and the
resting-state with the eyes closed using LOF and channels selected by NSGA-III. Briefly, a
model using LOF with EMD-/DWT-based features was created for each subject
that was able to reject the other 108 subjects in the dataset, confirming that
the features extracted from each subject can help to discriminate between
the subject in the model and the rest of the subjects, with good results, even
with a low number of EEG channels and using 108 subjects as intruders.
In this new approach, experiments using EEG signals for the resting-state
with the eyes open and 64 EEG channels, with OCSVM and LOF using different
parameters, were conducted. It was shown that a TAR of up to 1.000 ± 0.000 and a
TRR of 0.998 ± 0.001 can be achieved using LOF and the k-d tree algorithm with only
one neighbor, all using DWT-based features. Then, the experiment was repeated
using 1 to 10 neighbors with DWT-based features, LOF, and the k-d tree algorithm,
as they were the best parameters found in the previous experiment and also to
show that a different number of neighbors affects the TAR and TRR.
It was also shown that OCSVM resulted in a TAR of 0.502 ± 0.004 and a TRR
of 0.993 ± 0.001, meaning that the models were unable to learn from any of the
features of the subjects (EMD- or DWT-based). It was thus necessary to fit the best
nu and gamma parameters by using the multi-objective optimization process [138].
This resulted in substantially higher TAR and TRR values (see Fig. 5.8). In the
best case, a TAR of up to 0.822 ± 0.028 and a TRR of 0.969 ± 0.22 using EMD-based
features, and a TAR of 0.822 ± 0.28 and a TRR of 0.981 ± 0.017 using DWT-based
features were obtained. However, the standard deviation was high.
The results presented with LOF when using the resting-state with the eyes
open show that a TAR of up to 0.993 ± 0.001 and a TRR of 0.941 ± 0.002 can be
obtained with only three EEG channels using DWT-based features. With only two
EEG channels and DWT-based features, TAR and TRR values above 0.900 were
obtained, which are higher than the best results obtained in the Pareto-front
using EMD-based features. As
shown in Fig. 5.10, the distribution of the TAR and TRR values was consistent
when reducing the number of EEG channels during the optimization process,
showing that the models created with LOF learned well from the features provided
and that different channel combinations were used to obtain the best results,
as presented in Table 5.8. In this case, the most highly used algorithm for the
complete optimization process was ball tree, with three neighbors. Analysis of
the parameters using DWT-based features and only the results obtained in the
Pareto-front shows that the ball tree and k-d tree algorithms were used to a
highly similar extent, with only two neighbors.
The use of EEG signals from the resting-state with the eyes closed and LOF
confirmed that DWT-based features work better, with a TAR of up to 0.997 ± 0.002
and TRR of up to 0.950 ± 0.005 with only three EEG channels. The k-d tree algorithm
with two to four neighbors was the most used for the complete optimization
process, as well as the results obtained for the Pareto-front.
The use of OCSVM can provide good results if the appropriate parameters are
chosen. Otherwise, the TAR can decrease substantially. This behavior needs to be
further investigated using different feature extraction methods and compared to
the results using different-sized datasets. On the other hand, LOF proved to be
a robust classifier for creating an EEG-based biometric system, especially using
DWT-based features with the ball tree or k-d tree algorithms and two to four
neighbors. Future work will evaluate whether solving the
problems related to EMD (best spline, end effects, mode mixing, etc.) can improve
the results presented in this study.
Comparing the results presented in Figs. 5.9, 5.13, and 5.17, it is evident that the
use of LOF allowed localization of the potentially most relevant area for choosing
a possible set of channels, which will require further investigation in the future.
It is noteworthy that the channel distribution did not substantially vary
whether the eyes were open or closed in the resting state.
The localization of most of the relevant channels, i.e., the channels that were
found in most of the sets, was mainly centered around channels F5, T8, T10, and
IZ, and as shown in Fig. 5.13, it was clearer for the resting-state with the eyes
open. In general, most of the channels are localized in the temporal and frontal
areas, as well as around the inion, which may be associated with the previous task
performed during the data collection. This is an aspect that must be tested using
other datasets [301–303].
One of the purposes of this study was to prove that the resting-state can be used
as a paradigm to create a biometric system in large datasets. A set of experiments
was provided in which high-density EEG data was available for the training and
testing stages, but for real-time implementation of a biometric system, only a
few of the best channels will be selected for designing a new portable headset
tailored for this purpose. With the set of experiments and the methods tested for
classification and optimization, a proof-of-concept for a biometric system based
on the resting-state was provided using a small number of electrodes and a
pool with a large number of subjects (109 subjects), in contrast to previous studies
using smaller datasets.
However, the current results do not show whether or not there is a unique subset
of EEG channels or brain regions that works better for creating a biometric system
using the resting-state. This study lays the groundwork for pursuing further
research into the analysis of various public and private datasets to identify a
unique subset of channels that can be used in the design of a new portable and
easy-to-use EEG headset that can be tested in real-time, adding new subjects to
the system and identifying them using only a few electrodes.
The progress in subject identification using EEG signals from various
paradigms has been remarkable in the last several years, but one of the most
relevant unsolved problems is the fact that the new approaches have all been
tested and validated using EEG datasets recorded in well-controlled environments
[296, 304]. Most of the studies using high-density EEG signals were recorded
with medical-grade sensor systems (using a gel or saline solution for improving
conductivity), which may increase the performance of the methods. However,
ease-of-use will be essential for practical and portable devices and dry electrodes
may offer certain opportunities [304, 305]. In general, analysis and validation in
real-life scenarios is necessary. In this context, the best and fastest methods will
be studied in a more realistic way and the appropriate and necessary number of
trials per subject will be considered [173].
For certain BCI applications, the problem of recognizing new instances from
new sessions has been studied using EEG data from different sessions or adding
new instances for calibration. In the case of session-to-session or subject-to-subject
transfer, the learning problem has been studied using LDA and SVM, based on
motor imagery or P300 paradigms [148, 306–309]. To adapt the EEG feature space
and thus reduce session-to-session variability, a data-space adaptation method
based on the Kullback-Leibler divergence criterion (also called relative entropy)
can be used, aiming to minimize the difference between the distribution of the
training session and that of a different session [307]. There is evidence that for
certain BCIs, it is possible to use background noise immediately before a new
session to reduce session-to-session variability using a regularized
spatio-temporal filter [308].
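As a reminder of the criterion mentioned above, the Kullback-Leibler divergence between two discrete distributions P and Q is the sum over the support of P(i) log(P(i)/Q(i)). The sketch below only computes this quantity for two hypothetical feature histograms; it does not reproduce the data-space adaptation method of [307], and all names and values are illustrative assumptions.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(P || Q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a bin with zero probability under P contributes nothing
        if qi == 0.0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical normalized histograms of one feature in two recording sessions.
train_hist = [0.5, 0.3, 0.2]
new_hist = [0.4, 0.4, 0.2]
same = kl_divergence(train_hist, train_hist)     # identical distributions
shifted = kl_divergence(train_hist, new_hist)    # distribution has drifted
```

The divergence is zero when the two sessions share the same distribution and grows as they drift apart, which is why it can serve as a quantity to minimize when adapting a new session to the training session.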
The dataset used in the second approach consists of EEG signals from a single
session (see Section 5.5), which limits the experimental configurations and does
not allow evaluation of whether one can create models for each subject from a
certain session and be able to recognize the subjects or reject them using data
from another session. Future steps will be focused on tackling this problem by
analyzing possible ways to use new correctly-classified instances to decrease
session-to-session variability, data augmentation techniques, as well as using and
comparing current progress in transfer learning using machine-/deep-learning
methods to address this problem [282, 309].
Another point to be analyzed in future work is to develop new ways to extract
and select the features to improve the TRR and TAR. This can be achieved using
a big bag-of-features from the different sub-bands (possibly from both the
EMD and DWT methods) and by adding additional GA genes to represent
such features in the chromosomes and thus select the best features during
the optimization process, at the same time as selection of the best channels.
In general, the resting-state has been shown to be a good candidate but
there is not yet sufficient research evidence using larger datasets and different
stages. Future efforts will be focused on relevant parameters that can be extracted
from the EEG signals of each subject and thus add information for the complete
authentication and verification process, such as re-evaluating the accepted subject
using multi-class classification, detecting the age-range and sex of the subjects,
etcetera [86].
This research has been focused towards a portable (non-invasive) wireless
low-density EEG system for various applications that can help the subject-identification
process by providing EEG information from different channel combinations using
a movable sensor [57, 173]. Following the results found in this work and the
proposed experiments, the possibility of a fixed or movable electrode version of
a new EEG headset that incorporates the best results obtained in this thesis for
subject identification and authentication will be evaluated.
Chapter 6
Conclusions and future work
In this Chapter, an overview of the achieved results in comparison with the
objectives of the thesis formulated in Section 1.2 is provided and their implications
for future work discussed.
6.1 Summary of findings
6.1.1 Feature extraction and channel count optimization for
epileptic seizure classification
In the first paper related to this thesis [135], the backward-elimination algorithm
was used to reduce the number of necessary EEG channels for epileptic seizure
classification and was the basis for understanding the problem and the necessary
parameters to be optimized for this task. Later, in Chapter 4 and [200], the method
for channel selection was improved using NSGA-II and proved to be robust for
epileptic-seizure classification.
It was shown that SVM was the most highly-used classifier, independently of
whether the features were extracted using the EMD-based or DWT-based method
or whether NSGA-II or NSGA-III were used for channel selection. The presented
results show that KNN was also highly used but only when the features were
extracted using the DWT.
The presented methods show that it is possible to classify between epileptic
seizures and seizure-free instances using only one channel, obtaining accuracy
values of up to 0.97 ± 0.05 using DWT-based features and selecting the channels
using the NSGA-III algorithm. An important finding is that NSGA-III is able
to find the most relevant EEG channels with features based on DWT, selecting
combinations with only two or three channels, obtaining accuracy values of up to
0.98 and 0.99, respectively.
The results discussed in Chapter 4 and, in general, the methods implemented
for channel selection and feature extraction will enable the prediction of epileptic
seizures with low-density EEG headsets for long-term monitoring in daily life,
attaining the advantages related to channel selection described in Section 3.5.
6.1.2 Channel count optimization for EEG-based biometric
systems
This thesis has argued that EEG-based biometric systems are a good candidate for
use in authentication systems [87, 138, 173, 206, 222, 223]. The presented results
have shown that it is possible to identify subjects by their brain signals using the
methods proposed for feature extraction and classification. The most important
aspect is that it is also possible to distinguish subjects who were part of
the trained dataset from those who are intruders.
The rst approach presented consisted of a two-stage method tested in a
dataset with 26 subjects. The rst stage consisted of OCSVM, validating the results
with the TAR and TRR, and the second stage used multi-class classication to
identify the name of the subject. This set of experiments showed that OCSVM is
sensitive to the nu and gamma parameters.
NSGA-II found channel sets of two EEG channels to obtain accuracy values
of up to 0.78, with a TAR of 0.91 and a TRR of 0.88. However, using NSGA-III, it
was possible to find subsets with 7, 9, 10, or 11 EEG channels to obtain accuracy
values of up to 0.99 and both a TAR and TRR of 1.00.
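The trade-off between performance and channel count reported here is what a Pareto-front expresses. The sketch below shows only the non-dominated filtering step for three objectives, assumed here to be maximizing the TAR, maximizing the TRR, and minimizing the number of channels; it is not an implementation of NSGA-II or NSGA-III, which add selection, crossover, mutation, and diversity-preservation mechanisms on top of this idea. The candidate values are hypothetical.

```python
from typing import List, Tuple

Solution = Tuple[float, float, int]  # (TAR, TRR, number of channels)


def dominates(a: Solution, b: Solution) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical candidates evaluated during an optimization run.
candidates = [
    (0.90, 0.85, 1),
    (0.95, 0.90, 2),
    (0.94, 0.89, 3),  # dominated by the two-channel solution
    (0.99, 0.95, 5),
    (0.99, 0.95, 7),  # dominated by the five-channel solution
]
front = pareto_front(candidates)
```

Each surviving solution is the best available for its channel count, which is the form in which the results are tabulated per number of channels in Chapter 5.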
Several facts make it impossible to draw any final conclusions about the
minimum number of necessary EEG channels for a new biometric system based
on ERPs or P300, as the channel subsets differed depending on the number of
instances per subject, the sessions available, and the method used for feature
extraction. The sets of channels also differed depending on whether the NSGA-II
or NSGA-III algorithm was used for channel selection.
When the biometric system was created using the resting-state, LOF for
one-class classification, and the channels selected by NSGA-III, the results were more
robust using EMD or DWT for feature extraction and a low number of EEG
channels, as the models were able to reject 108 subjects.
The results obtained with EEG signals while the subjects had their eyes open
show that it is possible to obtain a TAR of up to 0.993 ± 0.001 and a TRR of
0.941 ± 0.002 using two or three channels with DWT-based features.
From the results presented in Chapter 5, it is possible to argue that LOF proved
to be a robust classifier for creating an EEG-based biometric system, especially
using DWT-based features with the ball tree or k-d tree algorithms and two to
four neighbors.
It is noteworthy that the subsets of channels selected by NSGA-III did not
substantially differ whether the eyes were open or closed during the resting state,
i.e., it is possible to find certain relevant areas, which in this case were centered
around channels F5, T8, T10, and IZ.
It is not currently possible to argue that there is a unique set of channels
that works better for extracting features to create a biometric system using the
resting-state. This will need to be tested in a larger population and the influence
of the main four micro-states during the resting-state verified [89, 90, 92–94].
6.2 Conclusion of the thesis contributions
The work presented in this thesis consisted of a method for decomposing EEG
signals into different sub-bands using EMD or DWT, followed by the extraction of
four features: the Teager and instantaneous energy distributions and the Higuchi
and Petrosian fractal dimensions. With these features, the EEG signal segments
corresponding to the resting-state, P300 response, or epileptic seizures, as well
as seizure-free periods, are successfully represented. Thus, the proposed method
has been presented as a robust method for extracting information from EEG
signals that represents the events of interest in a compact form for creating
a classifier model that can be used for classification in real-time. In this context,
various classifiers were tested, either multi-class classifiers or one-class classifiers,
depending on the case study.
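Three of the four features named above can be sketched in a few lines each. The formulas below are commonly used definitions (log-scaled mean energy, log-scaled mean absolute Teager operator, and the Petrosian estimate based on sign changes of the first difference) and are given as assumptions: the exact normalization used in the thesis is defined in an earlier chapter and may differ. The Higuchi fractal dimension is omitted for brevity, and the test signal is hypothetical. Note that both energy features are undefined for an all-zero segment.

```python
import math
from typing import Sequence


def instantaneous_energy(x: Sequence[float]) -> float:
    """log10 of the mean squared amplitude of the segment."""
    return math.log10(sum(v * v for v in x) / len(x))


def teager_energy(x: Sequence[float]) -> float:
    """log10 of the mean absolute Teager operator x[j]^2 - x[j-1]*x[j+1]."""
    n = len(x)
    acc = sum(abs(x[j] ** 2 - x[j - 1] * x[j + 1]) for j in range(1, n - 1))
    return math.log10(acc / n)


def petrosian_fd(x: Sequence[float]) -> float:
    """Petrosian fractal dimension from sign changes of the first difference."""
    n = len(x)
    diff = [x[i + 1] - x[i] for i in range(n - 1)]
    sign_changes = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * sign_changes)))


# Hypothetical sub-band segment: 5 full cycles of a unit sinusoid in 128 samples.
omega = 2 * math.pi * 5 / 128
signal = [math.sin(omega * t) for t in range(128)]
features = (instantaneous_energy(signal), teager_energy(signal), petrosian_fd(signal))
```

In the pipeline described in the text, features of this kind would be computed on every sub-band produced by the EMD or DWT decomposition of each selected channel and concatenated into one feature vector per instance.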
Tailored experiments were performed using methods for channel reduction
(using the backward-elimination and forward-addition greedy algorithms) and
selection [86, 87, 135, 138, 173, 200, 206, 206, 223]. However, for the experiments
presented in this thesis, the backward-elimination algorithm was only briefly used.
Most of the experiments for channel selection were carried out using NSGA-based
algorithms, especially NSGA-III.
In the first approaches using NSGA, certain important parameters for the
classifiers were optimized by adding genes with only two possible values, 0 or
1. However, the possible values that can be generated by these combinations
are limited. Thus, the parameters to be optimized were later represented using
decimal values. An example is the optimization of the nu and gamma parameters
of OCSVM, in which both genes were defined using decimal values. In
other cases, the range of possible values for the genes was defined as an interval,
for example to select the number of neighbors for the LOF classifier. Thus, the
chromosome representation for the optimization process is reduced and the
interpretation of the results made easier. In addition, the possible values of these
genes better represent the problem.
A method that showed good performance was presented in two different
case studies, thus contributing to the idea that a general method for EEG signal
processing and feature extraction can be proposed. This thesis focused on
case study 1, in which it was shown that the classification of epileptic seizures is
possible, even when using a reduced array of EEG channels, and case study 2, in
which various experiments were presented comparing methods and approaches
for creating a biometric system using EEG signals.
The method for representing the EEG channels, as well as important
parameters for the classifiers, was shown to be robust for selecting the most
important source of information in the classification process. With these results,
it appears to be possible to work with a small array of non-invasive EEG sensors
for different classification problems using brain signals. This is important, as
this could contribute to a reduction in the current size of EEG headsets and
caps for portability, thus increasing the classification performance by using only
the important information related to the task and widening the spectrum of
applications using brain signals.
The results presented and the ideas discussed support the objective of channel
selection presented in Section 3.5. Importantly, they will also help to reduce
the preparation time for using an EEG headset and help to achieve a low-power
hardware design.
Some of the proposed work has already been carried out on different EEG signal
classification tasks. For example, a similar process was used in Master’s degree
theses [310–312], and the same process was used for feature extraction and
classification of the response to RGB color exposure [313–315]. The process for
channel selection using NSGA-II was also used for source localization, reducing the
number of EEG channels from 231 to fewer than 10, while obtaining similar
localization errors [316]. This shows that the method can be adapted to different
problems with the same objective of reducing the number of necessary EEG channels
for diverse BCI applications.
6.3 Future work
For the first case study, the multi-class classifier used was selected by first testing
all the classifiers and iterating over a set of parameters, e.g.,
SVM was tested with the linear, RBF, sigmoid, and polynomial kernels. However, all
possible parameters for the classifiers will be represented in the same chromosome
representation in future work, as for the channels. Thus, a set of the best
parameters for epileptic-seizure classification will be ensured, as for the case
of EEG-based biometric systems.
As discussed in Chapter 5, the EEG-based biometric system can be modified
to include more stages, in which, for example, the age of the subject, their sex,
stress level, and other important descriptors can be identified [86]. By doing this,
intruder detection will be easier to handle and the biometric system more robust
to manage a larger number of subjects in the database.
Future studies will therefore be focused on: 1) improving the proposal for
the biometric system and validating it using a larger dataset with EEG signals
from different sessions on the same day and 2) using larger datasets from different
days. 3) The proposed biometric system must manage the problem of reducing
the number of channels for real-time use, as well as for portability and comfort.
However, it must be able to train a model for recognizing the subjects with just
a few instances, as in fingerprint and face-recognition systems. In this context,
another important problem that must be tackled, which is also important for most
BCI applications, is related to data augmentation. Collecting a few EEG instances
and then creating artificial instances with information from the collected signal
will increase the feasibility of the biometric system. Thus, this proposal will be
more competitive with current biometric systems.
Data augmentation methods will be proposed in an attempt to solve this
problem and will also help in the transfer learning problem related to epileptic
seizure classification. 4) In the case of epilepsy, the machine-learning models must
be able to recognize the seizures of new subjects in the database without adding
any seizure data, but by first testing whether performance is improved by adding
instances from the new subject to be analyzed, as well as by adding new artificial
instances for increasing the performance of the models.
The dataset used in the second approach of case study 2 consists of EEG
signals from a single session (see Section 5.5), which limits the experimental
configurations and does not allow evaluation of whether one can create models
for each subject from a certain session and be able to recognize the subjects or
reject them using data from another session.
Future steps will be focused on tackling this problem and analyzing a
possible way to use new correctly-classified instances to decrease
session-to-session variability, data augmentation techniques, and comparing current progress
in transfer learning, using machine-/deep-learning methods for this problem
[282, 309].
The use of deep-learning techniques for real-time applications in EEG is still a
challenge, due to the normally high computational cost. However, an interesting
future study concerns the use of auto-encoders for one-class classification,
comparing their performance to that of LOF and OCSVM [317].
The use of ever-larger datasets (i.e., a larger number of subjects) is still
necessary, with EEG data from different sessions and of different lengths, as
well as with fewer training instances, both for studying epileptic-seizure
classification and for creating a biometric system. Additionally, it will be evaluated
whether solving the problems related to EMD (best spline, end effects, mode mixing, etc.) or using
different EMD-based algorithms, such as multivariate EMD (MEMD) [318] or
Adaptive EMD (AEMD) [319], can improve the results presented in both study
cases.
As mentioned in Section 3.5, various approaches for channel selection
in motor imagery classification have been proposed, but these techniques
have not been compared with one another to identify a set of EEG channels
[172, 174, 176, 179, 188, 196, 198, 199]. Therefore, future efforts will also focus on
testing the various approaches for the classification of motor imagery and the
selection of channels to compare them with the methods proposed in this thesis.
The energy and fractal features extracted from the sub-bands obtained after
applying DWT or EMD were shown to be useful and robust across experimental
setups and for both study cases. However, as mentioned in the discussion of
Chapter 5, future work will include selecting the best subset of features
within the optimization process (for example, from a large bag of features).
This would make it possible to verify whether this set is still the best
for these and new EEG-based applications and whether there are new features
capable of extracting useful patterns from EEG signals.
Future eorts will also be focused on feature selection by using NSGA-
III or recent proposals in multi-objective optimization, such as multi-objective
evolutionary algorithms based on decomposition (MOEA/D) [
320
]. These could
be used to select the best levels of decomposition from DWT or the best IMFs
from EMD by selecting the best subsets of features while reducing the number
of required EEG channels, which could be for epileptic-seizure classication and
prediction, improving the biometric system, or for a dierent task associated with
EEG signal analysis.
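The trade-off between performance and the number of required channels can be illustrated with a minimal non-dominated filter, which is the basic comparison underlying NSGA-type algorithms. This is only a sketch: the candidate solutions and their accuracy values are made up, and the full NSGA-III or MOEA/D machinery (reference points, decomposition, crossover, mutation) is omitted.

```python
def dominates(a, b):
    """True if solution a dominates solution b.

    Each solution is (accuracy, n_channels): accuracy is maximized,
    the number of channels is minimized.
    """
    acc_a, ch_a = a
    acc_b, ch_b = b
    no_worse = acc_a >= acc_b and ch_a <= ch_b
    strictly_better = acc_a > acc_b or ch_a < ch_b
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical (accuracy, n_channels) pairs, for illustration only.
candidates = [(0.95, 8), (0.93, 3), (0.90, 3), (0.97, 16), (0.94, 10)]
front = pareto_front(candidates)
print(front)
```

Adding feature-subset or decomposition-level genes would enlarge the search space but leave this comparison unchanged, since it only looks at the objective values.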
Towards nding a unique set of channels for EEG signal processing, it will be
necessary to test whether it is possible to force NSGA-based (especially NSGA-III)
or MOEA/D-based algorithms to select a single array of EEG channels by running
dierent folds in parallel while using the same chromosome for selecting the
channels and the necessary parameters for one-class or multi-class classication.
Future studies will focus on all these relevant aspects, involving the
optimization of multiple parameters related to feature extraction and machine-
learning methods by using discrete values for representing the chromosomes, as
carried out in the second approach of biometric systems presented in Section 5.5,
and not only as a binary sequence.
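To make the chromosome representation concrete, the sketch below decodes a chromosome that mixes binary genes (one per EEG channel) with integer genes that index into lists of candidate parameter values. The channel names, the parameter names, and their candidate values are hypothetical placeholders and are not the configuration used in Section 5.5.

```python
# Hypothetical search space, for illustration only.
CHANNELS = ["Fp1", "Fp2", "C3", "C4", "O1", "O2"]
PARAM_CHOICES = {
    "decomposition_levels": [2, 3, 4, 5],
    "n_neighbors": [5, 10, 20, 40],
}


def decode(chromosome):
    """Split a chromosome into selected channels and parameter values.

    The first len(CHANNELS) genes are binary (1 = channel used, 0 = not
    used); each remaining gene is an integer index into one parameter's
    list of candidate values.
    """
    n_channels = len(CHANNELS)
    expected = n_channels + len(PARAM_CHOICES)
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} genes, got {len(chromosome)}")

    mask = chromosome[:n_channels]
    if any(gene not in (0, 1) for gene in mask):
        raise ValueError("channel genes must be 0 or 1")
    selected = [name for name, gene in zip(CHANNELS, mask) if gene == 1]

    params = {}
    for (name, choices), gene in zip(PARAM_CHOICES.items(), chromosome[n_channels:]):
        if not 0 <= gene < len(choices):
            raise ValueError(f"gene {gene} out of range for {name}")
        params[name] = choices[gene]
    return selected, params


channels, params = decode([1, 0, 0, 1, 0, 1, 2, 3])
print(channels, params)
```

Encoding parameters as indices keeps every gene a small integer, so the same crossover and mutation operators can act on the whole chromosome, provided mutation respects each gene's valid range.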
References
[1]
Elena Ratti, Shani Waninger, Chris Berka, Giulio Ruffini, and Ajay Verma. Comparison of
medical and consumer wireless EEG systems for use in clinical trials. Frontiers in human
neuroscience, 11:398, 2017.
[2]
Herbert Jasper. Report of the committee on methods of clinical examination in
electroencephalography. Electroencephalogr Clin Neurophysiol, 10:370–375, 1958.
[3]
Robert Oostenveld and Peter Praamstra. The five percent electrode system for high-resolution
EEG and ERP measurements. Clinical neurophysiology, 112(4):713–719, 2001.
[4]
American Electroencephalographic Society. Guideline thirteen: Guidelines for standard
electrode position nomenclature. Journal of Clinical Neurophysiology, 11(1):111–3, 1994.
[5]
Marc R Nuwer, Giancarlo Comi, Ronald Emerson, Anders Fuglsang-Frederiksen, Jean-Michel
Guérit, Hermann Hinrichs, Akio Ikeda, Fransisco Jose C Luccas, and Peter Rappelsburger.
IFCN standards for digital recording of clinical EEG. Electroencephalography and clinical
Neurophysiology, 106(3):259–261, 1998.
[6]
Jerey M Rogers, Stuart J Johnstone, Anna Aminov, James Donnelly, and Peter H Wilson.
Test-retest reliability of a single-channel, wireless EEG system. International Journal of
Psychophysiology, 106:87–96, 2016.
[7]
Silvia Erika Kober and Christa Neuper. Sex differences in human EEG theta oscillations during
spatial navigation in virtual reality. International Journal of Psychophysiology, 79(3):347–355,
2011.
[8]
Yuji Wada, Yuko Takizawa, Jiang Zheng-Yan, and Nariyoshi Yamaguchi. Gender differences
in quantitative EEG at rest and during photic stimulation in normal young adults. Clinical
Electroencephalography, 25(2):81–85, 1994.
[9]
Nsreen Alahmadi, Sergey A Evdokimov, Yury Juri Kropotov, Andreas M Müller, and Lutz
Jäncke. Dierent resting state EEG features in children from Switzerland and Saudi Arabia.
Frontiers in human neuroscience, 10:559, 2016.
[10]
Jeannette McGlone. Sex dierences in human brain asymmetry: A critical survey. Behavioral
and brain sciences, 3(2):215–227, 1980.
[11]
Rytis Maskeliunas, Robertas Damasevicius, Ignas Martisius, and Mindaugas Vasiljevas.
Consumer-grade EEG devices: are they usable for control tasks? PeerJ, 4:e1746, 2016.
[12]
Richard Caton. Electrical currents of the brain. The Journal of Nervous and Mental Disease,
2(4):610, 1875.
[13]
Lindsay F Haas. Hans Berger (1873-1941), Richard Caton (1842-1926), and
electroencephalography. Journal of Neurology, Neurosurgery & Psychiatry, 74(1):9–9, 2003.
[14]
Anton Coenen and Oksana Zayachkivska. Adolf Beck: A pioneer in electroencephalography
in between Richard Caton and Hans Berger. Advances in cognitive psychology, 9(4):216, 2013.
[15]
Anton Coenen, Edward Fine, and Oksana Zayachkivska. Adolf Beck: a forgotten pioneer in
electroencephalography. Journal of the History of the Neurosciences, 23(3):276–286, 2014.
[16]
Hans Berger. Über das elektroenkephalogramm des menschen. Archiv für psychiatrie und
nervenkrankheiten, 87(1):527–570, 1929.
[17]
Christoph M Michel and Micah M Murray. Towards the utilization of EEG as a brain imaging
tool. Neuroimage, 61(2):371–385, 2012.
[18]
Jerey W Britton, Lauren C Frey, Jennifer L Hopp, Pearce Korb, Mohamad Z Koubeissi,
William E Lievens, Elia M Pestana-Knight, and EK Louis St. Electroencephalography (EEG):
An introductory text and atlas of normal and abnormal ndings in adults, children, and infants.
American Epilepsy Society, Chicago, 2016.
[19]
Fabian Pedregosa-Izquierdo. Feature extraction and supervised learning on fMRI: from practice
to theory. PhD thesis, Université Pierre et Marie Curie, 2015.
[20] Arthur W Toga. Brain mapping: An encyclopedic reference. Academic Press, 2015.
[21]
John William Carey Medithe and Usha Rani Nelakuditi. Study of normal and abnormal
EEG. In 2016 3rd International conference on advanced computing and communication systems
(ICACCS), volume 1, pages 1–4. IEEE, 2016.
[22]
Maria Emilia Cosenza Andraus and Soniza Vieira Alves-Leon. Non-epileptiform EEG
abnormalities: an overview. Arquivos de Neuro-Psiquiatria, 69(5):829–835, 2011.
[23]
Claudio Babiloni, Robert J Barry, Erol Başar, Katarzyna J Blinowska, Andrzej Cichocki,
Wilhelmus HIM Drinkenburg, Wolfgang Klimesch, Robert T Knight, Fernando Lopes da Silva,
Paul Nunez, et al. International Federation of Clinical Neurophysiology (IFCN)–EEG research
workgroup: Recommendations on frequency and topographic analysis of resting state EEG
rhythms. Part 1: Applications in clinical research studies. Clinical Neurophysiology, 131(1):285–
307, 2020.
[24]
Catherine Tallon-Baudry. Oscillatory synchrony and human visual cognition. Journal of
Physiology-Paris, 97(2-3):355–363, 2003.
[25]
Lawrence M Ward. Synchronous neural oscillations and cognitive processes. Trends in
cognitive sciences, 7(12):553–559, 2003.
[26]
Derk-Jan Dijk, Daniel P Brunner, Domien GM Beersma, and Alexander A Borbély.
Electroencephalogram power density and slow wave sleep as a function of prior waking and
circadian phase. Sleep, 13(5):430–440, 1990.
[27]
Jean Reiher, Michel Beaudry, and Charles P Leduc. Temporal intermittent rhythmic delta
activity (TIRDA) in the diagnosis of complex partial epilepsy: sensitivity, specificity and
predictive value. Canadian journal of neurological sciences, 16(4):398–401, 1989.
[28]
Chetan S Nayak and Arayamparambil C Anilkumar. EEG normal waveforms. StatPearls
[Internet], 2020.
[29]
José Luis Cantero and Mercedes Atienza. Alpha burst activity during human REM sleep:
descriptive study and functional hypotheses. Clinical neurophysiology, 111(5):909–915, 2000.
[30]
Jose L Cantero, Mercedes Atienza, and Rosa M Salas. Human alpha oscillations in wakefulness,
drowsiness period, and REM sleep: different electroencephalographic phenomena within the
alpha band. Neurophysiologie Clinique/Clinical Neurophysiology, 32(1):54–71, 2002.
[31]
Paul Gerrard and Robert Malcolm. Mechanisms of modafinil: a review of current research.
Neuropsychiatric disease and treatment, 3(3):349, 2007.
[32]
Robert B Aird and Y Gastaut. Occipital and posterior electroencephalographic rhythms.
Electroencephalography and clinical neurophysiology, 11(4):637–656, 1959.
[33]
Martica Hall, Julian F Thayer, Anne Germain, Douglas Moul, Raymond Vasko, Matthew
Puhl, Jean Miewald, and Daniel J Buysse. Psychological stress is associated with heightened
physiological arousal during NREM sleep in primary insomnia. Behavioral sleep medicine,
5(3):178–193, 2007.
[34]
Gert Pfurtscheller and FH Lopes Da Silva. Event-related EEG/MEG synchronization and
desynchronization: basic principles. Clinical neurophysiology, 110(11):1842–1857, 1999.
[35]
Greg Worrell and Jean Gotman. High-frequency oscillations and other electrophysiological
biomarkers of epilepsy: clinical studies. Biomarkers in medicine, 5(5):557–566, 2011.
[36]
Nicole Ille, Patrick Berg, and Michael Scherg. Artifact correction of the ongoing EEG
using spatial lters based on artifact and brain signal topographies. Journal of clinical
neurophysiology, 19(2):113–124, 2002.
[37]
Peter Anderer, Stephen Roberts, Alois Schlögl, Georg Gruber, Gerhard Klösch, Werner
Herrmann, Peter Rappelsberger, Oliver Filz, Manel J Barbanoj, Georg Dorffner, et al. Artifact
processing in computerized analysis of sleep EEG–a review. Neuropsychobiology, 40(3):150–
157, 1999.
[38]
Jose Antonio Urigüen and Begoña Garcia-Zapirain. EEG artifact removal—state-of-the-art
and guidelines. Journal of neural engineering, 12(3):031001, 2015.
[39]
William O Tatum, Barbara A Dworetzky, and Donald L Schomer. Artifact and recording
concepts in EEG. Journal of clinical neurophysiology, 28(3):252–263, 2011.
[40]
Mehrdad Fatourechi, Ali Bashashati, Rabab K Ward, and Gary E Birch. EMG and EOG artifacts
in brain computer interface systems: A survey. Clinical neurophysiology, 118(3):480–494,
2007.
[41]
Franklin F Oner. The EEG as potential mapping: the value of the average monopolar
reference. Electroencephalography and clinical neurophysiology, 2(2):213, 1950.
[42]
Pablo F Diez, Vicente Mut, Eric Laciar, and Enrique Avila. A comparison of monopolar and
bipolar EEG recordings for SSVEP detection. In 2010 Annual International Conference of the
IEEE Engineering in Medicine and Biology, pages 5803–5806. IEEE, 2010.
[43]
Marc Saab. Basic concepts of surface electroencephalography and signal processing as applied
to the practice of biofeedback. Biofeedback, 36(4):128, 2008.
[44]
Christoph M Michel and Denis Brunet. EEG source imaging: a practical review of the analysis
steps. Frontiers in neurology, 10:325, 2019.
[45]
Uros Topalovic, Zahra M Aghajan, Diane Villaroman, Sonja Hiller, Leonardo Christov-Moore,
Tyler J Wishard, Matthias Stangl, Nicholas R Hasulak, Cory Inman, Tony A Fields, et al.
Wireless Programmable Recording and Stimulation of Deep Brain Activity in Freely Moving
Humans. bioRxiv, 2020.
[46]
GE Chatrian, E Lettich, and PL Nelson. Ten percent electrode system for topographic studies
of spontaneous and evoked EEG activities. American Journal of EEG technology, 25(2):83–92,
1985.
[47]
Catherine J Chu. High density EEG—What do we have to lose? Clinical neurophysiology:
ocial journal of the International Federation of Clinical Neurophysiology, 126(3):433, 2015.
[48]
I Pisarenco, M Caporro, C Prosperetti, and M Manconi. High-density electroencephalography
as an innovative tool to explore sleep physiology and sleep related disorders. International
Journal of Psychophysiology, 92(1):8–15, 2014.
[49]
Amanda K Robinson, Praveen Venkatesh, Matthew J Boring, Michael J Tarr, Pulkit Grover,
and Marlene Behrmann. Very high density EEG elucidates spatiotemporal aspects of early
visual processing. Scientic reports, 7(1):1–11, 2017.
[50]
Anders Bach Justesen, Mette Thrane Foged, Martin Fabricius, Christian Skaarup, Nizar
Hamrouni, Terje Martens, Olaf B Paulson, Lars H Pinborg, and Sándor Beniczky. Diagnostic
yield of high-density versus low-density EEG: The effect of spatial sampling, timing and
duration of recording. Clinical Neurophysiology, 130(11):2060–2064, 2019.
[51]
Andres Soler, Pablo A Muñoz-Gutiérrez, Maximiliano Bueno-López, Eduardo Giraldo, and
Marta Molinas. Low-Density EEG for Neural Activity Reconstruction Using Multivariate
Empirical Mode Decomposition. Frontiers in Neuroscience, 14, 2020.
[52]
Phattarapong Sawangjai, Supanida Hompoonsup, Pitshaporn Leelaarporn, Supavit
Kongwudhikunakorn, and Theerawit Wilaiprasitporn. Consumer grade eeg measuring
sensors as research tools: A review. IEEE Sensors Journal, 20(8):3996–4024, 2019.
[53]
John LaRocco, Minh Dong Le, and Dong-Guk Paeng. A systemic review of available low-cost
EEG headsets used for drowsiness detection. Frontiers in neuroinformatics, 14, 2020.
[54]
Nikolas Williams, Genevieve M McArthur, and Nicholas A Badcock. 10 years of epoc: A
scoping review of emotiv’s portable eeg device. BioRxiv, 2020.
[55]
Jérémy Frey. Comparison of an open-hardware electroencephalography amplifier with
medical grade device in brain-computer interface applications. arXiv preprint arXiv:1606.02438,
2016.
[56]
Marta Molinas, Audrey Van der Meer, Nils Kristian Skjærvold, and Lars Lundheim. David
versus Goliath: single-channel EEG unravels its power through adaptive signal analysis-
FlexEEG. Research project, 2018.
[57]
Luis Alfredo Moctezuma, Andres Felipe Soler Guevara, Erwin Habibzadeh Tonekabony Shad,
Alejandro Antonio Torres-Garcia, and Marta Molinas. David versus Goliath: Low-density
EEG unravels its power through adaptive signal analysis – FlexEEG. In 4th HBP Student
Conference On Interdisciplinary Brain Research, 2020.
[58]
Marta Molinas, Trond Ytterdal, Audrey Van der Meer, and Luis Romundstad. FlexEEG: EEG
scanning for highly portable, real-time functional brain mapping. Research project, 2018.
[59]
Lloyd M Nirenberg, John Hanley, and Edwin B Stear. A new approach to prosthetic control:
EEG motor signal tracking with an adaptively designed phase-locked loop. IEEE Transactions
on Biomedical Engineering, BME-18(6):389–398, 1971.
[60]
Jonathan R Wolpaw, Niels Birbaumer, Dennis J McFarland, Gert Pfurtscheller, and Theresa M
Vaughan. Brain–computer interfaces for communication and control. Clinical neurophysiology,
113(6):767–791, 2002.
[61]
Fabien Lotte, Marco Congedo, Anatole Lécuyer, Fabrice Lamarche, and Bruno Arnaldi. A
review of classication algorithms for EEG-based brain–computer interfaces. Journal of
neural engineering, 4(2):R1, 2007.
[62]
Jonathan R Wolpaw and Dennis J McFarland. Control of a two-dimensional movement signal
by a noninvasive brain-computer interface in humans. Proceedings of the national academy of
sciences, 101(51):17849–17854, 2004.
[63]
Jonathan R Wolpaw. Brain–computer interfaces as new brain output pathways. The Journal
of physiology, 579(3):613–619, 2007.
[64]
Jose M Carmena, Mikhail A Lebedev, Roy E Crist, Joseph E O’Doherty, David M Santucci,
Dragan F Dimitrov, Parag G Patil, Craig S Henriquez, and Miguel AL Nicolelis. Learning to
control a brain–machine interface for reaching and grasping by primates. PLoS biol, 1(2):e42,
2003.
[65]
Dawn M Taylor, Stephen I Helms Tillery, and Andrew B Schwartz. Direct cortical control of
3D neuroprosthetic devices. Science, 296(5574):1829–1832, 2002.
[66]
Mijail D Serruya, Nicholas G Hatsopoulos, Liam Paninski, Matthew R Fellows, and John P
Donoghue. Instant neural control of a movement signal. Nature, 416(6877):141–142, 2002.
[67]
B Wodlinger, JE Downey, EC Tyler-Kabara, AB Schwartz, ML Boninger, and JL Collinger.
Ten-dimensional anthropomorphic arm control in a human brain-machine interface: difficulties,
solutions, and limitations. Journal of neural engineering, 12(1):016011, 2014.
[68]
Aya Rezeika, Mihaly Benda, Piotr Stawicki, Felix Gembler, Abdul Saboor, and Ivan Volosyak.
Brain–computer interface spellers: A review. Brain sciences, 8(4):57, 2018.
[69]
Reza Abiri, Soheil Borhani, Eric W Sellers, Yang Jiang, and Xiaopeng Zhao. A comprehensive
review of EEG-based brain–computer interface paradigms. Journal of neural engineering,
16(1):011001, 2019.
[70]
Monica Fabiani, Gabriele Gratton, Demetrios Karis, Emanuel Donchin, et al. Definition,
identification, and reliability of measurement of the P300 component of the event-related
brain potential. Advances in psychophysiology, 2(S 1):78, 1987.
[71]
John Polich. Updating P300: an integrative theory of P3a and P3b. Clinical neurophysiology,
118(10):2128–2148, 2007.
[72]
Pietro Cipresso, Laura Carelli, Federica Solca, Daniela Meazzi, Paolo Meriggi, Barbara Poletti,
Dorothée Lulé, Albert C Ludolph, Vincenzo Silani, and Giuseppe Riva. The use of P300-based
BCIs in amyotrophic lateral sclerosis: from augmentative and alternative communication to
cognitive assessment. Brain and behavior, 2(4):479–498, 2012.
[73]
Lawrence Ashley Farwell and Emanuel Donchin. Talking off the top of your head: toward a
mental prosthesis utilizing event-related brain potentials. Electroencephalography and clinical
Neurophysiology, 70(6):510–523, 1988.
[74]
Theresa M Vaughan, Jonathan R Wolpaw, and Emanuel Donchin. EEG-based communication:
prospects and problems. IEEE transactions on rehabilitation engineering, 4(4):425–430, 1996.
[75]
Reza Fazel-Rezai, Brendan Z Allison, Christoph Guger, Eric W Sellers, Sonja C Kleih, and
Andrea Kübler. P300 brain computer interface: current challenges and emerging trends.
Frontiers in neuroengineering, 5:14, 2012.
[76]
Lynn M McCane, Eric W Sellers, Dennis J McFarland, Joseph N Mak, C Steve Carmack,
Debra Zeitlin, Jonathan R Wolpaw, and Theresa M Vaughan. Brain-computer interface (BCI)
evaluation in people with amyotrophic lateral sclerosis. Amyotrophic lateral sclerosis and
frontotemporal degeneration, 15(3-4):207–215, 2014.
[77]
Jinhu Xiong, Liangsuo Ma, Binquan Wang, Shalini Narayana, Eugene P Duff, Gary F Egan,
and Peter T Fox. Long-term motor training induced changes in regional cerebral blood flow
in both task and resting states. Neuroimage, 45(1):75–82, 2009.
[78]
EUGENE V Golanov, SEIJI Yamamoto, and DONALD J Reis. Spontaneous waves of cerebral
blood ow associated with a pattern of electrocortical activity. American Journal of Physiology-
Regulatory, Integrative and Comparative Physiology, 266(1):R204–R214, 1994.
[79]
Dante Mantini, Mauro G Perrucci, Cosimo Del Gratta, Gian L Romani, and Maurizio Corbetta.
Electrophysiological signatures of resting state networks in the human brain. Proceedings of
the National Academy of Sciences, 104(32):13170–13175, 2007.
[80]
CJ Stam, T Montez, BF Jones, SARB Rombouts, Y Van Der Made, YAL Pijnenburg, and
Ph Scheltens. Disturbed uctuations of resting state EEG synchronization in Alzheimer’s
disease. Clinical neurophysiology, 116(3):708–715, 2005.
[81]
Peter Putman. Resting state EEG delta–beta coherence in relation to anxiety, behavioral
inhibition, and selective attentional processing of threatening stimuli. International journal
of psychophysiology, 80(1):63–68, 2011.
[82]
Jun Wang, Jamie Barstein, Lauren E Ethridge, Matthew W Mosconi, Yukari Takarae, and
John A Sweeney. Resting state EEG abnormalities in autism spectrum disorders. Journal of
neurodevelopmental disorders, 5(1):24, 2013.
[83]
Lin Gao, Wei Cheng, Jinhua Zhang, and Jue Wang. EEG classification for motor imagery
and resting state in BCI applications using multi-class Adaboost extreme learning machine.
Review of Scientific Instruments, 87(8):085110, 2016.
[84]
Rui Zhang, Dezhong Yao, Pedro A Valdés-Sosa, Fali Li, Peiyang Li, Tao Zhang, Teng Ma,
Yongjie Li, and Peng Xu. Ecient resting-state EEG network facilitates motor imagery
performance. Journal of neural engineering, 12(6):066024, 2015.
[85]
Yang Di, Xingwei An, Feng He, Shuang Liu, Yufeng Ke, and Dong Ming. Robustness Analysis
of Identication Using Resting-State EEG Signals. IEEE Access, 7:42113–42122, 2019.
[86]
Luis Alfredo Moctezuma and Marta Molinas. Sex dierences observed in a study of EEG of
linguistic activity and resting-state: Exploring optimal EEG channel congurations. In 2019
7th International Winter Conference on Brain-Computer Interface (BCI), pages 1–6. IEEE, 2019.
[87]
Luis Alfredo Moctezuma and Marta Molinas. Towards a minimal EEG channel array for a
biometric system using resting-state and a genetic algorithm for channel selection. Scientific
Reports, 10(1):1–14, 2020.
[88]
Ernst Niedermeyer and FH Lopes da Silva. Electroencephalography: basic principles, clinical
applications, and related elds. Lippincott Williams & Wilkins, 2005.
[89]
Dr Lehmann, H Ozaki, and I Pal. EEG alpha map series: brain micro-states by space-oriented
adaptive segmentation. Electroencephalography and clinical neurophysiology, 67(3):271–288,
1987.
[90]
Arjun Khanna, Alvaro Pascual-Leone, Christoph M Michel, and Faranak Farzan. Microstates
in resting-state EEG: current status and future directions. Neuroscience & Biobehavioral
Reviews, 49:105–113, 2015.
[91]
Michael D Greicius, Ben Krasnow, Allan L Reiss, and Vinod Menon. Functional connectivity
in the resting brain: a network analysis of the default mode hypothesis. Proceedings of the
National Academy of Sciences, 100(1):253–258, 2003.
[92]
Thomas Koenig, Leslie Prichep, Dietrich Lehmann, Pedro Valdes Sosa, Elisabeth Braeker,
Horst Kleinlogel, Robert Isenhart, and E Roy John. Millisecond by millisecond, year by year:
normative EEG microstates and developmental stages. Neuroimage, 16(1):41–48, 2002.
[93]
Dietrich Lehmann, Roberto D Pascual-Marqui, and Christoph Michel. EEG microstates.
Scholarpedia, 4(3):7632, 2009.
[94]
Anna Custo, Dimitri Van De Ville, William M Wells, Miralena I Tomescu, Denis Brunet, and
Christoph M Michel. Electroencephalographic resting-state networks: source localization of
microstates. Brain connectivity, 7(10):671–682, 2017.
[95]
Christoph M Michel and Thomas Koenig. EEG microstates as a tool for studying the temporal
dynamics of whole-brain neuronal networks: A review. Neuroimage, 180:577–593, 2018.
[96]
Saam Iranmanesh and Esther Rodriguez-Villegas. A 950 nW analog-based data reduction chip
for wearable EEG systems in epilepsy. IEEE Journal of Solid-State Circuits, 52(9):2362–2373,
2017.
[97]
M Rajya Lakshmi, TV Prasad, and Dr V Chandra Prakash. Survey on EEG signal processing
methods. International Journal of Advanced Research in Computer Science and Software
Engineering, 4(1), 2014.
[98]
Mamunur Rashid, Norizam Sulaiman, Anwar PP Abdul Majeed, Rabiu Muazu Musa,
Ahmad Fakhri Ab Nasir, Bifta Sama Bari, and Sabira Khatun. Current Status, Challenges,
and Possible Solutions of EEG-Based Brain-Computer Interface: A Comprehensive Review.
Frontiers in Neurorobotics, 2020.
[99]
Jesus Minguillon, M Angel Lopez-Gordo, and Francisco Pelayo. Trends in EEG-BCI for daily-
life: Requirements for artifact removal. Biomedical Signal Processing and Control, 31:407–418,
2017.
[100]
Stefan Debener, Cornelia Kranczioch, and Maarten De Vos. Electroencephalography: Current
Trends and Future Directions. In Neuroeconomics, pages 359–373. Springer, 2016.
[101]
Mamunur Rashid, Norizam Sulaiman, Mahfuzah Mustafa, Sabira Khatun, Bifta Sama Bari,
and Md Jahid Hasan. Recent Trends and Open Challenges in EEG Based Brain-Computer
Interface Systems. In InECCE2019, pages 367–378. Springer, 2020.
[102]
David Looney, Preben Kidmose, Cheolsoo Park, Michael Ungstrup, Mike Lind Rank, Karin
Rosenkranz, and Danilo P Mandic. The in-the-ear recording concept: User-centered and
wearable brain monitoring. IEEE pulse, 3(6):32–42, 2012.
[103]
Martin G Bleichner and Stefan Debener. Concealed, unobtrusive ear-centered EEG acquisition:
cEEGrids for transparent EEG. Frontiers in human neuroscience, 11:163, 2017.
[104]
Alexander J Casson, Shelagh Smith, John S Duncan, and Esther Rodriguez-Villegas. Wearable
EEG: what is it, why is it needed and what does it entail? In 2008 30th Annual International
Conference of the IEEE Engineering in Medicine and Biology Society, pages 5867–5870. IEEE,
2008.
[105]
Alexander J Casson, David C Yates, Shelagh JM Smith, John S Duncan, and Esther Rodriguez-
Villegas. Wearable electroencephalography. IEEE engineering in medicine and biology
magazine, 29(3):44–56, 2010.
[106]
Michal Teplan et al. Fundamentals of EEG measurement. Measurement science review,
2(2):1–11, 2002.
[107]
Rodney J Croft and Robert J Barry. Removal of ocular artifact from the EEG: a review.
Neurophysiologie Clinique/Clinical Neurophysiology, 30(1):5–19, 2000.
[108]
Chi Qin Lai, Haidi Ibrahim, Mohd Zaid Abdullah, Jafri Malin Abdullah, Shahrel Azmin Suandi,
and Azlinda Azman. Artifacts and noise removal for electroencephalogram (EEG): A literature
review. In 2018 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE),
pages 326–332. IEEE, 2018.
[109]
Xiao Jiang, Gui-Bin Bian, and Zean Tian. Removal of artifacts from EEG signals: a review.
Sensors, 19(5):987, 2019.
[110]
Jun Lu, Dennis J McFarland, and Jonathan R Wolpaw. Adaptive Laplacian filtering for
sensorimotor rhythm-based brain–computer interfaces. Journal of neural engineering,
10(1):016002, 2012.
[111]
Kai Keng Ang, Juanhong Yu, and Cuntai Guan. Extracting effective features from high density
nirs-based BCI for assessing numerical cognition. In 2012 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), pages 2233–2236. IEEE, 2012.
[112]
Syahrull Hi Fi Syam, Heba Lakany, RB Ahmad, and Bernard A Conway. Comparing common
average referencing to laplacian referencing in detecting imagination and intention of
movement for brain computer interface. In MATEC Web of Conferences, volume 140, 2017.
[113]
Yash Paul. Various epileptic seizure detection techniques using biomedical signals: a review.
Brain informatics, 5(2):6, 2018.
[114]
Yizhang Jiang, Dongrui Wu, Zhaohong Deng, Pengjiang Qian, Jun Wang, Guanjin Wang,
Fu-Lai Chung, Kup-Sze Choi, and Shitong Wang. Seizure classification from EEG signals
using transfer learning, semi-supervised learning and TSK fuzzy system. IEEE Transactions
on Neural Systems and Rehabilitation Engineering, 25(12):2270–2284, 2017.
[115]
Sanjeev Kumar Dhull, Krishna Kant Singh, et al. A Review on Automatic Epilepsy Detection
from EEG Signals. In Advances in Communication and Computational Technology, pages
1441–1454. Springer, 2021.
[116]
Norden E Huang, Zheng Shen, Steven R Long, Manli C Wu, Hsing H Shih, Quanan Zheng,
Nai-Chyuan Yen, Chi Chao Tung, and Henry H Liu. The empirical mode decomposition
and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings
of the Royal Society of London. Series A: mathematical, physical and engineering sciences,
454(1971):903–995, 1998.
[117]
Norden Eh Huang. Hilbert-Huang transform and its applications, volume 16. World Scientific,
2014.
[118]
Norden E Huang, Man-Li C Wu, Steven R Long, Samuel SP Shen, Wendong Qu, Per Gloersen,
and Kuang L Fan. A condence limit for the empirical mode decomposition and Hilbert
spectral analysis. Proceedings of the Royal Society of London. Series A: Mathematical, Physical
and Engineering Sciences, 459(2037):2317–2345, 2003.
[119]
ZHAO Jin-Ping and Huang Da-ji. Mirror extending and circular spline function for empirical
mode decomposition method. Journal of Zhejiang University-Science A, 2(3):247–252, 2001.
[120]
Liu Zhengkun and Zhang Ze. The improved algorithm of the EMD endpoint effect based on
the mirror continuation. In 2016 Eighth International Conference on Measuring Technology
and Mechatronics Automation (ICMTMA), pages 792–795. IEEE, 2016.
[121] LV Chenhuan, ZHAO Jun, WU Chao, GUO Tiantai, and CHEN Hongjiang. Optimization of
the end eect of Hilbert-Huang transform (HHT). Chinese Journal of Mechanical Engineering,
30(3):732–745, 2017.
[122]
Jian Wang, Wenyuan Liu, and Shuai Zhang. An approach to eliminating end effects of
EMD through mirror extension coupled with support vector machine method. Personal and
Ubiquitous Computing, 23(3-4):443–452, 2019.
[123]
Yunchao Gao, Guangtao Ge, Zhengyan Sheng, and Enfang Sang. Analysis and solution to
the mode mixing phenomenon in EMD. In 2008 Congress on Image and Signal Processing,
volume 5, pages 223–227. IEEE, 2008.
[124]
Zhaohua Wu and Norden E Huang. Ensemble empirical mode decomposition: a noise-assisted
data analysis method. Advances in adaptive data analysis, 1(01):1–41, 2009.
[125]
J. Jebaraj and R. Arumugam. Ensemble empirical mode decomposition-based optimised
power line interference removal algorithm for electrocardiogram signal. IET Signal Processing,
10(6):583–591, 2016.
[126]
Gabriel Rilling, Patrick Flandrin, Paulo Goncalves, et al. On empirical mode decomposition
and its algorithms. In IEEE-EURASIP workshop on nonlinear signal and image processing,
volume 3, pages 8–11. NSIP-03, Grado (I), 2003.
[127]
Douglas Baptista de Souza, Jocelyn Chanussot, and Anne-Catherine Favre. On selecting
relevant intrinsic mode functions in empirical mode decomposition: An energy-based
approach. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP), pages 325–329. IEEE, 2014.
[128]
Daoud Boutana, Messaoud Benidir, and Braham Barkat. On the selection of intrinsic mode
function in EMD method: application on heart sound signal. In 2010 3rd International
Symposium on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2010),
pages 1–5. IEEE, 2010.
[129]
Albert Ayenu-Prah and Nii Attoh-Okine. A criterion for selecting relevant intrinsic mode
functions in empirical mode decomposition. Advances in Adaptive Data Analysis, 2(01):1–24,
2010.
[130]
Stephane G Mallat. A theory for multiresolution signal decomposition: the wavelet
representation. IEEE transactions on pattern analysis and machine intelligence, 11(7):674–693,
1989.
[131]
HM Teager and SM Teager. Evidence for nonlinear sound production mechanisms in the
vocal tract. In Speech production and speech modelling, pages 241–261. Springer, 1990.
[132]
Firas Jabloun and A Enis Cetin. The Teager energy based feature parameters for robust
speech recognition in car noise. In 1999 IEEE International Conference on Acoustics, Speech,
and Signal Processing. Proceedings. ICASSP99 (Cat. No. 99CH36258), volume 1, pages 273–276.
IEEE, 1999.
[133]
Emmanuel Didiot, Irina Illina, Dominique Fohr, and Odile Mella. A wavelet-based
parameterization for speech/music discrimination. Computer Speech & Language, 24(2):341–
357, 2010.
[134]
Truong Quang Dang Khoa, Vo Quang Ha, and Vo Van Toi. Higuchi fractal properties of onset
epilepsy electroencephalogram. Computational and mathematical methods in medicine, 2012,
2012.
[135]
Luis Alfredo Moctezuma and Marta Molinas. Classication of low-density EEG epileptic
seizures by energy and fractal features based on EMD. Journal of Biomedical Research, 2019.
[136]
Benoit B Mandelbrot. Self-ane fractals and fractal dimension. Physica scripta, 32(4):257,
1985.
[137]
Wlodzimierz Klonowski. Fractal Analysis of Electroencephalographic Time Series (EEG
Signals). In The Fractal Geometry of the Brain, pages 413–429. Springer, 2016.
[138]
Luis Alfredo Moctezuma and Marta Molinas. Multi-objective optimization for EEG channel
selection and accurate intruder detection in an EEG-based subject identification system.
Scientific Reports, 10(1):1–12, 2020.
[139]
Agostino Accardo, M Affinito, M Carrozzi, and F Bouquet. Use of the fractal dimension for
the analysis of electroencephalographic time series. Biological cybernetics, 77(5):339–350,
1997.
[140]
Werner Lutzenberger, Hubert Preissl, and Friedemann Pulvermüller. Fractal dimension of
electroencephalographic time series and underlying brain processes. Biological Cybernetics,
73(5):477–482, 1995.
[141]
Karolina Lebiecka, Urszula Zuchowicz, Agata Wozniak-Kwasniewska, David Szekely, Elzbieta
Olejarczyk, and Olivier David. Complexity analysis of EEG data in persons with depression
subjected to transcranial magnetic stimulation. Frontiers in physiology, 9:1385, 2018.
[142]
Tomoyuki Higuchi. Approach to an irregular time series on the basis of the fractal theory.
Physica D: Nonlinear Phenomena, 31(2):277–283, 1988.
[143]
Carlos Gómez, Ángela Mediavilla, Roberto Hornero, Daniel Abásolo, and Alberto Fernández.
Use of the Higuchi’s fractal dimension for the analysis of MEG recordings from Alzheimer’s
disease patients. Medical engineering & physics, 31(3):306–313, 2009.
[144]
Elisabeth Ruiz-Padial and Antonio J Ibáñez-Molina. Fractal dimension of EEG signals and
heart dynamics in discrete emotional states. Biological psychology, 137:42–48, 2018.
[145]
Sladana Spasic, Aleksandar Kalauzi, G Grbic, Ljiljana Martac, and Milka Culic. Fractal
analysis of rat brain activity after injury. Medical and Biological Engineering and Computing,
43(3):345–348, 2005.
[146]
Arthur Petrosian. Kolmogorov complexity of finite sequences and recognition of different
preictal EEG patterns. In Proceedings Eighth IEEE Symposium on Computer-Based Medical
Systems, pages 212–217. IEEE, 1995.
[147]
Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of machine learning.
MIT press, 2018.
[148]
Fabien Lotte, Laurent Bougrain, Andrzej Cichocki, Maureen Clerc, Marco Congedo, Alain
Rakotomamonjy, and Florian Yger. A review of classification algorithms for EEG-based
brain–computer interfaces: a 10 year update. Journal of neural engineering, 15(3):031005,
2018.
[149]
Meysam Golmohammadi, Amir Hossein Harati Nejad Torbati, Silvia Lopez de Diego, Iyad
Obeid, and Joseph Picone. Automatic analysis of EEGs using big data and hybrid deep
learning architectures. Frontiers in human neuroscience, 13:76, 2019.
[150]
Yannick Roy, Hubert Banville, Isabela Albuquerque, Alexandre Gramfort, Tiago H Falk, and
Jocelyn Faubert. Deep learning-based electroencephalography analysis: a systematic review.
Journal of neural engineering, 16(5):051001, 2019.
[151]
Gen Li, Chang Ha Lee, Jason J Jung, Young Chul Youn, and David Camacho. Deep learning
for EEG data analytics: A survey. Concurrency and Computation: Practice and Experience,
32(18):e5199, 2020.
[152]
Grigorios Tsoumakas and Ioannis Katakis. Multi-label classification: An overview.
International Journal of Data Warehousing and Mining (IJDWM), 3(3):1–13, 2007.
[153]
Faraz Akram, Seung Moo Han, and Tae-Seong Kim. An efficient word typing P300-BCI
system using a modified T9 interface and random forest classifier. Computers in biology and
medicine, 56:30–36, 2015.
[154]
David Steyrl, Reinhold Scherer, Josef Faller, and Gernot R Müller-Putz. Random forests in
non-invasive sensorimotor rhythm brain-computer interfaces: a practical and convenient
non-linear classier. Biomedical Engineering/Biomedizinische Technik, 61(1):77–86, 2016.
[155]
Chongsheng Zhang, Changchang Liu, Xiangliang Zhang, and George Almpanidis. An up-to-date
comparison of state-of-the-art classification algorithms. Expert Systems with Applications,
82:128–150, 2017.
[156]
Stuart J Russell and Peter Norvig. Artificial intelligence: a modern approach. Malaysia; Pearson
Education Limited, 2016.
[157]
Thorsten Joachims. Making large-scale SVM learning practical. Technical Report 1998,28,
Universität Dortmund, http://hdl.handle.net/10419/77178, 1998.
[158]
Abdiansah Abdiansah and Retantyo Wardoyo. Time complexity analysis of support vector
machines (SVM) in LibSVM. International journal computer and application, 2015.
[159]
Thomas Cover and Peter Hart. Nearest neighbor pattern classification. IEEE transactions on
information theory, 13(1):21–27, 1967.
[160]
Keinosuke Fukunaga and Patrenahalli M. Narendra. A branch and bound algorithm for
computing k-nearest neighbors. IEEE transactions on computers, 100(7):750–753, 1975.
[161]
Naomi S Altman. An introduction to kernel and nearest-neighbor nonparametric regression.
The American Statistician, 46(3):175–185, 1992.
[162] Leo Breiman. Random forests. Machine learning, 45(1):5–32, 2001.
[163]
Andy Liaw, Matthew Wiener, et al. Classification and regression by randomForest. R news,
2(3):18–22, 2002.
[164]
Mia Stern, Joseph Beck, and Beverly Park Woolf. Naive Bayes classifiers for user
modeling. Center for Knowledge Communication, Computer Science Department, University of
Massachusetts, 1999.
[165]
David Martinus Johannes Tax. One-class classification: Concept learning in the absence of
counter-examples. PhD thesis, Delft University of Technology, 2002.
[166]
Iwan Syarif, Adam Prugel-Bennett, and Gary Wills. SVM parameter optimization using grid
search and genetic algorithm to improve classification performance. Telkomnika, 14(4):1502,
2016.
[167]
Markus M Breunig, Hans-Peter Kriegel, Raymond T Ng, and Jörg Sander. LOF: identifying
density-based local outliers. In Proceedings of the 2000 ACM SIGMOD international conference
on Management of data, pages 93–104, 2000.
[168]
Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An introduction to
statistical learning, volume 112. Springer, 2013.
[169] Max Kuhn, Kjell Johnson, et al. Applied predictive modeling, volume 26. Springer, 2013.
[170]
Claude Sammut and Geoffrey I Webb. Encyclopedia of machine learning. Springer Science &
Business Media, 2011.
[171]
Turky Alotaiby, Fathi E Abd El-Samie, Saleh A Alshebeili, and Ishtiaq Ahmad. A review of
channel selection algorithms for EEG signal processing. EURASIP Journal on Advances in
Signal Processing, 2015(1):66, 2015.
[172]
Muhammad Zeeshan Baig, Nauman Aslam, and Hubert PH Shum. Filtering techniques for
channel selection in motor imagery EEG applications: a survey. Artificial intelligence review,
53(2):1207–1232, 2020.
[173]
Luis Alfredo Moctezuma and Marta Molinas. Subject identication from low-density EEG-
recordings of resting-states: A study of feature extraction and classication. In Future of
Information and Communication Conference, pages 830–846. Springer, 2019.
[174]
Yanru Bai, Zhiguo Zhang, and Dong Ming. Feature selection and channel optimization
for biometric identication based on visual evoked potentials. In 2014 19th International
Conference on Digital Signal Processing, pages 772–776. IEEE, 2014.
[175]
Ying Wang, Xi Long, Hans van Dijk, Ronald Aarts, and Johan Arends. Adaptive EEG channel
selection for nonconvulsive seizure analysis. In 2018 IEEE 23rd International Conference on
Digital Signal Processing (DSP), pages 1–5. IEEE, 2018.
[176]
Tao Yang, Kai Keng Ang, Kok Soon Phua, Juanhong Yu, Valerie Toh, Wai Hoe Ng, and Rosa Q
So. EEG channel selection based on correlation coefficient for motor imagery classification: A
study on healthy subjects and ALS patient. In 2018 40th Annual International Conference of the
IEEE Engineering in Medicine and Biology Society (EMBC), pages 1996–1999. IEEE, 2018.
[177]
Mustafa Turan Arslan, Server Göksel Eraldemir, and Esen Yildirim. Channel selection from
EEG signals and application of support vector machine on EEG data. In 2017 International
Articial Intelligence and Data Processing Symposium (IDAP), pages 1–4. IEEE, 2017.
[178]
Huijuan Yang, Cuntai Guan, Chuan Chu Wang, and Kai Keng Ang. Maximum dependency
and minimum redundancy-based channel selection for motor imagery of walking EEG signal
detection. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing,
pages 1187–1191. IEEE, 2013.
[179]
Huijuan Yang, Cuntai Guan, Kai Keng Ang, Kok Soon Phua, and Chuanchu Wang. Selection
of eective EEG channels in brain computer interfaces based on inconsistencies of classiers.
In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology
Society, pages 672–675. IEEE, 2014.
[180]
Karim Ansari-Asl, Guillaume Chanel, and Thierry Pun. A channel selection method for
EEG classication in emotion assessment based on synchronization likelihood. In 2007 15th
European Signal Processing Conference, pages 1241–1245. IEEE, 2007.
[181]
Yongkoo Park and Wonzoo Chung. Optimal Channel Selection Using Correlation Coefficient
for CSP Based EEG Classification. IEEE Access, 8:111514–111521, 2020.
[182]
Zhong-Min Wang, Shu-Yuan Hu, and Hui Song. Channel selection method for EEG emotion
recognition using normalized mutual information. IEEE Access, 7:143303–143311, 2019.
[183]
Michael Schröder, Thomas Navin Lal, Thilo Hinterberger, Martin Bogdan, N Jeremy Hill, Niels
Birbaumer, Wolfgang Rosenstiel, and Bernhard Schölkopf. Robust EEG channel selection
across subjects for brain-computer interfaces. EURASIP Journal on Advances in Signal
Processing, 2005(19):174746, 2005.
[184]
Fatma Ibrahim, Saly Abd-Elateif El-Gindy, Sami M El-Dolil, Adel S El-Fishawy, El-Sayed M
El-Rabaie, Moawaed I Dessouky, Ibrahim M Eldokany, Turky N Alotaiby, Saleh A Alshebeili,
and Fathi E Abd El-Samie. A statistical framework for EEG channel selection and seizure
prediction on mobile. International Journal of Speech Technology, 22(1):191–203, 2019.
[185]
Jonas Duun-Henriksen, Troels Wesenberg Kjaer, Rasmus Elsborg Madsen, Line Soe Remvig,
Carsten Eckhart Thomsen, and Helge Bjarup Dissing Sorensen. Channel selection for
automatic seizure detection. Clinical Neurophysiology, 123(1):84–92, 2012.
[186]
Jianhai Zhang, Ming Chen, Shaokai Zhao, Sanqing Hu, Zhiguo Shi, and Yu Cao. ReliefF-based
EEG sensor selection methods for emotion recognition. Sensors, 16(10):1558, 2016.
[187]
M Murugappan and Sazali Yaacob. Asymmetric ratio and FCM based salient channel selection
for human emotion detection using EEG. WSEAS Transactions on Signal Processing, 2008.
[188]
Yi-Hung Liu, Shiuan Huang, and Yi-De Huang. Motor imagery EEG classification for patients
with amyotrophic lateral sclerosis using fractal dimension and Fisher’s criterion-based
channel selection. Sensors, 17(7):1557, 2017.
[189]
Ahmed Al-Ani and Mostefa Mesbah. EEG rhythm/channel selection for fuzzy rule-based
alertness state characterization. Neural Computing and Applications, 30(7):2257–2267, 2018.
[190]
Annushree Bablani, Damodar Reddy Edla, Diwakar Tripathi, Shubham Dodia, and Sridhar
Chintala. A synergistic concealed information test with novel approach for EEG channel
selection and SVM parameter optimization. IEEE Transactions on Information Forensics and
Security, 14(11):3057–3068, 2019.
[191]
Jianhua Yang, Harsimrat Singh, Evor L Hines, Friederike Schlaghecken, Daciana D
Iliescu, Mark S Leeson, and Nigel G Stocks. Channel selection and classification of
electroencephalogram signals: an artificial neural network and genetic algorithm-based
approach. Artificial intelligence in medicine, 55(2):117–126, 2012.
[192]
Mahnaz Arvaneh, Cuntai Guan, Kai Keng Ang, and Chai Quek. Optimizing the channel
selection and classication accuracy in EEG-based BCI. IEEE Transactions on Biomedical
Engineering, 58(6):1865–1873, 2011.
[193]
Ahmed Al-Ani and Akram Al-Sukker. Effect of feature and channel selection on EEG
classification. In 2006 International Conference of the IEEE Engineering in Medicine and Biology
Society, pages 2171–2174. IEEE, 2006.
[194]
Beatriz A Garro, Rocio Salazar-Varas, and Roberto A Vazquez. EEG Channel Selection using
Fractal Dimension and Articial Bee Colony Algorithm. In 2018 IEEE Symposium Series on
Computational Intelligence (SSCI), pages 499–504. IEEE, 2018.
[195]
Vikram Shenoy Handiru and Vinod A Prasad. Optimized bi-objective EEG channel selection
and cross-subject generalization with brain–computer interfaces. IEEE Transactions on
Human-Machine Systems, 46(6):777–786, 2016.
[196]
Hao Sun, Jing Jin, Wanzeng Kong, Cili Zuo, Shurui Li, and Xingyu Wang. Novel channel
selection method based on position priori weighted permutation entropy and binary gravity
search algorithm. Cognitive Neurodynamics, pages 1–16, 2020.
[197]
Alejandro A Torres-García, Carlos A Reyes-García, Luis Villaseñor-Pineda, and Gregorio
García-Aguilar. Implementing a fuzzy inference system in a multi-objective EEG channel
selection model for imagined speech classification. Expert Systems with Applications, 59:1–12,
2016.
[198]
Lin He, Youpan Hu, Yuanqing Li, and Daoli Li. Channel selection by Rayleigh coefficient
maximization based genetic algorithm for classifying single-trial motor imagery EEG.
Neurocomputing, 121:423–433, 2013.
[199]
Chea-Yau Kee, Sivalinga Govinda Ponnambalam, and Chu-Kiong Loo. Multi-objective genetic
algorithm as channel selection method for P300 and motor imagery data set. Neurocomputing,
161:120–131, 2015.
[200]
Luis Alfredo Moctezuma and Marta Molinas. EEG Channel-selection method for
epileptic-seizure classification based on multi-objective optimization. Frontiers in Neuroscience, 14:593,
2020.
[201]
Douglas Rodrigues, Gabriel FA Silva, João P Papa, Aparecido N Marana, and Xin-She Yang.
EEG-based person identication through binary ower pollination algorithm. Expert Systems
with Applications, 62:81–90, 2016.
[202]
Thomas H Cormen, Charles E Leiserson, Ronald L Rivest, and Clifford Stein. Introduction to
algorithms. MIT press, 2009.
[203]
Patrenahalli M. Narendra and Keinosuke Fukunaga. A branch and bound algorithm for
feature subset selection. IEEE Transactions on computers, pages 917–922, 1977.
[204]
Iman Foroutan and Jack Sklansky. Feature selection for automatic classification of
non-Gaussian data. IEEE Transactions on Systems, Man, and Cybernetics, 17(2):187–198, 1987.
[205]
Jihoon Yang and Vasant Honavar. Feature subset selection using a genetic algorithm. In
Feature extraction, construction and selection, pages 117–136. Springer, 1998.
[206]
Luis Alfredo Moctezuma and Marta Molinas. Event-related potential from EEG for a two-step
identity authentication system. In 2019 IEEE 17th International Conference on Industrial
Informatics (INDIN), volume 1, pages 392–399. IEEE, 2019.
[207]
Kalyanmoy Deb. Multi-objective optimization using evolutionary algorithms, volume 16. John
Wiley & Sons, 2001.
[208]
Tinkle Chugh, Karthik Sindhya, Jussi Hakanen, and Kaisa Miettinen. A survey on
handling computationally expensive multiobjective optimization problems with evolutionary
algorithms. Soft Computing, 23(9):3137–3166, 2019.
[209] Oliver Kramer. Genetic algorithm essentials, volume 679. Springer, 2017.
[210]
Nidamarthi Srinivas and Kalyanmoy Deb. Muiltiobjective optimization using nondominated
sorting in genetic algorithms. Evolutionary computation, 2(3):221–248, 1994.
[211]
Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and TAMT Meyarivan. A fast and elitist
multiobjective genetic algorithm: NSGA-II. IEEE transactions on evolutionary computation,
6(2):182–197, 2002.
[212]
Kalyanmoy Deb and Himanshu Jain. An evolutionary many-objective optimization algorithm
using reference-point-based nondominated sorting approach, part I: solving problems with
box constraints. IEEE Transactions on Evolutionary Computation, 18(4):577–601, 2013.
[213]
Himanshu Jain and Kalyanmoy Deb. An evolutionary many-objective optimization algorithm
using reference-point based nondominated sorting approach, part II: handling constraints
and extending to an adaptive approach. IEEE Transactions on Evolutionary Computation,
18(4):602–622, 2013.
[214]
Indraneel Das and John E Dennis. Normal-boundary intersection: A new method for
generating the Pareto surface in nonlinear multicriteria optimization problems. SIAM journal
on optimization, 8(3):631–657, 1998.
[215]
Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov,
Roger G Mark, Joseph E Mietus, George B Moody, Chung-Kang Peng, and H Eugene Stanley.
PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for
complex physiologic signals. Circulation, 101(23):e215–e220, 2000.
[216]
António Dourado, M Le Van Quyen, B Schelter, G Favaro, A Schulze-Bonhage, S Sales, and
V Navarro. EPILEPSIAE-EVOLVING PLATFORM FOR IMPROVING LIVING EXPECTATION
OF PATIENTS SUFFERING FROM ICTAL EVENTS: E595. Epilepsia, 50:210–211, 2009.
[217]
Iyad Obeid and Joseph Picone. The Temple University Hospital EEG data corpus. Frontiers in
neuroscience, 10:196, 2016.
[218]
Ali Hossam Shoeb. Application of machine learning to epileptic seizure onset detection and
treatment. PhD thesis, Massachusetts Institute of Technology, 2009.
[219]
Gerwin Schalk, Dennis J McFarland, Thilo Hinterberger, Niels Birbaumer, and Jonathan R
Wolpaw. BCI2000: a general-purpose brain-computer interface (BCI) system. IEEE
Transactions on biomedical engineering, 51(6):1034–1043, 2004.
[220]
Perrin Margaux, Maby Emmanuel, Daligault Sébastien, Bertrand Olivier, and Mattout Jérémie.
Objective and subjective evaluation of online error correction during P300-based spelling.
Advances in Human-Computer Interaction, 2012:4, 2012.
[221]
Luis Alfredo Moctezuma. Distinción de estados de actividad e inactividad lingüística para
interfaces cerebro computadora. Master’s thesis, Benemérita Universidad Autónoma de
Puebla, 2017.
[222]
Luis Alfredo Moctezuma, Alejandro A Torres-García, Luis Villaseñor-Pineda, and Maya
Carrillo. Subjects identication using EEG-recorded imagined speech. Expert Systems with
Applications, 118:201–208, 2019.
[223]
Luis Alfredo Moctezuma and Marta Molinas. EEG-based Subjects Identification based on
Biometrics of Imagined Speech using EMD. In International Conference on Brain Informatics,
pages 458–467. Springer, 2018.
[224]
Petre Lameski, Eftim Zdravevski, Riste Mingov, and Andrea Kulakov. SVM parameter tuning
with grid search and its impact on reduction of model over-fitting. In Rough sets, fuzzy sets,
data mining, and granular computing, pages 464–474. Springer, 2015.
[225]
Guido Van Rossum and Fred L. Drake. Python 3 Reference Manual. CreateSpace, Scotts Valley,
CA, 2009.
[226]
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel,
P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher,
M. Perrot, and E. Duchesnay. Scikit-learn: Machine Learning in Python. Journal of Machine
Learning Research, 12:2825–2830, 2011.
[227]
Julian Blank and Kalyanmoy Deb. pymoo: Multi-objective Optimization in Python. IEEE
Access, 8:89497–89509, 2020.
[228]
Matthew Rocklin. Dask: Parallel computation with blocked algorithms and task scheduling.
In Proceedings of the 14th python in science conference, pages 130–136. Citeseer, 2015.
[229]
Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David
Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J.
van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew
R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat, Yu Feng, Eric W.
Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A.
Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul
van Mulbregt, and SciPy 1.0 Contributors. SciPy 1.0: Fundamental Algorithms for Scientific
Computing in Python. Nature Methods, 17:261–272, 2020.
[230]
Charles R. Harris, K. Jarrod Millman, Stéfan J van der Walt, Ralf Gommers, Pauli Virtanen,
David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern,
Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime
Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Sheppard,
Tyler Reddy, Warren Weckesser, Hameer Abbasi, Christoph Gohlke, and Travis E. Oliphant.
Array programming with NumPy. Nature, 585:357–362, 2020.
[231]
Gregory R Lee, Ralf Gommers, Filip Waselewski, Kai Wohlfahrt, and Aaron O’Leary.
PyWavelets: A Python package for wavelet analysis. Journal of Open Source Software,
4(36):1237, 2019.
[232]
Jaidev Deshpande. pyhht Documentation. https://pyhht.readthedocs.io/en/latest/, 2018.
Accessed: 2021-01-01.
[233]
Magnus Själander, Magnus Jahre, Gunnar Tufte, and Nico Reissmann. EPIC: An
energy-efficient, high-performance GPGPU computing research infrastructure. arXiv preprint
arXiv:1912.05848, 2019.
[234]
Florian Mormann, Ralph G Andrzejak, Christian E Elger, and Klaus Lehnertz. Seizure
prediction: the long and winding road. Brain, 130(2):314–333, 2006.
[235]
Rajendra Kale. Bringing epilepsy out of the shadows: Wide treatment gap needs to be reduced,
1997.
[236]
Jr J Engel. A practical guide for routine EEG studies in epilepsy. Journal of clinical
neurophysiology: ocial publication of the American Electroencephalographic Society, 1(2):109–
142, 1984.
[237]
Hojjat Adeli and Samanwoy Ghosh-Dastidar. Automated EEG-based diagnosis of neurological
disorders: Inventing the future of neurology. CRC press, 2010.
[238]
Orrin Devinsky. Diagnosis and treatment of temporal lobe epilepsy. Rev Neurol Dis, 1(1):2–9,
2004.
[239]
Jerome Engel Jr. Mesial temporal lobe epilepsy: what have we learned? The neuroscientist,
7(4):340–352, 2001.
[240]
Vairavan Srinivasan, Chikkannan Eswaran, and N. Sriraam. Artificial neural network based
epileptic detection using time-domain and frequency-domain features. Journal of Medical
Systems, 29(6):647–660, 2005.
[241]
Yatindra Kumar, ML Dewal, and RS Anand. Epileptic seizure detection using DWT based
fuzzy approximate entropy and support vector machine. Neurocomputing, 133:271–279, 2014.
[242]
Yusuf Uzzaman Khan, Nidal Rauddin, and Omar Farooq. Automated seizure detection in
scalp EEG using multiple wavelet scales. In 2012 IEEE International Conference on Signal
Processing, Computing and Control, pages 1–5. IEEE, 2012.
[243]
Morteza Zabihi, Serkan Kiranyaz, Ali Bahrami Rad, Aggelos K Katsaggelos, Moncef Gabbouj,
and Turker Ince. Analysis of high-dimensional phase space via Poincaré section for
patient-specific seizure detection. IEEE Transactions on Neural Systems and Rehabilitation Engineering,
24(3):386–398, 2015.
[244]
Muhammad Sohaib J Solaija, Sajid Saleem, Khawar Khurshid, Syed Ali Hassan, and
Awais Mehmood Kamboh. Dynamic mode decomposition based epileptic seizure detection
from scalp EEG. IEEE Access, 6:38683–38692, 2018.
[245]
Abhijit Bhattacharyya and Ram Bilas Pachori. A multivariate approach for patient-specific
EEG seizure detection using empirical wavelet transform. IEEE Transactions on Biomedical
Engineering, 64(9):2003–2015, 2017.
[246]
Yinda Zhang, Shuhan Yang, Yang Liu, Yexian Zhang, Bingfeng Han, and Fengfeng Zhou.
Integration of 24 feature types to accurately detect and predict seizures using scalp EEG
Signals. Sensors, 18(5):1372, 2018.
[247]
U Rajendra Acharya, Filippo Molinari, S Vinitha Sree, Subhagata Chattopadhyay, Kwan-
Hoong Ng, and Jasjit S Suri. Automated diagnosis of epileptic EEG using entropies. Biomedical
Signal Processing and Control, 7(4):401–408, 2012.
[248]
Rajeev Sharma and Ram Bilas Pachori. Classification of epileptic seizures in EEG signals based
on phase space representation of intrinsic mode functions. Expert Systems with Applications,
42(3):1106–1117, 2015.
[249]
Vipin Gupta and Ram Bilas Pachori. Epileptic seizure identification using entropy of FBSE
based EEG rhythms. Biomedical Signal Processing and Control, 53:101569, 2019.
[250]
Vipin Gupta, Abhijit Bhattacharyya, and Ram Bilas Pachori. Automated identification of
epileptic seizures from EEG signals using FBSE-EWT method. In Biomedical Signal Processing,
pages 157–179. Springer, 2020.
[251]
José Antonio de la O Serna, Mario R Arrieta Paternina, Alejandro Zamora-Méndez,
Rajesh Kumar Tripathy, and Ram Bilas Pachori. EEG-Rhythm Specific Taylor-Fourier filter
bank Implemented with O-splines for the Detection of Epilepsy using EEG Signals. IEEE
Sensors Journal, 2020.
[252]
Rahul Sharma, Ram Bilas Pachori, and Pradip Sircar. Seizures classification based on higher
order statistics and deep neural network. Biomedical Signal Processing and Control, 59:101921,
2020.
[253]
Ralph G Andrzejak, Klaus Lehnertz, Florian Mormann, Christoph Rieke, Peter David, and
Christian E Elger. Indications of nonlinear deterministic and finite-dimensional structures
in time series of brain electrical activity: Dependence on recording region and brain state.
Physical Review E, 64(6):061907, 2001.
[254]
P Fiedler, P Pedrosa, S Griebel, C Fonseca, F Vaz, E Supriyanto, F Zanow, and J Haueisen.
Novel multipin electrode cap system for dry electroencephalography. Brain topography,
28(5):647–656, 2015.
[255]
Selenia di Fronso, Patrique Fiedler, Gabriella Tamburro, Jens Haueisen, Maurizio Bertollo,
and Silvia Comani. Dry EEG in sport sciences: a fast and reliable tool to assess individual
alpha peak frequency changes induced by physical effort. Frontiers in Neuroscience, 13:982,
2019.
[256]
Nidal Rauddin, Yusuf Uzzaman Khan, and Omar Farooq. Feature extraction and classication
of EEG for automatic seizure detection. In 2011 International Conference on Multimedia, Signal
Processing and Communication Technologies, pages 184–187. IEEE, 2011.
[257]
Vairavan Srinivasan, Chikkannan Eswaran, and Natarajan Sriraam. Approximate
entropy-based epileptic EEG detection using artificial neural networks. IEEE Transactions on
Information Technology in Biomedicine, 11(3):288–295, 2007.
[258]
Abdulhamit Subasi and M Ismail Gursoy. EEG signal classification using PCA, ICA, LDA and
support vector machines. Expert systems with applications, 37(12):8659–8666, 2010.
[259]
Chrysostomos P Panayiotopoulos and Michalis Koutroumanidis. The significance of
the syndromic diagnosis of the epilepsies. National Society for Epilepsy, 2005.
[260]
Yong Won Cho and Keun Tae Kim. The Latest Classification of Epilepsy and Clinical
Significance of Electroencephalography. Journal of Neurointensive Care, 2(1):1–3, 2019.
[261]
Ena Bingham and Victor Patterson. A telemedicine-enabled nurse-led epilepsy service is
acceptable and sustainable. Journal of Telemedicine and Telecare, 13(3_suppl):19–21, 2007.
[262]
Phil Smith. Telephone review for people with epilepsy. Practical neurology, 16(6):475–477,
2016.
[263]
Najib Kissani, Yilédoma Thierry Modeste Lengané, Victor Patterson, Boulenouar Mesraoua,
Eliashiv Dawn, Cigdem Ozkara, Graeme Shears, Harmiena Riphagen, Ali A Asadi-Pooya,
Alicia Bogacz, et al. Telemedicine in epilepsy: How can we improve care, teaching, and
awareness? Epilepsy & Behavior, page 106854, 2020.
[264]
Carmen Terranova, Vincenzo Rizzo, Alberto Cacciola, Gaetana Chillemi, Alessandro
Calamuneri, Demetrio Milardi, and Angelo Quartarone. Is there a future for non-invasive
brain stimulation as a therapeutic tool? Frontiers in neurology, 9:1146, 2019.
[265]
Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. Numba: A LLVM-based Python JIT compiler.
In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, pages 1–6,
2015.
[266]
Emmanuel K Kalunga, Sylvain Chevallier, Quentin Barthélemy, Karim Djouani, Eric
Monacelli, and Yskandar Hamam. Online SSVEP-based BCI using Riemannian geometry.
Neurocomputing, 191:55–68, 2016.
[267]
Robin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer, Martin
Glasstetter, Katharina Eggensperger, Michael Tangermann, Frank Hutter, Wolfram Burgard,
and Tonio Ball. Deep learning with convolutional neural networks for EEG decoding and
visualization. Human brain mapping, 38(11):5391–5420, 2017.
[268]
Anil K Jain, Arun Ross, and Salil Prabhakar. An introduction to biometric recognition. IEEE
Transactions on circuits and systems for video technology, 14(1):4–20, 2004.
[269]
Anil K Jain, Arun Ross, and Umut Uludag. Biometric template security: Challenges and
solutions. In 2005 13th European signal processing conference, pages 1–4. IEEE, 2005.
[270]
Umut Uludag and Anil K Jain. Attacks on biometric systems: a case study in fingerprints.
In Security, steganography, and watermarking of multimedia contents VI, volume 5306, pages
622–633. International Society for Optics and Photonics, 2004.
[271]
Seyed Abolfazl Valizadeh, Franziskus Liem, Susan Mérillat, Jürgen Hänggi, and Lutz Jäncke.
Identication of individual subjects on the basis of their brain anatomical features. Scientic
reports, 8(1):1–9, 2018.
[272]
Katharine Brigham and BVK Vijaya Kumar. Subject identification from electroencephalogram
(EEG) signals during imagined speech. In 2010 Fourth IEEE International Conference on
Biometrics: Theory, Applications and Systems (BTAS), pages 1–8. IEEE, 2010.
[273]
Gonzalo Safont, Addisson Salazar, Antonio Soriano, and Luis Vergara. Combination of
multiple detectors for EEG based biometric identification/authentication. In 2012 IEEE
International Carnahan Conference on Security Technology (ICCST), pages 230–236. IEEE, 2012.
[274]
Matteo Fraschini, Arjan Hillebrand, Matteo Demuru, Luca Didaci, and Gian Luca Marcialis.
An EEG-based biometric system using eigenvector centrality in resting state brain networks.
IEEE Signal Processing Letters, 22(6):666–670, 2014.
[275]
Jae-Hwan Kang, Young Chang Jo, and Sung-Phil Kim. Electroencephalographic feature
evaluation for improving personal authentication performance. Neurocomputing, 287:93–101,
2018.
[276]
Alejandro Riera, Aureli Soria-Frisch, Marco Caparrini, Carles Grau, and Giulio Ruffini.
Unobtrusive biometric system based on electroencephalogram analysis. EURASIP Journal on
Advances in Signal Processing, 2008:18, 2008.
[277]
Bin Hu, Quanying Liu, Qinglin Zhao, Yanbing Qi, and Hong Peng. A real-time
electroencephalogram (EEG) based individual identification interface for mobile security
in ubiquitous environment. In 2011 IEEE Asia-Pacific Services Computing Conference, pages
436–441. IEEE, 2011.
[278]
Qiong Gui, Maria V. Ruiz-Blondet, Sarah Laszlo, and Zhanpeng Jin. A survey on brain
biometrics. ACM Comput. Surv., 51(6):112:1–112:38, February 2019.
[279]
JX Chen, ZJ Mao, WX Yao, and YF Huang. EEG-based biometric identification with
convolutional neural network. Multimedia Tools and Applications, pages 1–21, 2019.
[280]
Yingnan Sun, Frank P-W Lo, and Benny Lo. EEG-based user identification system using
1D-convolutional long short-term memory neural networks. Expert Systems with Applications,
125:259–267, 2019.
[281]
Theerawit Wilaiprasitporn, Apiwat Ditthapron, Karis Matchaparn, Tanaboon Tongbuasirilai,
Nannapas Banluesombatkul, and Ekapol Chuangsuwanich. Affective EEG-based person
identification using the deep learning approach. IEEE Transactions on Cognitive and
Developmental Systems, 2019.
[282]
Ozan Özdenizci, Ye Wang, Toshiaki Koike-Akino, and Deniz Erdoğmuş. Adversarial deep
learning in EEG biometrics. IEEE Signal Processing Letters, 26(5):710–714, 2019.
[283]
Philip Davis, Charles D Creusere, and Jim Kroger. Subject identification based on EEG
responses to video stimuli. In 2015 IEEE International Conference on Image Processing (ICIP),
pages 1523–1527. IEEE, 2015.
[284]
Thiago Schons, Gladston JP Moreira, Pedro HL Silva, Vitor N Coelho, and Eduardo JS Luz.
Convolutional network for EEG-based biometric. In Iberoamerican Congress on Pattern
Recognition, pages 601–608. Springer, 2017.
[285]
Xiang Zhang, Lina Yao, Salil S Kanhere, Yunhao Liu, Tao Gu, and Kaixuan Chen. MindID:
Person identication from brain waves through attention-based recurrent neural network.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2(3):1–23,
2018.
[286]
Longbin Jin, Jaeyoung Chang, and Eunyi Kim. EEG-Based User Identication Using Channel-
Wise Features. In Asian Conference on Pattern Recognition, pages 750–762. Springer, 2019.
[287]
Daria La Rocca, Patrizio Campisi, Balazs Vegso, Peter Cserti, György Kozmann, Fabio Babiloni,
and F De Vico Fallani. Human brain distinctiveness based on EEG spectral coherence
connectivity. IEEE transactions on Biomedical Engineering, 61(9):2406–2412, 2014.
[288]
Alessandra Crobe, Matteo Demuru, Luca Didaci, Gian Luca Marcialis, and Matteo Fraschini.
Minimum spanning tree and k-core decomposition as measure of subject-specic EEG traits.
Biomedical Physics & Engineering Express, 2(1):017001, 2016.
[289]
Marco Garau, Matteo Fraschini, Luca Didaci, and Gian Luca Marcialis. Experimental results
on multi-modal fusion of EEG-based personal verication algorithms. In 2016 International
Conference on Biometrics (ICB), pages 1–6. IEEE, 2016.
REFERENCES 151
[290]
Kavitha P Thomas and A Prasad Vinod. Biometric identication of persons using sample
entropy features of EEG during rest state. In 2016 IEEE International Conference on Systems,
Man, and Cybernetics (SMC), pages 003487–003492. IEEE, 2016.
[291]
Kavitha P Thomas and A Prasad Vinod. Utilizing individual alpha frequency and delta band
power in EEG based biometric recognition. In 2016 IEEE International Conference on Systems,
Man, and Cybernetics (SMC), pages 004787–004791. IEEE, 2016.
[292]
Silvio Barra, Andrea Casanova, Matteo Fraschini, and Michele Nappi. Fusion of physiological
measures for multimodal biometric systems. Multimedia Tools and Applications, 76(4):4835–
4847, 2017.
[293]
Su Yang, Farzin Deravi, and Sanaul Hoque. Task sensitivity in EEG biometric recognition.
Pattern Analysis and Applications, 21(1):105–117, 2018.
[294]
Patrizio Campisi and Daria La Rocca. Brain waves for automatic biometric-based user
recognition. IEEE transactions on information forensics and security, 9(5):782–800, 2014.
[295]
Mohammed Abo-Zahhad, Sabah Mohammed Ahmed, and Sherif Nagib Abbas. State-of-the-
art methods and future perspectives for personal recognition based on electroencephalogram
signals. IET Biometrics, 4(3):179–190, 2015.
[296]
Amir Jalaly Bidgoly, Hamed Jalaly Bidgoly, and Zeynab Arezoumand. A survey on methods
and challenges in EEG based authentication. Computers & Security, page 101788, 2020.
[297]
Salahiddin Altahat, Michael Wagner, and Elisa Martinez Marroquin. Robust
electroencephalogram channel set for person authentication. In 2015 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 997–1001. IEEE, 2015.
[298]
Sander Koelstra, Christian Muhl, Mohammad Soleymani, Jong-Seok Lee, Ashkan Yazdani,
Touradj Ebrahimi, Thierry Pun, Anton Nijholt, and Ioannis Patras. Deap: A database for
emotion analysis; using physiological signals. IEEE transactions on aective computing,
3(1):18–31, 2011.
[299]
Zijing Mao, Wan Xiang Yao, and Yufei Huang. EEG-based biometric identication with deep
learning. In 2017 8th International IEEE/EMBS Conference on Neural Engineering (NER), pages
609–612. IEEE, 2017.
[300]
Alejandro Gonzalez, Isao Nambu, Haruhide Hokari, and Yasuhiro Wada. EEG channel
selection using particle swarm optimization for the classication of auditory event-related
potentials. The Scientic World Journal, 2014, 2014.
[301]
Nobuaki Mizuguchi, Hiroki Nakata, Takuji Hayashi, Masanori Sakamoto, Tetsuro Muraoka,
Yusuke Uchida, and Kazuyuki Kanosue. Brain activity during motor imagery of an action
with an object: a functional magnetic resonance imaging study. Neuroscience research,
76(3):150–155, 2013.
[302]
Kai J Miller, Gerwin Schalk, Eberhard E Fetz, Marcel den Nijs, Jerey G Ojemann, and
Rajesh PN Rao. Cortical activity during motor execution, motor imagery, and imagery-based
online feedback. Proceedings of the National Academy of Sciences, 107(9):4430–4435, 2010.
[303]
Wolfgang Taube, Michael Mouthon, Christian Leukel, Henri-Marcel Hoogewoud, Jean-Marie
Annoni, and Martin Keller. Brain activity during observation and motor imagery of dierent
balance tasks: an fMRI study. cortex, 64:102–114, 2015.
152 REFERENCES
[304]
Su Yang and Farzin Deravi. On the usability of electroencephalographic signals for biometric
recognition: A survey. IEEE Transactions on Human-Machine Systems, 47(6):958–969, 2017.
[305]
Erwin HT Shad, Marta Molinas, and Trond Ytterdal. Impedance and Noise of Passive and
Active Dry EEG Electrodes: A Review. IEEE Sensors Journal, 2020.
[306]
Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE Transactions on
knowledge and data engineering, 22(10):1345–1359, 2009.
[307]
Mahnaz Arvaneh, Cuntai Guan, Kai Keng Ang, and Chai Quek. EEG data space adaptation
to reduce intersession nonstationarity in brain-computer interface. Neural computation,
25(8):2146–2171, 2013.
[308]
Hohyun Cho, Minkyu Ahn, Kiwoong Kim, and Sung Chan Jun. Increasing session-to-session
transfer in a brain–computer interface with on-site background noise acquisition. Journal of
neural engineering, 12(6):066009, 2015.
[309]
Feng Li, Yi Xia, Fei Wang, Dengyong Zhang, Xiaoyu Li, and Fan He. Transfer Learning
Algorithm of P300-EEG Signal Based on XDAWN Spatial Filter and Riemannian Geometry
Classier. Applied Sciences, 10(5):1804, 2020.
[310]
Sara Hegdahl Åsly. Supervised learning for classication of EEG signals evoked by visual
exposure to RGB colors. Master’s thesis, NTNU, 2019.
[311]
Shobiha Premkumar. Subject Identication using EEG Signals and Supervised Learning.
Master’s thesis, NTNU, 2020.
[312] Julie Haga. Biometric system using EEG signals from resting-state and one-class classiers.
Master’s thesis, NTNU, 2020.
[313]
Sara H Åsly, Luis Alfredo Moctezuma, Marta Molinas, and Monika Gilde. Towards EEG-based
signals classication of RGB color-based stimuli. In GBCIC, 2019.
[314]
Alejandro A Torres-Garcıa, Luis Alfredo Moctezuma, Sara Asly, and Marta Molinas.
Discriminating between color exposure and idle state using EEG signals for BCI application.
In International Conference on e-Health and Bioengineering (EHB), 2019.
[315]
Alejandro A. Torres-García., Luis Alfredo Moctezuma., and Marta Molinas. Assessing the
Impact of Idle State Type on the Identication of RGB Color Exposure for BCI. In Proceedings
of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies
- Volume 4: BIOSIGNALS,, pages 187–194. INSTICC, SciTePress, 2020.
[316]
Andres Felipe Soler Guevara, Luis Alfredo Moctezuma, Eduardo Giraldo, and Marta Molinas.
Low-density EEG source reconstruction with channel selection enabled by evolutionary
optimization. arXiv preprint, 2019.
[317]
Pierre Baldi. Autoencoders, unsupervised learning, and deep architectures. In Proceedings of
ICML workshop on unsupervised and transfer learning, pages 37–49, 2012.
[318]
Naveed Rehman and Danilo P Mandic. Multivariate empirical mode decomposition.
Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences,
466(2117):1291–1302, 2010.
[319]
Mruthun R Thirumalaisamy and Phillip J Ansell. Fast and adaptive empirical mode
decomposition for multidimensional, multivariate signals. IEEE Signal Processing Letters,
25(10):1550–1554, 2018.
REFERENCES 153
[320]
Qingfu Zhang and Hui Li. MOEA/D: A multiobjective evolutionary algorithm based on
decomposition. IEEE Transactions on evolutionary computation, 11(6):712–731, 2007.
... Empirical Mode Decomposition (EMD) is a data-driven decomposition method that works with nonlinear and non-stationary processes such as EEG signals. EMD uses the characteristics of the signal to decompose it into several intrinsic mode functions (IMFs) [12,13]. An IMF is defined by two criteria: ...
... The algorithm uses a sifting process, in which it iterates through the signal and removes the IMFs from it, creating a smoother signal to extract features from [12,13]. The step-wise algorithm is given ... The stopping criterion could be that no further IMFs can be extracted, i.e. the minima and maxima will be the same, or that the desired number of IMFs has been extracted. ...
... The result of the algorithm is $n$ IMFs and one residue. The signal can then be represented by the sum of the IMFs and the residue $r_n(t)$, given by equation (2.1) [12]. ...
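For orientation, the decomposition described in the excerpt can be written in the usual EMD notation. The excerpt only names the residue $r_n(t)$; the symbols $x(t)$ for the original signal and $\mathrm{IMF}_i(t)$ for the $i$-th intrinsic mode function are assumed here and may differ from those used in the cited report:

$$
x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r_n(t)
$$

Read this way, the sifting process extracts the $n$ IMFs in turn, and whatever remains once no further IMF can be extracted is the residue.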
Technical Report
This thesis investigates signal processing and classification methods for a Brain-Computer Interface (BCI) based on motor imagery (MI). The purpose of the investigation is to create a BCI capable of classifying MI tasks into commands for use in a real-time system. Electroencephalography (EEG) is a noninvasive method that can be used to record the brain activity associated with motor imagery. The main challenge of the BCI system is processing the EEG measurements and extracting the meaningful data that contain the MI task. The analysis used two public EEG datasets, one from the Norwegian University of Science and Technology (NTNU) and another from Graz University of Technology. The MI tasks explored in this thesis are right-hand and left-hand imagery, which creates a binary classification problem. Three main approaches to signal processing are explored. The first is signal decomposition using Frequency Band Extraction (FBE), the second is the Discrete Wavelet Transform (DWT), and the third is Empirical Mode Decomposition (EMD). The FBE method was further explored using different frequency bands. For each of the decomposition methods, twelve features were extracted. After feature extraction, two classifiers were explored, namely Random Forest (RF) and Gradient Boosting (GB). The experiments showed that FBE and DWT outperformed EMD. The difference in performance between the two classifiers was too small to conclude which of them works best for MI classification. The performance of each classifier was subject dependent. For the first dataset, most subjects had a performance below or around 50%, while a few subjects reached up to 70.89%. For the second dataset, most subjects had a performance around 70%. The highest performance was 95.51%, using FBE with RF. The analysis of feature importance showed that the Teager energy (TE), instantaneous energy (IE), root mean square (RMS) and variance (var) carried the most information about the difference between the classes.
... Since a single synaptic excitation generates a low electrical current, thousands of neurons need to activate to get a readable signal. In addition, these neurons need to be placed perpendicular to the brain's surface for the signal to be strong enough [25,16,26]. The activity is measured with either intracranial electrodes (invasive method) or on the scalp's surface (non-invasive method). ...
... However, there are three problems with the separation. Firstly, literature disagrees on which band is associated with what frequencies [25]. Secondly, the association with a frequency band does not imply that this is the only brain process. ...
... Monopolar recording places the reference electrode away from the area of interest. Distancing the reference electrode from the other electrodes maximizes the rejection of voltages common to the electrode and the reference [25]. ...
Thesis
This thesis investigates the feasibility of a simple communication system for persons with Locked-in syndrome (LIS) that combines the brain's color perception with the eye movement of the user. A person diagnosed with LIS is conscious and awake but trapped in his/her own body, unable to move and communicate. The communication system proposed here consists of a brain-computer interface (BCI) that uses electroencephalography (EEG) signals recorded after a dedicated visual stimulation protocol. The BCI design needs a classification model, and this thesis explores different state-of-the-art processing and classification methods for the EEG signal. The classification task is split into two problems. The first consists of differentiating between a task state, where the subject looks at a presented color, and a resting state. The second consists of differentiating between the various task states, in which a subject looks at one of four different colors. An in-house experiment was designed and conducted to create a dataset that fits the designed BCI's specifications. The dataset includes recorded data from 22 healthy subjects, each of whom was exposed to two different protocols. The first protocol alternated between exposing the participants to one of four colors and a resting state. The second protocol displayed the color with a superimposed background icon indicative of a user-oriented need. The results from the experiments showed that the proposed methods predicted similarly well on input data from both protocols. A random forest (RF) classifier predicted best on average when trained and tested on data from just one subject. The 22 individual RF models reached average accuracies of 74.3 % and 61.4 % for differentiating between a task and resting state and between the four task states, respectively. RF reached these results by decomposing the input signal with variational mode decomposition (VMD) and using the fractal, energy, and statistical features extracted from the modes. Finally, a general model that could predict task-related information from new subjects was tested. The best performing model was a state-of-the-art convolutional neural network (CNN). The model was pre-trained on subject data from a new dataset, with the selection of subjects optimized by the non-dominated sorting genetic algorithm II (NSGA-II). Then, the model performed a short calibration of its weights on 60 % of the data from the new subject it was going to predict. The average accuracy for differentiating between a task and resting state and between the four task states was 69.8 % and 73.6 %, respectively. This demonstrates that a general model, needing to calibrate on only a few new samples from the user, can be used to create a BCI communication system.
... The EEG channel selection process is in itself informative because it can provide information about the most relevant areas in the brain for a certain neural task for a certain subject or group of subjects. This can be analyzed using a priori information related to the paradigm, which can limit the search space and therefore the results [57]. ...
... Selecting a set of channels allows us to focus on the most relevant information or brain area, which decreases the computational cost of real-time processing, and selecting the correct channels contributes to increased classification performance. Additionally, these techniques will enable cheap home EEG devices that can facilitate long-term monitoring in daily life, not limited to hospital or laboratory services [57]. ...
... To tackle the channel selection problem, we applied the non-dominated sorting genetic algorithm II (NSGA-II) to optimize two objectives: 1) maximize the accuracy obtained for Low vs High Arousal or Low vs High Valence classification, and 2) minimize the number of EEG channels used to achieve 1). We selected NSGA-II because it has been shown to be robust in dealing with two-objective optimization problems [53, 57-59]. ...
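The two-objective formulation in the excerpt above can be illustrated with a small Python sketch. It is not the cited authors' implementation and not a full NSGA-II: it only shows a binary chromosome (1 = channel used, 0 = not used) and the Pareto-dominance test that non-dominated sorting is built on. The `evaluate` function is a hypothetical stand-in for training and testing a classifier, and the `informative` channel indices are invented for the example.

```python
def evaluate(chromosome, informative):
    """Toy stand-in for a classifier (assumption, not a real model):
    'accuracy' is the fraction of the hypothetical informative channels
    that the chromosome switches on."""
    n_used = sum(chromosome)
    hits = sum(chromosome[i] for i in informative)
    return hits / len(informative), n_used


def dominates(a, b):
    """a and b are (accuracy, n_channels) pairs.
    Accuracy is maximized, the number of channels is minimized."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(population, informative):
    """Return the chromosomes whose scores no other chromosome dominates."""
    scored = [(c, evaluate(c, informative)) for c in population]
    return [(c, s) for c, s in scored
            if not any(dominates(t, s) for _, t in scored)]


# Four candidate channel subsets over a 4-channel montage;
# channels 0 and 2 are assumed to be the informative ones.
population = [[1, 1, 1, 1], [1, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 1]]
front = pareto_front(population, informative=[0, 2])
for chromosome, (accuracy, n_channels) in front:
    print(chromosome, accuracy, n_channels)
```

A complete NSGA-II would add ranking into successive fronts, crowding distance, and the usual selection, crossover and mutation operators on top of this dominance test.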
Article
In this study we explore how different levels of emotional intensity (Arousal) and pleasantness (Valence) are reflected in electroencephalographic (EEG) signals. We performed the experiments on EEG data of 32 subjects from the DEAP public dataset, where the subjects were stimulated using 60-second videos to elicit different levels of Arousal/Valence and then self-reported a rating from 1 to 9 using the Self-Assessment Manikin (SAM). The EEG data was pre-processed and used as input to a Convolutional Neural Network (CNN). First, the 32 EEG channels were used to compute the maximum accuracy obtainable for each subject, as well as to create a single model using data from all the subjects. The experiment was repeated using one channel at a time, to see whether specific channels contain more information for discriminating between Low vs High Arousal/Valence. The results indicate that using one channel yields lower accuracy than using all 32 channels. An optimization process for EEG channel selection was then designed with the Non-dominated Sorting Genetic Algorithm II (NSGA-II), with the objective of obtaining optimal channel combinations with high recognition accuracy. The genetic algorithm searches over channel combinations using a chromosome representation of all 32 channels, and the EEG data selected by each chromosome in the successive populations are tested iteratively against two unconstrained objectives: to maximize classification accuracy and to reduce the number of EEG channels required for classification. The best combinations obtained from the Pareto front suggest that as few as 8-10 channels can fulfill this condition and provide the basis for a lighter design of EEG systems for emotion recognition. In the best case, the results show accuracies of up to 1.00 for Low vs High Arousal using 8 EEG channels, and 1.00 for Low vs High Valence using only 2 EEG channels. These results are encouraging for research and healthcare applications that will require automatic emotion recognition with wearable EEG.
... The frequencies in EEG signals separate into five frequency bands that correspond to the subject's condition. The frequency bands can vary slightly, depending on the laboratory or recording device used [14]. In addition, the literature disagrees on which band is associated with which frequencies. ...
... Implying that the brain is never entirely at rest, so a resting state refers to when there is no goal-directed neuronal action with the integration of the external environment and when the subject is not actively engaged in sensory or cognitive processing [14]. ...
... The reference node should be independent of the other electrodes to reduce the noise, as mentioned in section 2.3. As this is not possible, creating a synthetic reference can help with enhancing the signal-to-noise ratio, where Common Average Reference (CAR) [14] is such a method. CAR removes simultaneously-recorded common information from all electrodes. ...
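As a minimal sketch of the re-referencing idea in the excerpt above: CAR subtracts, at every time sample, the mean across all electrodes from each electrode. The function name and the list-of-lists layout (one list of samples per channel) are assumptions made for this example, not taken from the cited work.

```python
def common_average_reference(eeg):
    """Apply a Common Average Reference.

    eeg: list of channels, each a list of samples of equal length.
    Returns a new list of channels with the per-sample mean across
    channels subtracted from every channel.
    """
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    # Mean over channels, computed separately for each time sample.
    mean_per_sample = [sum(channel[t] for channel in eeg) / n_channels
                       for t in range(n_samples)]
    return [[channel[t] - mean_per_sample[t] for t in range(n_samples)]
            for channel in eeg]


# Three channels, three samples each (arbitrary illustrative values).
eeg = [[1.0, 2.0, 3.0],
       [3.0, 2.0, 1.0],
       [5.0, 5.0, 5.0]]
referenced = common_average_reference(eeg)
print(referenced)
```

After re-referencing, the channels sum to zero at every sample, which is a quick sanity check when applying CAR to real recordings.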
Experiment Findings
This work tests a possible Brain-Computer Interface (BCI) design that can be used for communication by persons with locked-in syndrome (LIS). Persons diagnosed with LIS are conscious and awake but trapped within their bodies, unable to move any muscle except their eyes. The tested design utilizes eye movement and color detection recognized in recorded electroencephalography (EEG) signals. The main challenge and motivation is to create an accurate and fast-predicting BCI that will not exhaust the user. To test the proposal, an EEG dataset from 33 subjects was collected. The EEG data were collected from 8 EEG channels while the subject was exposed to two different protocols based on eye movement. Both protocols have five classes: a rest class and four task classes. The first protocol utilized eye movement and color perception, while the second used the same elements and added image association. Aiming for a real-time implementation, the problem was divided into two challenges solved with two different models. The first model was designed to differentiate between the resting state and any other task. Once a task is identified, the second model, which differentiates between the four task classes, is intended to be used. A state-of-the-art Convolutional Neural Network (CNN) design was applied for classification, and optimization of its hyperparameters was attempted with the hyperparameter search algorithm Hyperband. Two different experiments were conducted: creating models based on individual subject data and creating models with cross-subject data. The highest accuracies obtained with cross-subject data were 80.6% for the resting-state vs task model and 65.1% for the 4-class model. The experimental results obtained with the dataset therefore show that the BCI design is possible to implement. However, more advanced research and development will be necessary before the BCI can be implemented in a real-time application for a person diagnosed with LIS.
... These signals have a wave amplitude that ranges from 0.5 to 100 µV [21]. Table 2.1 displays the various wave types, which can be linked to the individual's condition [22]. The associated states can be used to make better judgments on which frequency ranges are more or less important to our task. ...
... 1: Frequency bands of the brain, with the associated frequency ranges and states where the wave is most prominent [22] ...
Preprint
This work aims to develop a Brain-Computer Interface (BCI) for communication for patients with Locked-in Syndrome (LIS), based on EEG signals evoked by RGB colors. The study investigates the differences in classification performance between models based on single individuals and general models that are optimized on a test subject, also known as transfer models. The data set used in the project was collected in Helsinki at Aalto University. The data was captured using a 58-channel cap from antNeuro, with 31 subjects shown red, green, and blue in a random order, and an additional gray color for capturing the baseline EEG signals. Each run of the experiment lasted roughly 25 minutes, capturing 140 responses to each primary color and 420 responses to gray. The data has been processed in different ways to remove ocular artifacts from the EEG signals, in order to investigate the effect of different processing techniques on classification accuracy. The methods used were Independent Component Analysis (ICA), Artifact Subspace Reconstruction (ASR), Signal-Space Projection (SSP), and a modified, online version of SSP. For classification, a Convolutional Neural Network (CNN) known as EEGNeX was used, as this network has been shown to perform well on classifying raw EEG signals. The results show that models based on single individuals perform best, with a best classification accuracy of 87%. This is expected, as there are large individual differences in EEG responses. Models based on transfer learning do not perform as well, the best accuracy obtained being 84.8%, but the transfer models are able to generalize well from very small amounts of data. Using only 5 minutes of training data, the transfer models obtain a classification accuracy 10% higher than the corresponding general models that are not optimized on single individuals. This, in addition to the fact that transfer models seem to produce low subject variation, indicates that using transfer learning for this classification problem might work well in the future.
Preface: This project is the continuation of previous master's and project theses [1, 2, 3, 4, 5, 6], which have paved the way for the work presented in this thesis. The project was proposed by Professor Marta Molinas at the Norwegian University of Science and Technology (NTNU) under the Department of Engineering Cybernetics. The project has been a collaboration between Vegard Omsland and myself, and our project theses will, therefore, touch on many similar topics.
... Experimentally and, based on previous research findings, we used the mother wavelet biorthogonal 2.2 [24], [25]. ...
... The authors in [9] also found differences in the posterior cortical area during NREM and REM: a decrease in low frequencies is related to the report of dream experiences, and an increase in low frequencies is associated with no experiences. In this sense, future research should look for a relationship with previous findings through the use of high-density EEG and channel selection algorithms [24], [25], [30], [31]. ...
Conference Paper
We explored the automatic classification of dreams with emotional content, which were collected by awakening 38 subjects after they had entered Rapid Eye Movement (REM) sleep; the dreams were recorded using 6 electroencephalographic (EEG) channels. We used the discrete wavelet transform for feature extraction and well-known classification algorithms, such as gradient boosting and random forest, as well as a convolutional neural network, to create subject-independent models in different experimental setups. When creating a model to classify dreams with neutral emotion versus dreams with positive/negative emotion, we obtained accuracies of up to 0.66 ± 0.02. We classified dreams with positive versus negative emotional content, obtaining accuracies of up to 0.64 ± 0.03. We were also able to classify dreamless sleep versus sleep with dreams with accuracies of up to 0.85 ± 0.02, and obtained similar accuracies using 2-3 channels selected by the Non-dominated Sorting Genetic Algorithm II. Our results indicate that the proposed methods can classify dream-containing EEG signals with high accuracies. These are encouraging results towards the development of automatic methods that can facilitate the study of emotions in dreams and provide insight into the human psyche to address symptoms of psychiatric and sleep disorders.
... This is relevant because using IAPS/OASIS values may contribute to future work on creating subject-independent models, since presenting the same image to different subjects may yield different SAM values. As in our previous research on EEG-related tasks [7,24], we compare both datasets using a pre-processing step, a DWT-based feature extraction method and a random forest classifier. ...
... Previous research has considered decomposing the EEG signals into different sub-bands and then extracting a set of features, and in some cases this has shown high performance. To analyze how it works on the tested datasets, we applied DWT to decompose the EEG signals into 5 sub-bands (4 arrays of detail coefficients and 1 approximation), then for each sub-band we computed 10 features: skewness, kurtosis, instantaneous energy, Teager energy, Hjorth mobility, Hjorth complexity, Sevcik fractal dimension, Higuchi fractal dimension, Katz fractal dimension, and Petrosian fractal dimension [7,24]. ...
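The sub-band feature pipeline described in the excerpt above can be sketched in plain Python as follows. Several choices here are assumptions made for illustration and are not taken from the cited papers: a Haar wavelet is used as a simple stand-in for the mother wavelet, only three of the ten listed features are computed, the energy features use one common log-scaled formulation, and a small constant `EPS` is added to avoid taking the logarithm of zero.

```python
import math

EPS = 1e-12  # assumption: guards log10 against an all-zero sub-band


def haar_dwt_level(x):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def decompose(x, levels=4):
    """Return [D1, ..., D_levels, A_levels]: detail arrays plus one approximation."""
    bands, approx = [], list(x)
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def instantaneous_energy(x):
    """log10 of the mean squared amplitude."""
    return math.log10(sum(v * v for v in x) / len(x) + EPS)


def teager_energy(x):
    """log10 of the mean absolute Teager-Kaiser operator output."""
    total = sum(abs(x[i] ** 2 - x[i - 1] * x[i + 1])
                for i in range(1, len(x) - 1))
    return math.log10(total / len(x) + EPS)


def petrosian_fd(x):
    """Petrosian fractal dimension from sign changes of the first difference."""
    diff = [b - a for a, b in zip(x, x[1:])]
    sign_changes = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    n = len(x)
    return math.log10(n) / (math.log10(n)
                            + math.log10(n / (n + 0.4 * sign_changes)))


# Synthetic 2-second signal at an assumed 128 Hz: 5 Hz plus 40 Hz components.
signal = [math.sin(2 * math.pi * 5 * t / 128)
          + 0.5 * math.sin(2 * math.pi * 40 * t / 128) for t in range(256)]
bands = decompose(signal, levels=4)
features = [[instantaneous_energy(b), teager_energy(b), petrosian_fd(b)]
            for b in bands]
print(len(bands), len(features[0]))
```

With 4 decomposition levels this yields the 5 sub-bands mentioned in the excerpt; concatenating the per-band feature lists across channels would give the feature vector passed to a classifier such as a random forest.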
Conference Paper
This study compares the automatic classification of emotions, based on the level of arousal and valence self-reported with the Self-Assessment Manikin (SAM), when subjects were exposed to videos or images. The classification is performed on electroencephalographic (EEG) signals from the DEAP public dataset and a dataset collected at the University of Tsukuba, Japan. The experiments were defined to classify low versus high arousal/valence using a Convolutional Neural Network (CNN). The results show a higher performance when the subjects were exposed to videos: using the DEAP dataset we obtained an area under the receiver operating characteristic curve (AUROC) of 0.844 ± 0.008 and 0.836 ± 0.009 for classifying low versus high arousal and valence, respectively. In contrast, when subjects were stimulated with images, the obtained performance was 0.621 ± 0.007 for both arousal and valence classification. The difference was confirmed by repeating the experiments with a method based on the Discrete Wavelet Transform (DWT) for feature extraction and a random forest for classification. Image-based stimulation may help to better understand low and high arousal/valence when analyzing event-related potentials (ERP); however, according to the obtained results, performance for classification purposes is higher with video-based stimulation.
... The latter is usually a significant problem when dealing with EEG signals [31]. This can, for example, be done the same way as presented in [32]. ...
Preprint
This work presents two approaches for epileptic seizure detection, one patient-independent and one patient-dependent. Feature and channel reduction was performed for the patient-independent approach. An accuracy between 95.9% and 100% was obtained for the patient-dependent approach, depending on which machine learning method was used. Accuracies of 97.6%, 96.4% and 88.4% were obtained for the patient-independent approach using 1-3 features and one channel, depending on which machine learning method was used.
Conference Paper
A new EEG concept is envisioned to realize a low-cost, real-time and flexible EEG solution for everyone. This new EEG concept will be based on an optimized design with a reduced number of channels and the use of wireless, dry, non-invasive active electrodes to support portability and ease of use. While a laboratory setting and research-grade EEG equipment ensure a controlled environment and high-quality multiple-channel EEG recording, there are applications, situations, and populations for which this is not suitable. Conventional EEG is challenged by high computational cost, high density, immobility of equipment and the use of inconvenient conductive gels or saline solutions. One consequence of high-density EEG is that real-time interpretation is not available today. Technological advancements in dry sensor systems have opened possibilities for developing wireless and portable EEG systems with dry electrodes that reduce many of these barriers. While being portable and relying on dry-sensor technology, such a system will be expected to produce recordings of comparable quality to a research-grade EEG system but with wider scope and capabilities than conventional lab-based EEG equipment. In short, a single more intelligent active EEG electrode could defeat high-density EEG. Through this new concept, the range of applications of EEG signals will be expanded from clinical diagnosis and research to health care, to a better understanding of cognitive processes, to learning and education, to flexible neurofeedback, and to the currently hidden or unknown properties behind ordinary human activity and ailments (e.g., acute chronic pain, resting state, walking, complex cognitive activity, etc.). The effect of both electrode localization and the number of electrodes will be explored by gradually removing electrode information, taking into account important characteristics: sex, age, hemisphere lateralization, intelligence quotient, and the paradigm used. This will make it possible to materialize a low-cost EEG device within the reach of everyone. A low-density EEG device with dry electrodes will take less time to install, will be more user-friendly, will consume less power and can be used for a prolonged time, all at a lower cost.
Research Proposal
FlexEEG anticipates a new low-density EEG scanning concept based on dry electrodes that will bring real-time brain imaging from the scalp signals into the hands of the user. This will materialize into a real-time Brain Computer Interface (BCI) with brain mapping capabilities. FlexEEG will address the hardware and software challenges together in an embedded design solution that will merge dry electrode-amplifier with the brain mapping tool into a wireless digital EEG sensor. To achieve this, it will exploit methods of inverse problems, path tracking and integrated circuit design for EEG scanning that can attain comparable quality to high-density EEG, to be tested in infants and intensive care units. FlexEEG will have significant impact in expanding the use of EEG brain mapping from research to daily clinical use and to domains of cognitive development, intensive care medicine and rehabilitation.
Article
Drowsiness is a leading cause of traffic and industrial accidents, costing lives and productivity. Electroencephalography (EEG) signals can reflect awareness and attentiveness, and low-cost consumer EEG headsets are available on the market. The use of these devices as drowsiness detectors could increase the accessibility of safety and productivity-enhancing devices for small businesses and developing countries. We conducted a systemic review of currently available, low-cost, consumer EEG-based drowsiness detection systems. We sought to determine whether consumer EEG headsets could be reliably utilized as rudimentary drowsiness detection systems. We included documented cases describing successful drowsiness detection using consumer EEG-based devices, including the Neurosky MindWave, InteraXon Muse, Emotiv Epoc, Emotiv Insight, and OpenBCI. Of 46 relevant studies, approximately 27 reported an accuracy score. The lowest of these was the Neurosky MindWave, with a minimum of 31%. The second lowest accuracy reported was 79.4% with an OpenBCI study. In many cases, algorithmic optimization remains necessary. Different methods for accuracy calculation, system calibration, and different definitions of drowsiness made direct comparisons problematic. However, even basic features, such as the power spectra of EEG bands, were able to consistently detect drowsiness. Each specific device has its own capabilities, tradeoffs, and limitations. Widely used spectral features can achieve successful drowsiness detection, even with low-cost consumer devices; however, reliability issues must still be addressed in an occupational context.
Article
Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves and in the first imaging of a black hole. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.
Article
We present a new approach for a biometric system based on resting-state electroencephalographic (EEG) signals that can identify a subject and reject intruders with a minimal subset of EEG channels. To select features, we first use the discrete wavelet transform (DWT) or empirical mode decomposition (EMD) to decompose the EEG signals into a set of sub-bands, and for each sub-band we compute the instantaneous and Teager energy and the Higuchi and Petrosian fractal dimensions. The obtained features are used as input for the local outlier factor (LOF) algorithm to create a model for each subject, with the aim of learning from it and rejecting instances not related to the subject in the model. In search of a minimal subset of EEG channels, we used a channel-selection method based on the non-dominated sorting genetic algorithm (NSGA)-III, designed with the objectives of minimizing the required number of EEG channels and increasing the true acceptance rate (TAR) and true rejection rate (TRR). This method was tested on EEG signals from 109 subjects of the public motor movement/imagery dataset (EEGMMIDB) using the resting-state with the eyes-open and the resting-state with the eyes-closed. We were able to obtain a TAR of 1.000 ± 0.000 and TRR of 0.998 ± 0.001 using 64 EEG channels. More importantly, with only three channels, we were able to obtain a TAR of up to 0.993 ± 0.01 and a TRR of up to 0.941 ± 0.002 for the Pareto-front, using NSGA-III and DWT-based features in the resting-state with the eyes-open. In the resting-state with the eyes-closed, the TAR was 0.997 ± 0.02 and the TRR 0.950 ± 0.05, also using DWT-based features from three channels. These results show that our approach makes it possible to create a model for each subject using EEG signals from a reduced number of channels and reject most instances of the other 108 subjects, who are intruders in the model of the subject under evaluation.
Furthermore, the candidates obtained throughout the optimization process of NSGA-III showed that it is possible to obtain TARs and TRRs above 0.900 using LOF and DWT- or EMD-based features with only one to three EEG channels, opening the way to testing this approach on bigger datasets to develop a more realistic and usable EEG-based biometric system.
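The trade-off described in this abstract (fewer channels versus higher TAR and TRR) can be illustrated with a small Pareto-dominance filter over binary channel masks. This is only a sketch of the non-dominated filtering idea: it is not NSGA-III itself (there are no reference points, crossover, or mutation here), and the helper names, the 8-channel masks, and the TAR/TRR values are hypothetical and not results from the paper.

```python
def objectives(mask, tar, trr):
    """Map a candidate to a tuple of objectives that are all minimized.

    mask: tuple of 0/1 flags, 1 if the channel is used.
    TAR and TRR are to be maximized, so their negatives are minimized.
    """
    return (sum(mask), -tar, -trr)


def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    objs = [objectives(*c) for c in candidates]
    return [
        c
        for i, c in enumerate(candidates)
        if not any(dominates(objs[j], objs[i]) for j in range(len(objs)) if j != i)
    ]


# Hypothetical candidates: (channel mask, TAR, TRR) for an 8-channel montage.
all_channels = ((1, 1, 1, 1, 1, 1, 1, 1), 1.00, 0.99)
three_channels = ((1, 0, 0, 1, 0, 0, 1, 0), 0.99, 0.94)
four_channels = ((1, 0, 0, 1, 0, 1, 1, 0), 0.98, 0.93)  # worse than three_channels on every objective
one_channel = ((0, 0, 0, 1, 0, 0, 0, 0), 0.91, 0.90)

front = pareto_front([all_channels, three_channels, four_channels, one_channel])
```

In this toy example the four-channel candidate is discarded because the three-channel candidate uses fewer channels and has higher TAR and TRR, while the other three candidates remain as distinct trade-offs.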
Chapter
Epilepsy is a well-known neurological disorder that affects more than 2% of the world’s population. Irregular, excessive neuronal activity in the human brain causes the onset of epileptic seizures. Electroencephalographic (EEG) signals are the most commonly examined signals for detecting seizure onset. However, an EEG signal contains a huge amount of complex information and is very difficult to analyze manually. Over the decades, a lot of research has focused on the development of automated epilepsy diagnosis systems. These systems depend on sophisticated feature extraction and classification techniques. The paper presents a generalized review and performance comparison of the work reported over a decade in the area of automated epilepsy diagnosis systems, with the aim of guiding future researchers.
Article
Dry electrodes are a promising solution for prolonged EEG signal acquisition, whereas wet electrodes may lose their signal quality in the same situation and require skin preparation for set-up. Here, we review the impedance and noise of passive and active dry EEG electrodes. In addition, we compare noise and input impedance of the EEG amplifiers. As there are multiple definitions of impedance in each EEG system, they are all first defined. Electrodes must be compatible with amplifiers to accurately record EEG signals. This implies that their impedance plays a significant role in amplifier compatibility and affects total input-referred noise. Therefore, we review the impedance and noise of state-of-the-art amplifiers and electrodes. Furthermore, we compare the various structures and materials used and their final impedance to that of wet electrodes. Finally, we compare state-of-the-art electrodes and amplifiers to the standards of the IFCN and IEC80601-2-26. We investigate bottlenecks and propose a guideline for future work on passive and active dry electrodes, as well as EEG amplifiers.
Article
Brain-computer interface (BCI) systems based on motor imagery (MI) usually record multichannel electroencephalographic (EEG) signals. However, EEG signals recorded in multi-channel mode usually contain much redundant information and many artifacts. Therefore, selecting a few effective channels from all available channels may be a means to improve the performance of MI-based BCI systems. We proposed a channel evaluation parameter called position priori weight-permutation entropy (PPWPE), which includes amplitude information and position information of a channel. According to the order of PPWPE values, we initially selected the half of the channels with the largest PPWPE values from all sampling electrode channels. Then, the binary gravitational search algorithm (BGSA) was used to search for an optimal channel combination. The features were extracted by the common spatial pattern (CSP) method from the final selected channels, and the classifier was trained with a support vector machine. The PPWPE + BGSA + CSP channel selection method was validated on two data sets. Results showed that the PPWPE + BGSA + CSP method obtained better mean classification accuracy (88.0% vs. 57.5% for Data set 1 and 91.1% vs. 79.4% for Data set 2) than the All-C + CSP method. The PPWPE + BGSA + CSP method can achieve higher classification accuracy with fewer selected channels. This method has great potential to improve the performance of MI-based BCI systems.
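For orientation, the sketch below computes plain permutation entropy (the Bandt-Pompe ordinal-pattern measure) for a single channel. It is not the PPWPE parameter from the abstract, whose amplitude and position weighting is not specified here; the function name, the default `order` and `delay`, and the example signal are assumptions chosen for illustration.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns found in a 1-D signal.

    order: number of samples per pattern; delay: spacing between them.
    With normalize=True the result is divided by log(order!), so it lies in [0, 1].
    Ties inside a window are resolved by sample position.
    """
    n_patterns = len(signal) - (order - 1) * delay
    if n_patterns <= 0:
        raise ValueError("signal too short for this order and delay")
    counts = Counter()
    for start in range(n_patterns):
        window = [signal[start + k * delay] for k in range(order)]
        # The pattern is the ranking of positions that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    entropy = sum(-(c / n_patterns) * math.log(c / n_patterns) for c in counts.values())
    if normalize:
        entropy /= math.log(math.factorial(order))
    return entropy


# Short illustrative series with 4 rising and 2 falling pairs at order 2.
example_pe = permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2)
```

A strictly monotonic signal yields an entropy of zero because only one ordinal pattern occurs, while irregular signals approach one after normalization.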
Article
In this paper, we present an optimal channel selection method to improve common spatial pattern (CSP) related features for motor imagery (MI) classification. In contrast to existing channel selection methods, in which channels significantly contributing to the classification in terms of the signal power are selected, distinctive channels in terms of correlation coefficient values are selected in the proposed method. The distinctiveness of a channel is quantified by the number of channels with which it yields large difference in correlation coefficient values for binary motor imagery (MI) tasks, rather than by the largeness of the difference itself. For each distinctive channel, a group of channels is formed by gathering strongly correlated channels and the Fisher score is computed using the feature output, based on the filter-bank CSP (FBCSP) exclusively applied to the channel group. Finally, the channel group with the highest Fisher score is chosen as the selected channels. The proposed method selects the fewest channels on average and outperforms existing channel selection approaches. The simulation results confirm performance improvement for two publicly available BCI datasets, BCI competition III dataset IVa and BCI competition IV dataset I, in comparison with existing methods.
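The distinctiveness count described above can be sketched as follows: for each channel, count how many other channels show a large change in correlation coefficient between the two MI tasks. This is a simplified illustration under stated assumptions. It uses one recording per task rather than averages over trials, the `threshold` value is arbitrary, and the later grouping, FBCSP, and Fisher-score steps are omitted. All names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return cov / (norm_x * norm_y)


def distinctiveness(task_a, task_b, threshold=0.5):
    """For each channel, count the partners whose correlation differs strongly between tasks.

    task_a, task_b: lists of channels, each channel a list of samples.
    threshold: assumed cut-off on |r_a - r_b| for a "large" difference.
    """
    n_channels = len(task_a)
    counts = [0] * n_channels
    for i in range(n_channels):
        for j in range(n_channels):
            if i == j:
                continue
            r_a = pearson(task_a[i], task_a[j])
            r_b = pearson(task_b[i], task_b[j])
            if abs(r_a - r_b) > threshold:
                counts[i] += 1
    return counts


# Hypothetical 3-channel toy recordings for two tasks.
toy_task_a = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 4]]
toy_task_b = [[4, 3, 2, 1], [2, 4, 6, 8], [4, 2, 3, 1]]
toy_counts = distinctiveness(toy_task_a, toy_task_b)
```

In the toy data the second channel changes its correlation with both other channels between tasks, so it receives the highest count and would be the most distinctive channel under this simplified criterion.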
Article
Brain-Computer Interface (BCI), in essence, aims at controlling different assistive devices through the utilization of brain waves. It is worth noting that the application of BCI is not limited to medical applications, and hence, the research in this field has gained due attention. Moreover, the significant number of related publications over the past two decades further indicates the consistent improvements and breakthroughs that have been made in this particular field. Nonetheless, it is also worth mentioning that with these improvements, new challenges are constantly discovered. This article provides a comprehensive review of the state-of-the-art of a complete BCI system. First, a brief overview of electroencephalogram (EEG)-based BCI systems is given. Secondly, a considerable number of popular BCI applications are reviewed in terms of electrophysiological control signals, feature extraction, classification algorithms, and performance evaluation metrics. Finally, the challenges to the recent BCI systems are discussed, and possible solutions to mitigate the issues are recommended.