CNN and LSTM based Ensemble Learning for
Human Emotion Recognition using EEG Recordings
Abhishek Iyer · Srimit Sritik Das · Reva Teotia · Shishir Maheshwari · Rishi Raj Sharma
Abstract Emotion is a significant parameter in daily life and is considered an
important factor for human interactions. Human-machine interactions and
their advanced stages, such as humanoid robots, essentially require emotional
investigation. This paper proposes a novel method for human emotion recognition
using electroencephalogram (EEG) signals. We have considered three emotions,
namely neutral, positive, and negative. The EEG signals are separated into
five frequency bands according to EEG rhythms, and the differential entropy
is computed over the different frequency band components. A convolutional
neural network (CNN) and long short-term memory (LSTM) based hybrid
model is developed for accurate emotion detection. Further, the extracted
features are fed to all three models (CNN, LSTM, and hybrid) for emotion
recognition. Finally, an ensemble model combines the predictions of all three
models. The proposed approach is validated on two datasets, namely SEED
and DEAP, for EEG-based emotion analysis. The developed method achieved
97.16% accuracy on the SEED dataset for emotion classification. The
experimental results indicate that the proposed approach is effective and yields
better performance than the compared methods for EEG-based emotion analysis.
Keywords Emotion recognition · EEG · Hybrid model · Differential entropy · LSTM
Abhishek Iyer · Srimit Sritik Das · Reva Teotia · Shishir Maheshwari
Department of Electrical and Electronics Engineering, Birla Institute of Technology and
Science, Pilani-333031, India
E-mail: f20181105@pilani.bits-pilani.ac.in, f20180527@pilani.bits-pilani.ac.in,
f20190268@pilani.bits-pilani.ac.in, shishir.maheshwari@pilani.bits-pilani.ac.in
Rishi Raj Sharma
Department of Electronics Engineering, Defence Institute of Advanced Technology, Pune-
411025, India
E-mail: dr.rrsrrs@gmail.com
1 Introduction
Emotion is a pivotal factor in human life as it affects the working ability,
mental state, and judgment of human beings. Numerous experts have worked on
this topic in different disciplines such as psychology, cognitive science,
neuroscience, computer technology, brain-computer interfacing (BCI) [39], and
others. Electroencephalogram (EEG) based emotion recognition has created
substantial scope in these disciplines by capturing the actual affective state
[21]. BCI has numerous applications of EEG-based emotion recognition,
including humanoid robots. Most humanoid robots lack emotional capability, and
this field remains largely unexplored. In effective BCI, emotion recognition is
a major component, and it becomes complex due to the fuzzy nature of emotion.
Human emotion correlates with context, language, time, space, culture, and
other components. Therefore, absolute true labels for different emotions are
not attainable from EEG recordings, which complicates the problem [3].
Many authors have proposed emotion recognition methods based on facial
expression [51], gesture [31], [9], posture, speech [28], and other physical
signals. These types of data are easy to record but can also be easily
controlled, thereby falsifying the true emotion [15]. Controlling or mimicking
nervous-system-related signals is very difficult as they are activated
involuntarily [40], and only subject experts can control them. Therefore, the
true emotion signature can be observed in nervous-system-related recordings.
Several physiological recordings such as EEG, electrocardiogram (ECG) [35],
temperature, electromyogram (EMG) [5], respiration, and galvanic skin response
(GSR) can be used to study human emotion [40]. A detailed investigation of
brain activity under various emotions can assist in building accurate and
computationally efficient emotion recognition models. Recent advances in dry
electrode implementation [11], [13] and wearable devices promote EEG-based
emotion identification in real time for mental state monitoring [24], [36].
EEG-based emotion recognition is one of the key features required in
human-machine interaction (HMI) and humanoid robots. This study focuses on
EEG-based human emotion analysis, in which the electrical activity of the
brain is investigated during different emotions (neutral, positive, and
negative).
Many studies have been performed on EEG-based human emotion analysis,
attempting to establish a definitive relationship between EEG signals and
different emotions [29], [34]. EEG signal analysis is a very challenging task
as the signal is non-stationary [37]. In a real-time scenario, other signals
are added into the EEG recording and the signal-to-noise ratio (SNR) becomes
low. Matrix decomposition-based EEG signal analysis methods have been
proposed, but their high complexity makes real-time implementation difficult
[7], [42], [38]. Emotion-related stable patterns in EEG recordings are
observed in [55], which uses the DEAP dataset. The critical frequencies of
emotion and significant channel selection in EEG recordings are examined in
detail in [54]. This channel selection is useful for finding the electrode
positions for emotion analysis. Time-frequency based and various non-linear
features have been studied for EEG-based emotion recognition, achieving
59.06% accuracy (ACC) on DEAP data [21]. It has been suggested that the gamma
band in EEG recordings is more correlated with emotional function [19].
Many machine learning based architectures have been proposed for EEG-based
emotion examination. A bi-hemisphere based neural network was designed for EEG
emotion detection, and the experiment performed on the SEED dataset achieved
63.50% ACC [22]. Graph neural network based emotion recognition was performed
on the same dataset, reaching 89.23% ACC using the gamma band and 94.24% ACC
with all bands [56]. A regional asymmetric convolutional neural network (CNN)
based study was carried out on DEAP data and acquired 95% ACC for arousal and
valence emotion detection [6]. In most of these methods, existing models are
improved to achieve good classification of human emotion. The proposed
approach employs multiple models and develops a hybrid approach to attain
better ACC than the existing methods. The two models, based on CNN and long
short-term memory (LSTM), are hybridized, and the final prediction is improved
using an ensemble model.
The rest of the article is organized in the following manner. Section 2
presents the datasets. The proposed approach with its features is explained in
Section 3. The proposed hybrid model, along with the CNN and LSTM based
models, is presented in Section 4, which also includes the implementation of
ensemble learning. Results are explained in Section 5. Finally, the article is
concluded in Section 6.
2 Dataset
We have used two datasets for EEG-based emotion recognition. A detailed
explanation of both datasets is given in the following subsections.
2.1 SEED data
The database employed in the proposed approach has been obtained from the
Center for Brain-like Computing and Machine Intelligence (BCMI). We employed
the SJTU emotion EEG dataset (SEED) [54], [8]. The dataset contains EEG data
of 15 subjects (7 males and 8 females) recorded in three separate sessions,
each session having 15 trials. In each trial, the EEG signal is recorded while
the subject watches Chinese film clips eliciting three types of emotions,
namely positive, neutral, and negative. The duration of each film clip is
about 4 minutes, and two film clips targeting the same emotion are not shown
consecutively. The participants reported their emotional reactions to each
film clip by completing a questionnaire immediately after watching it. The EEG
signals are recorded using a 62-channel electrode cap according to the
international 10-20 system. The data is then down-sampled to 200 Hz to make
the system faster, and a 0-75 Hz band-pass filter is applied, which retains
all the EEG rhythm information.
4 Abhishek Iyer et al.
2.2 DEAP data
The DEAP data has been recorded for the analysis of human emotion using EEG
signals. It was recorded from 32 healthy participants aged between 19 and 37
years, of whom 16 were female. Each participant was exposed to 40 music
videos, each of one-minute duration with the same emotion throughout the video
length. The data comprises 40 channels, out of which the 32 EEG channels have
been investigated in this paper. The data is recorded with Biosemi ActiveTwo
devices at a sampling rate of 512 Hz. It is further downsampled to 128 Hz to
reduce the system complexity. The DEAP data provides 32 files, where each file
contains the 40-channel EEG recording of 40 videos of one minute duration
each.
3 Proposed approach
A block diagram of the proposed approach is shown in Fig. 1. All the subjects
were made to sit on a chair in the resting state and asked to watch videos
portraying different emotions. Simultaneously, EEG signals are recorded and
pre-processed. The differential entropy (DE) based features are computed in
five EEG rhythms; these features are explained in the next subsection.
Further, CNN and LSTM models are employed and combined to obtain the hybrid
model. Thereafter, an ensemble model is built on top of these models.
3.1 Features extraction
We have employed DE as a feature in the proposed approach. DE extends the idea
of Shannon entropy and is used to measure the complexity of a continuous
random variable. DE as a feature was first introduced to EEG-based emotion
recognition by Duan et al. [8]. It has been found to be better suited for
emotion recognition than traditional features, as it has a balanced ability to
discriminate EEG patterns between low- and high-frequency energy. DE features
extracted from EEG data provide stable and accurate information for emotion
classification [53]. The differential entropy is defined as:
h(Y) = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(y-\mu)^{2}}{2\sigma^{2}}} \log\left(\frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(y-\mu)^{2}}{2\sigma^{2}}}\right) dy = \frac{1}{2}\log\left(2\pi e \sigma^{2}\right)    (1)

where the time series Y obeys the Gaussian distribution N(µ, σ²).
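Under this Gaussian assumption, Eq. (1) reduces DE to a simple function of the signal variance. The following sketch (illustrative Python/NumPy, not the authors' code) verifies the closed form numerically with a plug-in variance estimate on a synthetic signal:

```python
import numpy as np

def gaussian_de(variance):
    """Closed-form differential entropy of a Gaussian, Eq. (1)."""
    return 0.5 * np.log(2 * np.pi * np.e * variance)

rng = np.random.default_rng(0)
sigma = 2.0
x = rng.normal(0.0, sigma, size=10_000)  # synthetic Gaussian "EEG" segment

print(f"analytic  DE: {gaussian_de(sigma ** 2):.4f}")   # true sigma
print(f"estimated DE: {gaussian_de(np.var(x)):.4f}")    # plug-in sample variance
```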
DE was employed to construct features in five frequency bands: delta (1-3 Hz),
theta (4-7 Hz), alpha (8-13 Hz), beta (14-30 Hz), and gamma (31-50 Hz). For
the SEED dataset, the extracted DE feature vector for a sample EEG signal has
310 dimensions, as there are 62 channels for each of the five frequency bands
[54]. Similarly, 32 channels are considered for the DEAP dataset in five EEG
sub-bands, which leads to a total of 160 DE features.

Fig. 1 Block diagram of the proposed system for EEG-based emotion recognition.
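As an illustration of this feature pipeline, the sketch below (an assumption-laden reconstruction, not the authors' code) band-filters a multi-channel segment into the five rhythms and computes per-channel DE under the Gaussian model of Eq. (1); with 62 channels this yields the 310-dimensional vector described above. The Butterworth filter and its order are assumptions, and the 200 Hz sampling rate follows the SEED description:

```python
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha": (8, 13),
         "beta": (14, 30), "gamma": (31, 50)}

def band_de_features(eeg, fs=200.0):
    """eeg: array of shape (channels, samples).
    Returns a (channels * len(BANDS),) DE feature vector."""
    feats = []
    for lo, hi in BANDS.values():
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        filtered = filtfilt(b, a, eeg, axis=1)              # band-limit each channel
        var = np.var(filtered, axis=1)                      # per-channel variance
        feats.append(0.5 * np.log(2 * np.pi * np.e * var))  # DE, Eq. (1)
    return np.concatenate(feats)

segment = np.random.randn(62, 4 * 200)   # a 4-s, 62-channel SEED-like segment
print(band_de_features(segment).shape)   # -> (310,)
```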
The various models, along with the hybrid model and the ensemble model, are
explained in the next section.
4 Models employed for emotion recognition in the proposed system
In the proposed work, the CNN and LSTM based models are initially employed for
emotion recognition. Thereafter, a hybrid model is proposed, which is a
combination of CNN and LSTM. Finally, an ensemble model of these three
proposed models is taken into consideration. All these models are explained in
this section.
4.1 CNN-based model
The idea behind CNNs bears a resemblance to traditional artificial neural
networks (ANNs), consisting of neurons that self-optimize through learning.
CNNs are powerful performers on large sequential data represented by matrices,
such as images broken down into their pixel values [46]. A small n×n kernel
slides over the entire feature matrix, performing convolutions over the
superposed space [12]. The feature map size can be kept consistent across
multiple convolutions using zero-padding. Functions like max pooling are
employed to reduce the amount of computation while still retaining the
important information [27]. As the feature maps pass through the different
convolutional layers, the filters learn to detect patterns and increasingly
abstract features.
EEG-based emotion classification using CNNs was also explored in [47]. Cascade
and parallel convolutional recurrent neural networks have been used for EEG
human-intended movement classification tasks [52]. Additionally, before
applying the CNN, EEG data can be converted to an image representation after
feature extraction [43]. However, the accuracy of emotion recognition using
only a CNN is not high.
The details of the CNN architecture employed in the proposed approach
are shown in Fig. 2. The CNN model consists of four convolutional (conv)
blocks with 64, 128, 256, and 512 filters, respectively. The conv filters have
kernel sizes of 5×5 and 3×3. All conv layers use padding and are followed by
maximum sub-sampling layers operating over 2×2 sub-windows, known as max
pooling layers. The network ends with three fully connected dense layers fed
to a c-way softmax [25] classification layer. ReLU activation is employed due
to its unity gradient, so that the maximum amount of error is passed during
back-propagation [1]. Dropout regularization is used after every layer, which
improves the performance of the model via a modest regularization effect [30].
Thereafter, the predictions of the CNN model are fed to the proposed ensemble
model for emotion recognition.

Fig. 2 CNN-based model architecture employed in the proposed approach for
EEG-based emotion recognition.
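A minimal Keras sketch of such a four-block CNN is given below. The filter counts, 2×2 pooling, dense-layer widths, and input shape follow the description above; the kernel-size assignment per block, dropout rates, optimizer, and the TensorFlow/Keras framework itself are assumptions:

```python
from tensorflow.keras import layers, models

def build_cnn(input_shape=(62, 265, 5), n_classes=3):
    """Four conv blocks (64/128/256/512 filters) with max pooling and
    dropout, then three dense layers and a softmax head."""
    model = models.Sequential([layers.Input(shape=input_shape)])
    for filters, k in [(64, 5), (128, 5), (256, 3), (512, 3)]:
        model.add(layers.Conv2D(filters, k, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2))   # 2x2 sub-sampling: 62x265 -> 3x16
        model.add(layers.Dropout(0.25))     # dropout rate is an assumption
    model.add(layers.Flatten())
    for units in (512, 256, 64):
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.Dropout(0.25))
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

build_cnn().summary()
```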
4.2 LSTM-based model
LSTM networks are modified recurrent neural networks (RNNs) capable of
learning long-term dependencies. An LSTM network is parametrized by weight
matrices from the input and the previous state for each of the gates, in
addition to the memory cell, which overcomes the vanishing/exploding gradient
issue [10].
We use the standard formulation of LSTMs with the logistic function (σ) [4] on
the gates and the hyperbolic tangent [2] on the activations. The input is of
shape 1325×62. The model has 4 LSTM layers with dropouts in between, and the
output is then passed to a fully connected network. A softmax activation
function [25] is used to predict the final output. The block diagram of the
LSTM architecture is shown in Fig. 3.
Fig. 3 LSTM-based architecture employed in the proposed approach for EEG-based
emotion recognition.
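A corresponding sketch of the stacked-LSTM model is shown below. The input shape (1325, 62) and the depth of 4 LSTM layers with interleaved dropout follow the paper; the layer widths, dropout rates, and framework are assumptions:

```python
from tensorflow.keras import layers, models

def build_lstm(input_shape=(1325, 62), n_classes=3):
    """Four stacked LSTM layers with dropout, flattened into a
    fully connected head ending in softmax."""
    model = models.Sequential([layers.Input(shape=input_shape)])
    for units in (256, 128, 128, 256):        # widths are assumptions
        model.add(layers.LSTM(units, return_sequences=True))
        model.add(layers.Dropout(0.3))
    model.add(layers.Flatten())               # (1325, 256) -> 339200 features
    for units in (512, 256, 64):
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.Dropout(0.3))
    model.add(layers.Dense(n_classes, activation="softmax"))
    return model

build_lstm().summary()
```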
4.3 Hybrid model
The hybrid model combines more than one base model in series. Fig. 4 shows the
structure of the hybrid model employed in the proposed approach. The hybrid
model improves performance by capturing information that the individual models
leave undetected.
The first three blocks of the hybrid model are convolutional (conv) blocks.
Each conv block contains max pooling layers and dropout regularization to
avoid overfitting [30]. The output shape of the third conv block is 15×66×512,
while the input shape expected by the LSTM block is 66×7680. A reshape layer
is employed between the conv and LSTM blocks to resolve this dimensional
mismatch: in general, 2D conv blocks operate on inputs in R^3, while LSTM
inputs lie in R^2. The LSTM network uses the tanh activation function [2] and
batch normalization [16]. The output of the LSTM block is passed to a fully
connected network that uses softmax [25] to calculate the output
probabilities.
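The reshape step can be sketched as follows: treating the 66 width positions of the (15, 66, 512) conv output as timesteps gives 15 × 512 = 7680 features per step, matching the 66×7680 LSTM input quoted above. The Permute/Reshape bridge and the LSTM width are assumptions:

```python
from tensorflow.keras import layers, models

# Bridging the conv and LSTM blocks of the hybrid model (sketch).
conv_out = layers.Input(shape=(15, 66, 512))       # third conv block's output
seq = layers.Permute((2, 1, 3))(conv_out)          # -> (66, 15, 512)
seq = layers.Reshape((66, 15 * 512))(seq)          # -> (66, 7680) sequence
out = layers.LSTM(128)(seq)                        # width 128 is an assumption

bridge = models.Model(conv_out, out)
bridge.summary()
```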
4.4 Ensemble learning-based model
Ensemble learning is mainly of two types, namely homogeneous and
heterogeneous. It combines the predictions from multiple models and integrates
the individual strengths of the base models, resulting in robustness and
improved performance of the overall approach [50]. Ensemble learning is
homogeneous when the base models are of the same type. In the proposed
approach, ensemble learning is heterogeneous as the base models are different.

Once these models are trained, a statistical method is used to combine the
predictions of the different models. Common combination methods include
bagging, boosting, and stacking. We have employed stacking as it is suitable
for heterogeneous ensemble models [44]. Stacking is the process in which
separate models learn in parallel on the dataset, and a small meta-model,
usually a feed-forward neural network (FNN), combines the individual
predictions to produce the final outputs. Stacking introduces a meta-model
that receives the predictions of the base models as its input.
Fig. 4 Hybrid model employed in the proposed approach for EEG-based emotion
recognition.
Fig. 5 Ensemble model employed in the proposed approach for EEG-based emotion
recognition.
The meta-model [48] learns to produce the final prediction from the
base-model outputs. In addition to stacking, we have also investigated the max
function as a statistical method to combine the predictions. Fig. 5 shows the
block diagram of the ensemble model. The meta-model used in the stacking
method consists of 4 fully connected (FC) layers followed by a softmax
classifier [25].
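The sketch below illustrates both combination strategies: a small FC meta-model for stacking (4 FC layers plus softmax, as described; the layer widths are assumptions) and the max-function alternative:

```python
import numpy as np
from tensorflow.keras import layers, models

def build_meta_model(n_base=3, n_classes=3):
    """Stacking meta-model fed with the concatenated class probabilities
    of the CNN, LSTM, and hybrid models."""
    return models.Sequential([
        layers.Input(shape=(n_base * n_classes,)),
        layers.Dense(64, activation="relu"),   # 4 FC layers; widths assumed
        layers.Dense(32, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(8, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

def stack_predict(meta, base_preds):
    """base_preds: list of (n_samples, n_classes) probability arrays."""
    return meta.predict(np.concatenate(base_preds, axis=1)).argmax(axis=1)

def max_combine(base_preds):
    """Max-function alternative: per-class maximum over the base models."""
    return np.stack(base_preds).max(axis=0).argmax(axis=1)
```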
5 Results & discussion
The proposed approach has been evaluated on two datasets, namely SEED and
DEAP. The performance of the various models has been investigated using k-fold
cross-validation [33] with k = 10. The individual performances of the CNN,
LSTM, and hybrid models have been obtained, along with the performance of the
ensemble model. Each model is trained for 60 epochs with a batch size of 64.
The learning rate (LR) is not kept fixed because, with a fixed LR, the loss
saturates and the performance of the model stops improving. To overcome this
limitation, we employ an LR annealer, which makes the learning rate a variable
parameter. It should be noted that we have used the same features and
experimental setup for both datasets.
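A training-loop sketch reflecting this setup is given below (10-fold CV, 60 epochs, batch size 64, LR annealing on loss plateau). `build_cnn` is the earlier CNN sketch; the synthetic data arrays, annealing factor, and patience are placeholders and assumptions:

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import KFold
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Synthetic stand-in data; in practice X holds the DE feature maps and
# y the one-hot emotion labels (shapes follow the SEED-style setup).
X = np.random.randn(50, 62, 265, 5).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, 50), 3)

accs = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True,
                                 random_state=42).split(X):
    model = build_cnn()  # from the CNN sketch above
    annealer = ReduceLROnPlateau(monitor="loss", factor=0.5, patience=3,
                                 min_lr=1e-6)   # lowers LR when loss plateaus
    model.fit(X[train_idx], y[train_idx], epochs=60, batch_size=64,
              callbacks=[annealer], verbose=0)
    accs.append(model.evaluate(X[test_idx], y[test_idx], verbose=0)[1])

print(f"10-fold mean ACC = {np.mean(accs):.4f} +/- {np.std(accs):.4f}")
```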
5.1 Experimental results for SEED data
The performance of the individual models is measured by evaluating parameters
such as weighted average precision (WAP), weighted average sensitivity (WAS),
and weighted average F1 score (WAF1). The F1 score is a good metric to check
the stability of a model. Table 1 tabulates the performance parameters of the
individual models, the hybrid model, and the ensemble model for EEG emotion
recognition.
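These weighted averages correspond to scikit-learn's `average="weighted"` option; a small sketch (with an assumed integer label encoding) is shown below:

```python
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             matthews_corrcoef)

def weighted_metrics(y_true, y_pred):
    """Weighted-average metrics as reported in Table 1; labels are assumed
    integers (0: negative, 1: neutral, 2: positive)."""
    return {
        "MCC": matthews_corrcoef(y_true, y_pred),
        "WAP": precision_score(y_true, y_pred, average="weighted"),
        "WAS": recall_score(y_true, y_pred, average="weighted"),
        "WAF1": f1_score(y_true, y_pred, average="weighted"),
    }

print(weighted_metrics([0, 1, 2, 2, 1, 0], [0, 1, 1, 2, 1, 0]))
```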
The experimental results show that the CNN and LSTM models individually
achieve classification accuracies (ACC) of 89.53% and 89.99%, respectively.
The hybrid model achieves an ACC of 93.46%, while the ensemble model achieves
an ACC of 97.16% with stack-based ensemble learning. The results for SEED data
are tabulated in Table 1, from which it can be noticed that the ensemble-based
method provides improved performance over the other models. We believe this is
because the base models are not weak and provide good accuracy by themselves.
Fig. 7 shows the plots of the loss function and LR with respect to epochs.
With a fixed LR, the loss saturates after some epochs with no significant
further decrease, which results in poor model performance. On the other hand,
when we decrease the LR as the loss saturates, the loss tends to settle more
quickly and the system performance improves.
We have also shown box-and-whisker plots of ACC to shed more light on the
results. The inter-quartile range (IQR) is indicated by the box, with the
orange line showing where the median lies; the box covers all results from the
25th to the 75th percentile. The whiskers, drawn as solid black lines at the
top and bottom of the box, extend to the furthest results within 1.5×IQR. The
outliers, marked by circles, are results that did not fall within the whisker
range.
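For illustration, such a plot can be produced as sketched below; the per-fold accuracies here are synthetic placeholders drawn around the Table 1 means and standard deviations, not the actual experimental values:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
reported = {"CNN": (89.53, 1.78), "LSTM": (89.99, 1.92),
            "Hybrid": (93.46, 1.36), "Ensemble": (97.16, 1.08)}
# Synthetic 10-fold accuracies, for plot illustration only.
fold_accs = {name: rng.normal(mu, sd, 10)
             for name, (mu, sd) in reported.items()}

fig, ax = plt.subplots()
ax.boxplot(list(fold_accs.values()), whis=1.5)  # whiskers at 1.5 x IQR
ax.set_xticks(range(1, len(fold_accs) + 1))
ax.set_xticklabels(fold_accs.keys())
ax.set_ylabel("Accuracy (%)")
plt.show()
```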
We further compare the experimental results of the proposed approach with past
benchmark methodologies for emotion recognition on the SEED dataset. Table 2
tabulates the comparison of the proposed approach with other past benchmark
results.
Table 1 Classification performance of individual and ensemble model for EEG-based emo-
tion recognition on SEED data.
Model MCC WAP(%) WAS(%) WAF1(%) Avg ACC (%) STD (%)
CNN-based 0.819 89 88 88.49 89.53 1.78
LSTM-based 0.821 89 89 89.00 89.99 1.92
Hybrid 0.867 94 93 93.49 93.46 1.36
Ensemble 0.90 97 97 97.00 97.16 1.08
Fig. 6 Box-and-whisker plots of ACC achieved by the proposed (top-left)
CNN-based model, (top-right) LSTM-based model, (bottom-left) hybrid model, and
(bottom-right) ensemble stack model for EEG-based emotion recognition.
It can be observed that the proposed approach outperforms the previous
methodologies. It can also be noticed that the standard deviation (STD) of the
proposed approach is much lower than that of the other approaches tabulated in
Table 2, which reflects the repeatability and reproducibility of the proposed
approach.
5.2 Experimental results for DEAP data
We have also employed the DEAP dataset to evaluate the performance of the
proposed approach for EEG-based emotion analysis with the same features and
experimental setup. The performance of the proposed approach on the DEAP
dataset has been tabulated in Table 3.
Fig. 7 (a) Loss and (b) learning rate versus training epochs.
It can be observed from Table 3 that the ensemble model attains the best
performance compared to the individual models. The CNN-based, LSTM-based, and
hybrid models achieve classification accuracies of 63.50%, 63.89%, and 64.02%,
respectively. The tables demonstrate that the ensemble model achieves better
performance than the individual models. The performance of other existing
works on DEAP data with the same DE feature is compared in Table 4. It can be
observed from Table 4 that the proposed system attains better performance than
the existing methods for EEG-based emotion recognition.
Table 2 Comparison of previous benchmark methodologies for EEG-based emotion recog-
nition on SEED data.
Paper Model Feature ACC (%) STD (%)
He Li et al. [18] DAN DE 83.81 8.56
Wang et al. [49] PGCNN RASM 84.35 10.28
T. Song et al. [41] DGCNN DE 90.40 8.49
Hwang et al. [14] CNN TP-DE (cubic) 90.41 8.71
Wei Liu et al. [26] BDAE DE 91.01 8.91
W. Zheng et al. [53] GELM DE 91.07 7.54
Hao Tang et al. [45] Bimodal-LSTM PSD+DE 93.97 -
J.L. Qiu et al. [32] DCCA - 94.58 6.16
Liu J. et al. [23] CNN+SAE+DNN PCC 96.77 -
Proposed approach Ensemble (stack) DE 97.16 1.08
Table 3 Classification performance of individual and ensemble model for EEG-based emo-
tion recognition on DEAP data.
Classification Methods WAP(%) WAS(%) WAF1(%) Avg ACC (%) STD (%)
CNN 64 74 68 63.50 3.89
LSTM 64.3 74 68 63.89 3.78
Hybrid 64 75 69 64.02 3.84
Ensemble 65 75 70 65.00 3.57
In future work, we plan to extend this study by proposing new features for
effective emotion recognition from EEG signals. The SEED and DEAP datasets
will also be evaluated with the new features to further improve the existing
performance. We also intend to test the proposed model for other EEG-based
neuronal system development.
Table 4 Comparison of previous benchmark methodologies for EEG-based emotion recog-
nition on DEAP data.
Paper Classification Methods Feature Avg ACC (%) STD (%)
Lan et al. [17] Logistic regression DE 48.93 15.50
Li et al. [20] SVM DE 57.00 0.3
Proposed approach Ensemble(stack) DE 65.00 3.57
6 Conclusion
This paper proposes an ensemble learning based EEG emotion recognition system.
First, differential entropy features were extracted from different frequency
bands of EEG signals. Thereafter, these features were fed to the CNN and LSTM
based models. The hybrid model was developed by combining the sub-blocks of
the CNN and LSTM models, and the ensemble model was built on the CNN, LSTM,
and hybrid models. The experimental results suggest that the ensemble model
achieves better classification performance than the other models employed in
the proposed approach. The proposed ensemble model outperforms the compared
methodologies with 97.16% ACC for EEG-based emotion recognition on the SEED
dataset. The proposed method was also evaluated on the DEAP dataset and
obtains 65% ACC using the same features and model parameters. All the models
provided good accuracy individually and showed a much lower standard deviation
than the compared methods.
BCI is an emerging field that is highly reliant on the accurate, repeatable,
and efficient classification of brain waves, frequently recorded by EEG
methods. The experimental results suggest that the proposed approach is
suitable for this purpose and paves the way for upcoming research fields such
as humanoid robots, sophisticated prosthetics, and AI-assisted healthcare and
recovery. In the future, a hardware implementation of the proposed model can
be developed.
References
1. Agarap, A.F.: Deep learning using rectified linear units (relu) (2019)
2. Anastassiou, G.A.: Multivariate hyperbolic tangent neural network approximation.
Computers & Mathematics with Applications 61(4), 809–821 (2011)
3. Bos, D.O., et al.: EEG-based emotion recognition: The influence of visual and auditory
stimuli 56(3), 1–17 (2006)
4. Chen, Z., Cao, F., Hu, J.: Approximation by network operators with logistic activation
functions. Applied Mathematics and Computation 256, 565–571 (2015)
5. Cheng, B., Liu, G.: Emotion recognition from surface emg signal using wavelet transform
and neural network. In: Proceedings of the 2nd international conference on bioinfor-
matics and biomedical engineering (ICBBE), pp. 1363–1366 (2008)
6. Cui, H., Liu, A., Zhang, X., Chen, X., Wang, K., Chen, X.: Eeg-based emotion recogni-
tion using an end-to-end regional-asymmetric convolutional neural network. Knowledge-
Based Systems 205, 106243 (2020)
7. Dash, M., Liu, H.: Feature selection for classification. Intelligent data analysis 1(1-4),
131–156 (1997)
8. Duan, R.N., Zhu, J.Y., Lu, B.L.: Differential entropy feature for EEG-based emotion
classification. In: 6th International IEEE/EMBS Conference on Neural Engineering
(NER), pp. 81–84. IEEE (2013)
9. Glowinski, D., Camurri, A., Volpe, G., Dael, N., Scherer, K.: Technique for automatic
emotion recognition by body gesture analysis. In: 2008 IEEE Computer Society Con-
ference on Computer Vision and Pattern Recognition Workshops, pp. 1–6. IEEE (2008)
10. Greff, K., Srivastava, R.K., Koutnik, J., Steunebrink, B.R., Schmidhuber, J.: LSTM: A
search space odyssey. IEEE Transactions on Neural Networks and Learning Systems
28(10), 2222–2232 (2017)
11. Grozea, C., Voinescu, C.D., Fazli, S.: Bristle-sensors: low-cost flexible passive dry EEG
electrodes for neurofeedback and BCI applications. Journal of Neural Engineering 8(2),
025008 (2011)
12. Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X.,
Wang, G., Cai, J., Chen, T.: Recent advances in convolutional neural networks. Pattern
Recognition 77, 354–377 (2018)
13. Huang, Y.J., Wu, C.Y., Wong, A.M.K., Lin, B.S.: Novel active comb-shaped dry elec-
trode for eeg measurement in hairy site. IEEE Transactions on Biomedical Engineering
62(1), 256–263 (2014)
14. Hwang, S., Hong, K., Son, G., Byun, H.: Learning cnn features from de features for
eeg-based emotion recognition. Pattern Analysis and Applications 23(3), 1323–1335
(2020)
15. Kurbalija, V., Ivanović, M., Radovanović, M., Geler, Z., Dai, W., Zhao, W.: Emotion
perception and recognition: An exploration of cultural differences and similarities. Cog-
nitive Systems Research 52, 103–116 (2018)
16. van Laarhoven, T.: L2 regularization versus batch and weight normalization (2017)
17. Lan, Z., Sourina, O., Wang, L., Scherer, R., Müller-Putz, G.R.: Domain adaptation tech-
niques for eeg-based emotion recognition: A comparative study on two public datasets.
IEEE Transactions on Cognitive and Developmental Systems 11(1), 85–94 (2019)
18. Li, H., Jin, Y.M., Zheng, W.L., Lu, B.L.: Cross-subject emotion recognition using deep
adaptation networks. In: Neural Information Processing, pp. 403–413. Springer Inter-
national Publishing (2018)
19. Li, M., Lu, B.L.: Emotion classification based on gamma-band eeg. In: 2009 Annual
International Conference of the IEEE Engineering in medicine and biology society, pp.
1223–1226. IEEE (2009)
20. Li, P., Liu, H., Si, Y., Li, C., Li, F., Zhu, X., Huang, X., Zeng, Y., Yao, D., Zhang, Y.,
Xu, P.: Eeg based emotion recognition by combining functional connectivity network
and local activations. IEEE Transactions on Biomedical Engineering 66(10), 2869–2881
(2019)
21. Li, X., Song, D., Zhang, P., Zhang, Y., Hou, Y., Hu, B.: Exploring eeg features in
cross-subject emotion recognition. Frontiers in neuroscience 12, 162 (2018)
22. Li, Y., Zheng, W., Zong, Y., Cui, Z., Zhang, T., Zhou, X.: A bi-hemisphere domain
adversarial neural network model for eeg emotion recognition. IEEE Transactions on
Affective Computing (2018)
23. Liu, J., Wu, G., Luo, Y., Qiu, S., Yang, S., Li, W., Bi, Y.: Eeg-based emotion clas-
sification using a deep neural network and sparse autoencoder. Frontiers in Systems
Neuroscience 14, 43 (2020)
24. Liu, N.H., Chiang, C.Y., Hsu, H.M.: Improving driver alertness through music selection
using a mobile eeg to detect brainwaves. Sensors 13(7), 8199–8221 (2013)
25. Liu, W., Wen, Y., Yu, Z., Yang, M.: Large-margin softmax loss for convolutional neural
networks (2017)
26. Liu, W., Zheng, W.L., Lu, B.L.: Multimodal emotion recognition using multimodal deep
learning (2016)
27. Sun, M., Song, Z., Jiang, X., Pan, J., Pang, Y.: Learning pooling for convolutional
neural network. Neurocomputing 224, 96–104 (2017)
28. Mariooryad, S., Busso, C.: Compensating for speaker or lexical variabilities in speech
for emotion recognition. Speech Communication 57, 1–12 (2014)
29. Mathersul, D., Williams, L.M., Hopkinson, P.J., Kemp, A.H.: Investigating models of
affect: relationships among eeg alpha asymmetry, depression, and anxiety. Emotion
8(4), 560 (2008)
30. Park, S., Kwak, N.: Analysis on the dropout effect in convolutional neural networks.
pp. 189–204 (2017)
31. Piana, S., Staglianò, A., Odone, F., Camurri, A.: Adaptive body gesture representation
for automatic emotion recognition. ACM Transactions on Interactive Intelligent Systems
(TiiS) 6(1), 1–31 (2016)
32. Qiu, J.L., Liu, W., Lu, B.L.: Multi-view emotion recognition using deep canonical corre-
lation analysis. In: Neural Information Processing, pp. 221–231. Springer International
Publishing (2018)
33. Rodriguez, J.D., Perez, A., Lozano, J.A.: Sensitivity analysis of k-fold cross validation
in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine
Intelligence 32(3), 569–575 (2010)
34. Sammler, D., Grigutsch, M., Fritz, T., Koelsch, S.: Music and emotion: electrophysio-
logical correlates of the processing of pleasant and unpleasant music. Psychophysiology
44(2), 293–304 (2007)
35. Sarkar, P., Etemad, A.: Self-supervised ecg representation learning for emotion recog-
nition. IEEE Transactions on Affective Computing (2020)
36. Sauvet, F., Bougard, C., Coroenne, M., Lely, L., Van Beers, P., Elbaz, M., Guillard,
M., Leger, D., Chennaoui, M.: In-flight automatic detection of vigilance states using a
single eeg channel. IEEE Transactions on Biomedical Engineering 61(12), 2840–2847
(2014)
37. Sharma, R., Sahu, S.S., Upadhyay, A., Sharma, R.R., Sahoo, A.K.: Sleep stage clas-
sification using DWT and dispersion entropy applied on EEG signals. In: Computer-
aided Design and Diagnosis Methods for Biomedical Applications, pp. 35–56. CRC Press
(2021)
38. Sharma, R.R., Pachori, R.B.: Time-frequency representation using ievdhm-ht with ap-
plication to classification of epileptic eeg signals. IET Science, Measurement & Tech-
nology 12(1), 72–82 (2017)
39. Sharma, S., Sharma, R.R.: Variational mode decomposition based finger flexion move-
ment detection using ECoG signals. In: Artificial Intelligence-based Brain-Computer
Interface, pp. 101–119. Elsevier (2022)
40. Shu, L., Xie, J., Yang, M., Li, Z., Li, Z., Liao, D., Xu, X., Yang, X.: A review of emotion
recognition using physiological signals. Sensors 18(7), 2074 (2018)
41. Song, T., Zheng, W., Song, P., Cui, Z.: Eeg emotion recognition using dynamical graph
convolutional neural networks. IEEE Transactions on Affective Computing 11(3), 532–
541 (2020)
42. Subasi, A., Gursoy, M.I.: Eeg signal classification using pca, ica, lda and support vector
machines. Expert systems with applications 37(12), 8659–8666 (2010)
43. Tabar, Y.R., Halici, U.: A novel deep learning approach for classification of EEG motor
imagery signals. Journal of Neural Engineering 14(1), 016003 (2016)
44. Tahir, M.A., Kittler, J., Bouridane, A.: Multilabel classification using heterogeneous
ensemble of multi-label classifiers. Pattern Recognition Letters 33(5), 513–523 (2012)
45. Tang, H., Liu, W., Zheng, W.L., Lu, B.L.: Multimodal emotion recognition using deep
neural networks. In: Neural Information Processing, pp. 811–819. Springer International
Publishing (2017)
46. Teuwen, J., Moriakov, N.: Chapter 20 - convolutional neural networks. In: Handbook
of Medical Image Computing and Computer Assisted Intervention, The Elsevier and
MICCAI Society Book Series, pp. 481–501. Academic Press (2020)
47. Tripathi, S., Acharya, S., Sharma, R.D., Mittal, S., Bhattacharya, S.: Using deep and
convolutional neural networks for accurate emotion classification on deap dataset. In:
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pp. 4746–4752.
AAAI Press (2017)
48. Wang, Y.X., Ramanan, D., Hebert, M.: Meta-learning to detect rare objects. In: Pro-
ceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
49. Wang, Z., Tong, Y., Heng, X.: Phase-locking value based graph convolutional neural
networks for emotion recognition. IEEE Access 7, 93711–93722 (2019)
50. Webb, G.I., Zheng, Z.: Multistrategy ensemble learning: reducing error by combining
ensemble learning techniques. IEEE Transactions on Knowledge and Data Engineering
16(8), 980–991 (2004)
51. Young, A.W., Rowland, D., Calder, A.J., Etcoff, N.L., Seth, A., Perrett, D.I.: Facial
expression megamix: Tests of dimensional and category accounts of emotion recognition.
Cognition 63(3), 271–313 (1997)
52. Zhang, D., Yao, L., Zhang, X., Wang, S., Chen, W., Boots, R., Benatallah, B.: Cascade
and parallel convolutional recurrent neural networks on eeg-based intention recognition
for brain computer interface. In: Proceedings of the Thirty-Second AAAI Conference
on Artificial Intelligence, pp. 1703–1710. AAAI Press (2018)
53. Zheng, W., Zhu, J., Peng, Y., Lu, B.: EEG-based emotion classification using deep belief
networks. In: 2014 IEEE International Conference on Multimedia and Expo (ICME),
pp. 1–6 (2014)
54. Zheng, W.L., Lu, B.L.: Investigating critical frequency bands and channels for EEG-
based emotion recognition with deep neural networks. IEEE Transactions on Au-
tonomous Mental Development 7(3), 162–175 (2015)
55. Zheng, W.L., Zhu, J.Y., Lu, B.L.: Identifying stable patterns over time for emotion
recognition from eeg. IEEE Transactions on Affective Computing 10(3), 417–429 (2017)
56. Zhong, P., Wang, D., Miao, C.: Eeg-based emotion recognition using regularized graph
neural networks. IEEE Transactions on Affective Computing (2020)
... It has also been found that analyzing functional connectivity is essential for the advancement of emotion recognition. Based on this, graph neural networks (GNNs) [17] and convolutional neural networks (CNNs) [18] have been proposed to extract spatial embedding of DE features among different EEG channels. Furthermore, long short-term memory (LSTM) [19] and attention mechanisms have been utilized to learn emotion-related EEG representations [20]. ...
... where 2 is the variance of the signal. Differential entropy features of each segment were extracted separately in the (0.1-4 Hz), (4-8 Hz), (8)(9)(10)(11)(12)(13), (13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31), and (31-50 Hz) frequency bands. In one experiment of a subject, the DE features trained from continuous samples across time are concatenated and smoothed with a linear dynamic system (LDS) model [12]. ...
Preprint
Recent advances in non-invasive EEG technology have broadened its application in emotion recognition, yielding a multitude of related datasets. Yet, deep learning models struggle to generalize across these datasets due to variations in acquisition equipment and emotional stimulus materials. To address the pressing need for a universal model that fluidly accommodates diverse EEG dataset formats and bridges the gap between laboratory and real-world data, we introduce a novel deep learning framework: the Contrastive Learning based Diagonal Transformer Autoencoder (CLDTA), tailored for EEG-based emotion recognition. The CLDTA employs a diagonal masking strategy within its encoder to extracts full-channel EEG data's brain network knowledge, facilitating transferability to the datasets with fewer channels. And an information separation mechanism improves model interpretability by enabling straightforward visualization of brain networks. The CLDTA framework employs contrastive learning to distill subject-independent emotional representations and uses a calibration prediction process to enable rapid adaptation of the model to new subjects with minimal samples, achieving accurate emotion recognition. Our analysis across the SEED, SEED-IV, SEED-V, and DEAP datasets highlights CLDTA's consistent performance and proficiency in detecting both task-specific and general features of EEG signals related to emotions, underscoring its potential to revolutionize emotion recognition research.
... Although recent advancements have introduced deep learn-ing methods for emotion recognition using EEG, these studies have often overlooked the complex nonlinear dynamics inherent in EEG data. Most existing approaches rely on linear input representation techniques, which may not fully capture the intricacies of the nonlinear characteristics of EEG signals [17]- [21]. Thus, exploration of effective input formulation strategies for EEG-based emotion recognition remains a pressing need, as underscored by Prabowo et al. [4]. ...
... Initial efforts focused on the inherent time-series nature of EEG data. Iyer et al. [17] explored this approach by segmenting EEG signals into frequency bands, computing differential entropy, and employing a combination of CNN and Long Short-Term Memory (LSTM) networks for emotion detection. This method highlighted the significance of frequency-specific brain activities in emotion detection. ...
Article
Full-text available
Time-series classification (TSC) has been widely utilized across various domains, including brain-computer interfaces (BCI) for emotion recognition through electroencephalogram (EEG) signals. However, traditional methods often struggle to capture the complex emotional patterns present in EEG data. Recent advancements in encoding techniques have provided promising avenues for improving emotion recognition. This study introduces asymmetric windowing recurrence plots (AWRP) as a novel encoding technique to efficiently encapsulate the dynamic characteristics of EEG signals into texture-rich image representations. This study systematically compares the impact of conventional thresholded and unthresholded recurrence plots (RP) versus the proposed AWRP in emotion recognition tasks. Empirical validations conducted across benchmark datasets, such as DEAP and SEED, demonstrate that the AWRP method achieves classification accuracies of 99.84% and 99.69%, respectively, outperforming existing state- of-the-art methodologies. This study emphasizes the significance of input formulation, highlighting that richer input textures, as provided by AWRP, significantly enhance emotion recognition performance while ensuring computational memory usage efficiency. These findings have significant implications in the domain of EEG-based emotion recognition and offer a novel perspective that can guide future research.
... The proposed study concluded that EEG should be considered an essential category in emotion identification. Abhishek et al. [70] used the differential entropy feature of five different bands from the EEG signals and fed into the ensemble of CNN and LSTM. The authors concluded that EEG is a potent and popular tool for recognizing distinct emotions, and their changes are readily and easily observable. ...
... Iyer et al. [70] developed an Ensemble-hybrid model for emotion recognition using CNN, LSTM, a hybrid of CNN-LSTM, and then a Stacking ensemble. Out of all, stacking achieves the best classification results. ...
Article
Full-text available
Automated Emotion Recognition Systems (ERS) with physiological signals help improve health and decision-making in everyday life. It uses traditional Machine Learning (ML) methods, requiring high-quality learning models for physiological data (sensitive information). However, automated ERS enables data attacks and leaks, significantly losing user privacy and integrity. This privacy problem can be solved using a novel Federated Learning (FL) approach, which enables distributed machine learning model training. This review examines 192 papers focusing on emotion recognition via physiological signals and FL. It is the first review article concerning the privacy of sensitive physiological data for an ERS. The paper reviews the different emotions, benchmark datasets, machine learning, and federated learning approaches for classifying emotions. It proposes a novel multi-modal Federated Learning for Physiological signals based on Emotion Recognition Systems (Fed-PhyERS) architecture, experimenting with the AMIGOS dataset and its applications for a next-generation automated ERS. Based on critical analysis, this paper provides the key takeaways, identifies the limitations, and proposes future research directions to address gaps in previous studies. Moreover, it reviews ethical considerations related to implementing the proposed architecture. This review paper aims to provide readers with a comprehensive insight into the current trends, architectures, and techniques utilized within the field.
... In [1], EEG source localization is combined with a graph neural network model in order to investigate subject (in)dependent classification performance. A hybrid CNN and long short-term memory network (LSTM) classifier is performed on SEED data, the accuracy of 98% and 97.16% are obtained [3,14]. Deep neural network (DNN), CNN, LSTM, and CNN-LSTM architectures are also compared in [45]. ...
Article
Full-text available
Multi-channel Electroencephalogram (EEG) based emotion recognition is focused on several analysis of frequency bands of the acquired signals. In this paper, spectral properties appeared on five EEG bands (\(\delta \), \(\theta \), \(\alpha \), \(\beta \), \(\gamma \)) and gated transformer network (GTN) based emotion recognition using EEG signal are proposed. Spectral energies and differential entropies of 62-channel signals are converted to 3D (sequence-channel-trial) form to feed the GTN. The GTN with enhanced gated two tower based transformer architecture is fed by 3D sequences extracted from SEED and SEED-IV emotional datasets. 15 participants’ states in session 1–3 are evaluated using the proposed GTN based sequence classification, and the results are repeated by \(3\small \times \) shuffling. Totally, 135 times training and testing are performed on each dataset, and the results are presented. The proposed GTN model achieves mean accuracy rates of 98.82% on the SEED dataset and 96.77% on the SEED-IV dataset for three and four emotional state recognition tasks, respectively. The proposed emotion recognition model can be employed as a promising approach for EEG emotion recognition.
... However, in tasks of emotion recognition using EEG, although numerous existing approaches have shown empirical success in identifying emotions from EEG signals [15], [16], there is still room for improvement. There are shortcomings in the research on the fusion of temporal, spatial, and frequency domain. ...
Article
Full-text available
In emotion recognition tasks, electroencephalography (EEG) has gained significant favor among researchers as a powerful biological signal tool. However, existing studies often fail to fully utilize the high temporal resolution provided by EEG when combining spatiotemporal and frequency features for emotion recognition, and do not meet the needs of effective feature fusion. Therefore, this paper proposes a multilevel multidomain feature fusion network model called MMF-Net, aiming to obtain a more comprehensive representation of spatiotemporal-frequency features and achieve higher accuracy in emotion classification. The model takes the original EEG two-dimensional feature map as input, simultaneously extracting spatiotemporal and spatial-frequency domain features at different levels to effectively utilize temporal resolution. Next, at each level, a specially designed fusion network layer is employed to combine the captured temporal, spatial, and frequency domain features. In addition, the fusion network layer contributes positively to the convergence of the model and the enhancement of feature detectors. In subject-dependent experiments, MMF-Net achieved average accuracy rates of 99.50% and 99.59% for valence and arousal dimensions on the DEAP dataset, respectively. In subject-independent experiments, the average accuracy rates for these two dimensions reached 97.46% and 97.54%, respectively.
... This part can be successfully evaluated by measuring the amount of playing and the length of click learning. However, for a student participating in learning, their attention is not sufficient and cannot be reflected in more detail [9,10]. In our study, we analyzed data on students' comments on the current curriculum based on their individual comments. ...
Article
Full-text available
Online education review data have strong statistical and predictive power but lack efficient and accurate analysis methods. In this paper, we propose a multi-modal emotion analysis method to analyze the online education of college students based on educational data. Specifically, we design a multi-modal emotion analysis method that combines text and emoji data, using pre-training emotional prompt learning to enhance the sentiment polarity. We also analyze whether this fusion model reflects the true emotional polarity. The conducted experiments show that our multi-modal emotion analysis method achieves good performance on several datasets, and multi-modal emotional prompt methods can more accurately reflect emotional expressions in online education data.
... Another study proposed a two-layer LSTM and four-layer improved NN deep learning algorithms to improve the performance in EEG classification [17]. These advancements in AI provide robust and adaptable methods for EEG data analysis, overcoming the challenges posed by traditional methods [16,18]. ...
Preprint
Full-text available
Recent advancements in cognitive neuroscience, particularly in Electroencephalogram (EEG) signal processing, image generation, and brain-computer interfaces (BCI), have opened up new avenues for research. This study introduces a novel framework, Bridging Artificial Intelligence and Neurological Signals (BRAINS), which leverages the power of AI to extract meaningful information from EEG signals and generate images. The BRAINS framework addresses the limitations of traditional EEG analysis techniques, which struggle with nonstationary signals, spectral estimation, and noise sensitivity. Instead, BRAINS employs Long Short-Term Memory (LSTM) networks and contrastive learning, which effectively handle time-series EEG data and recognize intrinsic connections and patterns. The study utilizes the MNIST dataset of handwritten digits as stimuli in EEG experiments, allowing for diverse yet controlled stimuli. The data collected is then processed through an LSTM-based network employing contrastive learning and extracting complex features from EEG data. These features are fed into an image generator model, producing images as close to the original stimuli as possible. This study demonstrates the potential of integrating AI and EEG technology, offering promising implications for the future of brain-computer interfaces.
Article
Electroencephalogram (EEG) signals used for emotion classification are vital in the Human–Computer Interface (HCI), which has gained a lot of focus. However, the irregular and non-stationary characteristics of the EEG signals manifest barriers and limit state-of-the-art techniques from accurately assessing different emotions from the EEG data, leading to minimal emotion recognition performance. Moreover, cross-subject emotion recognition (CSER) has always been challenging due to the weak generality of features from EEG signals among subjects. Thus, this study employed a novel algorithm, Complete Ensemble Empirical Mode Decomposition with adaptive noise (CEEMDAN), which decomposes EEG into intrinsic mode functions (IMFs) to comprehend the associated EEG’s stochastic characteristics. Further IMFs are characterized by Normal Inverse Gaussian (NIG) probability density function (PDF) parameters. These NIG features are fed into an optimized Extreme Gradient Boosting (XGboost) classifier developed using a cross-validation technique. The uniqueness of this research is in the use of NIG modeling of CEEMDAN domain IMFs to extract specific emotions from EEG signals. Qualitative, visual, and statistical assessments are used to illustrate the importance of the NIG parameters. Extensive experiments are carried out with the online available data sources SJTU Emotion EEG Dataset (SEED), SEED-IV, and Database for Emotion Analysis of Physiological Signals (DEAP) to evaluate the potency of the proposed approach. The suggested system for recognizing emotions performed better than cutting-edge techniques, attaining the highest accuracy of 98.9%, 97.8%, and 96.7% with the tenfold cross-validation (CV) protocol and 96.84%, 95.38%, and 91.39% for cross-subject validation (CSV) approach using SEED, SEED-IV, and DEAP databases, respectively.
Article
Emotion recognition utilizing EEG signals has emerged as a pivotal component of human–computer interaction. In recent years, with the relentless advancement of deep learning techniques, using deep learning for analyzing EEG signals has assumed a prominent role in emotion recognition. Applying deep learning in the context of EEG-based emotion recognition carries profound practical implications. Although many model approaches and some review articles have scrutinized this domain, they have yet to undergo a comprehensive and precise classification and summarization process. The existing classifications are somewhat coarse, with insufficient attention given to the potential applications within this domain. Therefore, this article systematically classifies recent developments in EEG-based emotion recognition, providing researchers with a lucid understanding of this field’s various trajectories and methodologies. Additionally, it elucidates why distinct directions necessitate distinct modeling approaches. In conclusion, this article synthesizes and dissects the practical significance of EEG signals in emotion recognition, emphasizing its promising avenues for future application.
Article
Full-text available
Emotion classification based on brain–computer interface (BCI) systems is an appealing research topic. Recently, deep learning has been employed for the emotion classifications of BCI systems and compared to traditional classification methods improved results have been obtained. In this paper, a novel deep neural network is proposed for emotion classification using EEG systems, which combines the Convolutional Neural Network (CNN), Sparse Autoencoder (SAE), and Deep Neural Network (DNN) together. In the proposed network, the features extracted by the CNN are first sent to SAE for encoding and decoding. Then the data with reduced redundancy are used as the input features of a DNN for classification task. The public datasets of DEAP and SEED are used for testing. Experimental results show that the proposed network is more effective than conventional CNN methods on the emotion recognitions. For the DEAP dataset, the highest recognition accuracies of 89.49% and 92.86% are achieved for valence and arousal, respectively. For the SEED dataset, however, the best recognition accuracy reaches 96.77%. By combining the CNN, SAE, and DNN and training them separately, the proposed network is shown as an efficient method with a faster convergence than the conventional CNN.
Article
Full-text available
Electroencephalography (EEG) measures the neuronal activities in different brain regions via electrodes. Many existing studies on EEG-based emotion recognition do not fully exploit the topology of EEG channels. In this paper, we propose a regularized graph neural network (RGNN) for EEG-based emotion recognition. RGNN considers the biological topology among different brain regions to capture both local and global relations among different EEG channels. Specifically, we model the inter-channel relations in EEG signals via an adjacency matrix in a graph neural network where the connection and sparseness of the adjacency matrix are inspired by neuroscience theories of human brain organization. In addition, we propose two regularizers, namely node-wise domain adversarial training (NodeDAT) and emotion-aware distribution learning (EmotionDL), to better handle cross-subject EEG variations and noisy labels, respectively. Extensive experiments on two public datasets, SEED and SEED-IV, demonstrate the superior performance of our model than state-of-the-art models in most experimental settings. Moreover, ablation studies show that the proposed adjacency matrix and two regularizers contribute consistent and significant gain to the performance of our RGNN model. Finally, investigations on the neuronal activities reveal important brain regions and inter-channel relations for EEG-based emotion recognition.
Article
Emotion recognition is an important field of research in brain-computer interactions. As technology and the understanding of emotions advance, there are growing opportunities for automatic emotion recognition systems. Neural networks are a family of statistical learning models inspired by biological neural networks and are used to estimate functions that can depend on a large number of generally unknown inputs. In this paper, we leverage this effectiveness of neural networks to classify user emotions using EEG signals from the DEAP dataset (Koelstra et al., 2012), which represents the benchmark for emotion classification research. We explore two different neural models, a simple deep neural network and a convolutional neural network, for classification. Our model provides state-of-the-art classification accuracy, obtaining 4.51 and 4.96 percentage-point improvements over Rozgic et al. (2013) for classification of valence and arousal into two classes (high and low), and 13.39 and 6.58 percentage-point improvements over Chung and Yoon (2012) for classification of valence and arousal into three classes (high, normal, and low). Moreover, our research is a testament that neural networks can be robust classifiers for brain signals, even outperforming traditional learning techniques.
Article
Brain-Computer Interface (BCI) is a system empowering humans to communicate with or control the outside world using brain intentions alone. Electroencephalography (EEG) based BCIs are promising solutions due to their convenient and portable instruments. Despite the extensive research on EEG in recent years, it is still challenging to interpret EEG signals effectively due to the massive noise in EEG signals (e.g., low signal-to-noise ratio and incomplete EEG signals) and the difficulty of capturing the inconspicuous relationships between EEG signals and certain brain activities. Most existing works either treat EEG as chain-like sequences, neglecting complex dependencies between adjacent signals, or require pre-processing such as transforming EEG waves into images. In this paper, we introduce both cascade and parallel convolutional recurrent neural network models for precisely identifying human intended movements and instructions, effectively learning the compositional spatio-temporal representations of raw EEG streams. Extensive experiments on a large-scale movement-intention EEG dataset (108 subjects, 3,145,160 EEG records) have demonstrated that both models achieve high accuracy near 98.3% and outperform a set of baseline methods and most recent deep learning based EEG recognition models, yielding a significant accuracy increase of 18% in the cross-subject validation scenario. The developed models are further evaluated with a real-world BCI and achieve a recognition accuracy of 93% over five instruction intentions. This suggests the proposed models are able to generalize over different kinds of intentions and BCI systems.
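The cascade variant of such a convolutional recurrent model can be sketched as a frame-wise CNN encoder followed by an LSTM over the frame sequence. All shapes and layer sizes below are assumptions for illustration, not the paper's configuration.

```python
# Minimal sketch of a cascade convolutional-recurrent model: a CNN encodes
# each short EEG frame, an LSTM models the sequence of frame encodings.
import torch
import torch.nn as nn

class CascadeConvRNN(nn.Module):
    def __init__(self, n_channels=64, n_classes=5):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))           # one 32-d vector per frame
        self.rnn = nn.LSTM(32, 64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                      # x: (batch, frames, channels, time)
        b, f, c, t = x.shape
        z = self.cnn(x.reshape(b * f, c, t)).squeeze(-1)   # (b*f, 32)
        out, _ = self.rnn(z.reshape(b, f, 32))
        return self.head(out[:, -1])           # classify from last hidden state

model = CascadeConvRNN()
logits = model(torch.randn(4, 10, 64, 100))    # 4 trials, 10 frames each
```

The parallel variant described in the paper would instead run the convolutional and recurrent branches side by side and merge their representations before classification.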
Chapter
Predicting finger flexion movements is a challenging problem in brain-computer interfacing. This chapter focuses on decoding finger flexion movements using electrocorticogram (ECoG) signals. The variational mode decomposition (VMD) is applied to obtain the sub-components of each channel's ECoG recording. Various correlation-based and other parameters, such as correntropy, cross information potential, and Kozachenko-Leonenko entropy estimation, are evaluated to categorize the flexion movement. Computing these parameters over multiple channels is a complicated process; this complication is reduced by applying correlation-based thresholding across all channels to select the significant channels. All the computed features are given to a cubic support vector machine (C-SVM) classifier. The complete model is investigated on the BCI Competition IV (2008) dataset, which holds brain signals of three subjects performing different finger flexion movements. On this dataset, we achieved a correlation of 0.43 and 50.1% accuracy for five-finger flexion classification, and for index-finger versus little-finger flexion classification, 80.1% accuracy with 83% sensitivity.
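The decode-then-classify stage of such a pipeline can be sketched with scikit-learn. The VMD decomposition is assumed to happen upstream (e.g., via a package such as vmdpy), and the per-channel features below are simple stand-ins for the paper's correntropy/entropy parameters; the data are random placeholders.

```python
# Hedged sketch: simple per-channel features fed to a cubic-kernel SVM (C-SVM).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 120, 8, 1000
ecog = rng.standard_normal((n_trials, n_channels, n_samples))  # placeholder ECoG
labels = rng.integers(0, 5, size=n_trials)                     # five finger classes

def channel_features(trial):
    """Per-channel variance and lag-1 autocorrelation as toy stand-ins
    for the chapter's correntropy / entropy parameters."""
    feats = []
    for ch in trial:
        feats.append(ch.var())
        feats.append(np.corrcoef(ch[:-1], ch[1:])[0, 1])
    return np.array(feats)

X = np.stack([channel_features(t) for t in ecog])
clf = SVC(kernel="poly", degree=3)             # cubic SVM, as in "C-SVM"
print(cross_val_score(clf, X, labels, cv=5).mean())
```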
Chapter
In this chapter, the electroencephalogram (EEG) signals obtained from polysomnography (PSG) recordings are analyzed using the discrete wavelet transform (DWT) and dispersion entropy. The PSG recordings are taken from the PhysioNet Sleep European Data Format (EDF) Database. We investigate the performance of dispersion entropy (DEn) and one of its variants, fluctuation-based dispersion entropy (FDEn), computed from the wavelet subbands of the sleep EEG recordings. A random forest is employed for classification, using the computed entropies as features. The performance of the algorithm is further compared using DEn and FDEn separately as well as in combination. Measures such as sensitivity, specificity, and accuracy are used to compare the performance on the multi-class sleep stage classification problem. The results show the suitability of DEn and FDEn for automatic scoring of sleep stages. FDEn is found to be more suitable for classification than DEn; however, using both DEn and FDEn together as features results in even better sleep stage classification accuracy.
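A compact version of this DWT-plus-entropy pipeline is sketched below: each epoch is decomposed into wavelet subbands with PyWavelets and a dispersion entropy is computed per subband as a feature for a random forest. The DEn implementation here is an illustrative, simplified version (normal-CDF mapping, pattern counting), not the chapter's exact code, and the data are random placeholders.

```python
# Hedged sketch: DWT subbands -> dispersion entropy features -> random forest.
import numpy as np
import pywt
from scipy.stats import norm
from sklearn.ensemble import RandomForestClassifier

def dispersion_entropy(x, m=2, c=6):
    """Map samples to c classes via the normal CDF, count length-m embedding
    patterns, and return the normalized Shannon entropy of the patterns."""
    z = norm.cdf((x - x.mean()) / (x.std() + 1e-12))
    classes = np.clip(np.round(c * z + 0.5).astype(int), 1, c)
    patterns = [tuple(classes[i:i + m]) for i in range(len(classes) - m + 1)]
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log(p)).sum() / np.log(c ** m)

def epoch_features(epoch, wavelet="db4", level=4):
    coeffs = pywt.wavedec(epoch, wavelet, level=level)  # subband coefficients
    return np.array([dispersion_entropy(c) for c in coeffs])

rng = np.random.default_rng(1)
epochs = rng.standard_normal((200, 3000))               # toy 30-s EEG epochs
stages = rng.integers(0, 5, size=200)                   # toy sleep-stage labels
X = np.stack([epoch_features(e) for e in epochs])
clf = RandomForestClassifier(n_estimators=200).fit(X, stages)
```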
Article
We exploit a self-supervised deep multi-task learning framework for electrocardiogram (ECG)-based emotion recognition. The proposed solution consists of two learning stages: (a) learning ECG representations and (b) learning to classify emotions. ECG representations are learned by a signal transformation recognition network, which learns high-level abstract representations from unlabeled ECG data. Six different signal transformations are applied to the ECG signals, and transformation recognition is performed as a pretext task. Training the model on pretext tasks helps the network learn spatiotemporal representations that generalize well across different datasets and different emotion categories. We transfer the weights of the self-supervised network to an emotion recognition network, where the convolutional layers are kept frozen and the dense layers are trained with labeled ECG data. We show that the proposed solution considerably improves the performance compared to a network trained using fully supervised learning. New state-of-the-art results are set in the classification of arousal, valence, affective states, and stress for the four utilized datasets. Extensive experiments are performed, providing interesting insights into the impact of using a multi-task self-supervised structure instead of a single-task model, as well as the optimum level of difficulty required for the pretext self-supervised tasks.
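The pretext-task setup can be illustrated as follows: apply one of several signal transformations to unlabeled ECG segments and generate labels identifying which transformation was applied. The particular transformation set below is an assumption mirroring the general idea, not necessarily the paper's six transformations.

```python
# Illustrative pretext-task data generation for self-supervised ECG learning.
import numpy as np

def transform(sig, kind, rng):
    if kind == 0: return sig                                    # original
    if kind == 1: return sig + 0.05 * rng.standard_normal(sig.shape)  # noise
    if kind == 2: return 1.5 * sig                              # scaling
    if kind == 3: return -sig                                   # negation
    if kind == 4: return sig[::-1].copy()                       # time reversal
    pieces = np.array_split(sig, 4)                             # permutation
    rng.shuffle(pieces)
    return np.concatenate(pieces)

rng = np.random.default_rng(2)
ecg = rng.standard_normal((256, 1000))            # unlabeled ECG segments
kinds = rng.integers(0, 6, size=len(ecg))         # pretext labels (free!)
pretext_x = np.stack([transform(s, k, rng) for s, k in zip(ecg, kinds)])
# A CNN is then trained to predict `kinds`; its convolutional layers are
# frozen and reused, with new dense layers trained on labeled emotion data.
```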
Article
Emotion recognition based on electroencephalography (EEG) is of great importance in the field of Human-Computer Interaction (HCI) and has received extensive attention in recent years. Most traditional methods focus on extracting features in the time and frequency domains; the spatial information from adjacent channels and symmetric channels is often ignored. To better learn spatial representations, in this paper, we propose an end-to-end Regional-Asymmetric Convolutional Neural Network (RACNN) for emotion recognition, which consists of temporal, regional, and asymmetric feature extractors. Specifically, continuous 1D convolution layers are employed in the temporal feature extractor to learn time-frequency representations. The regional feature extractor then consists of two 2D convolution layers to capture regional information among physically adjacent channels. Meanwhile, we propose an Asymmetric Differential Layer (ADL) in the asymmetric feature extractor by taking the asymmetry property of emotion responses into account, which can capture the discriminative information between the left and right hemispheres of the brain. To evaluate our model, we conduct extensive experiments on two publicly available datasets, i.e., DEAP and DREAMER. The proposed model obtains recognition accuracies over 95% for valence and arousal classification tasks on both datasets, significantly outperforming state-of-the-art methods.
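The asymmetric differential operation at the heart of the ADL reduces to subtracting the features of right-hemisphere channels from their mirrored left-hemisphere counterparts. The sketch below shows that single step; the channel pairing is a made-up example, not the RACNN electrode mapping.

```python
# Hedged sketch of an asymmetric differential operation over hemisphere pairs.
import torch

pairs = [(0, 1), (2, 3), (4, 5)]             # assumed (left, right) mirrored channels
feat = torch.randn(8, 6, 32)                 # (batch, channels, feature dim)
left  = feat[:, [l for l, _ in pairs], :]
right = feat[:, [r for _, r in pairs], :]
asym = left - right                          # discriminative L/R differences
print(asym.shape)                            # torch.Size([8, 3, 32])
```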