CNN and LSTM based Ensemble Learning for
Human Emotion Recognition using EEG Recordings
Abhishek Iyer · Srimit Sritik Das · Reva Teotia · Shishir Maheshwari · Rishi Raj Sharma
Abstract Emotion is a significant parameter in daily life and an important factor in human interactions. Human-machine interaction, and its advanced forms such as humanoid robots, essentially requires emotional investigation. This paper proposes a novel method for human emotion recognition using electroencephalogram (EEG) signals. We consider three emotions, namely neutral, positive, and negative. The EEG signals are separated into five frequency bands according to the EEG rhythms, and differential entropy is computed over each frequency band component. A hybrid model based on a convolutional neural network (CNN) and long short-term memory (LSTM) is developed for accurate emotion detection. The extracted features are fed to all three models (CNN, LSTM, and hybrid) for emotion recognition, and an ensemble model combines their predictions. The proposed approach is validated on two datasets, SEED and DEAP, for EEG-based emotion analysis. The developed method achieves 97.16% accuracy on the SEED dataset for emotion classification. The experimental results indicate that the proposed approach is effective and yields better performance than the compared methods for EEG-based emotion analysis.
Keywords Emotion recognition · EEG · Hybrid model · Differential entropy · LSTM
Abhishek Iyer · Srimit Sritik Das · Reva Teotia · Shishir Maheshwari
Department of Electrical and Electronics Engineering, Birla Institute of Technology and
Science, Pilani-333031, India
E-mail: f20181105@pilani.bits-pilani.ac.in, f20180527@pilani.bits-pilani.ac.in,
f20190268@pilani.bits-pilani.ac.in, shishir.maheshwari@pilani.bits-pilani.ac.in
Rishi Raj Sharma
Department of Electronics Engineering, Defence Institute of Advanced Technology, Pune-
411025, India
E-mail: dr.rrsrrs@gmail.com
1 Introduction
Emotion is a pivotal factor in human life, as it affects the working ability, mental state, and judgment of human beings. Numerous experts have worked on this topic across disciplines such as psychology, cognitive science, neuroscience, computer technology, and brain-computer interfacing (BCI) [39]. Electroencephalogram (EEG) based emotion recognition has created substantial scope in these disciplines because it reflects the actual affective state [21]. EEG-based emotion recognition has numerous BCI applications, including humanoid robots; most humanoid robots still lack emotional capability, and this field remains largely unexplored. In affective BCI, emotion recognition is a major component, and it is complicated by the fuzzy nature of emotion: human emotion correlates with context, language, time, space, culture, and other factors. Therefore, absolutely true emotion labels for EEG recordings are not attainable for the different emotions, which makes the task difficult [3].
Many authors have proposed emotion recognition methods based on facial expression [51], gesture [31], [9], posture, speech [28], and other physical signals. These types of data are easy to record, but they can also be easily controlled and may therefore falsify the true emotion [15]. In contrast, controlling or mimicking nervous-system-related signals is very difficult because they are activated involuntarily [40]; only subject experts can control such signals. The true emotion signature can therefore be observed in nervous-system recordings. Several physiological recordings, such as the EEG, electrocardiogram (ECG) [35], temperature, electromyogram (EMG) [5], respiration, and galvanic skin response (GSR), can be used to study human emotion [40]. A close investigation of brain activity under various emotions can support accurate and computationally efficient emotion recognition models. Recent progress in dry electrodes [11], [13] and wearable devices makes real-time EEG-based emotion identification feasible for mental state monitoring [24], [36]. EEG-based emotion recognition is one of the key capabilities required in human-machine interaction (HMI) and humanoid robots. This study focuses on EEG-based human emotion analysis, in which the electrical activity of the brain is investigated during different emotions (neutral, positive, and negative).
Many studies have been performed on EEG-based human emotion analysis, attempting to establish a definitive relationship between EEG signals and different emotions [29], [34]. EEG signal analysis is a challenging task because the signal is non-stationary [37]. In real-time scenarios, other signals are superimposed on the EEG recording and the signal-to-noise ratio (SNR) becomes low. Matrix decomposition-based EEG analysis methods have been proposed, but their high complexity makes real-time implementation difficult [7], [42], [38]. Emotion-related stable patterns in EEG recordings are reported in [55] using the DEAP dataset. The critical frequency bands of emotion and the selection of significant EEG channels are examined in detail in [54]; such channel selection helps to find suitable electrode positions for emotion analysis. Time-frequency and various non-linear features have been studied for EEG-based emotion recognition, achieving 59.06% accuracy (ACC) on DEAP data [21]. It has been suggested that the gamma band of EEG recordings is the most correlated with emotional function [19].
Many machine learning-based architectures have been proposed for EEG-based emotion analysis. A bi-hemisphere neural network designed for EEG emotion detection achieves 63.50% ACC on the SEED dataset [22]. Graph neural network-based emotion recognition on the same dataset achieves 89.23% ACC using the gamma band and 94.24% ACC with all bands [56]. A regional asymmetric convolutional neural network (CNN) study on DEAP data achieves 95% ACC for arousal and valence detection [6]. Most existing methods improve a single model to achieve good classification of human emotion. The proposed approach instead employs multiple models and develops a hybrid approach to attain better ACC than existing methods: two models, based on a CNN and long short-term memory (LSTM), are hybridized, and an ensemble model improves the final prediction.
The rest of the article is organized as follows. Section 2 presents the datasets. The proposed approach and its features are explained in Section 3. The proposed hybrid model, along with the CNN and LSTM based models and the implementation of ensemble learning, is presented in Section 4. Results are discussed in Section 5, and the article is concluded in Section 6.
2 Datasets
We have used two datasets for EEG-based emotion recognition; both are described in this section.
2.1 SEED data
The first database employed in the proposed approach is the SJTU emotion EEG dataset (SEED) [54], [8], provided by the Center for Brain-like Computing and Machine Intelligence (BCMI). The dataset contains EEG data of 15 subjects (7 males and 8 females) recorded in three separate sessions, each with 15 trials. In each trial, the EEG signal is recorded while the subject watches a Chinese film clip targeting one of three emotions, namely positive, neutral, or negative. Each film clip lasts about 4 minutes, and two clips targeting the same emotion are never shown consecutively. The participants reported their emotional reaction to each clip by completing a questionnaire immediately after watching it. The EEG signals were recorded using a 62-channel electrode cap placed according to the international 10-20 system. The data were then down-sampled to 200 Hz to speed up processing, and a 0-75 Hz band-pass filter, which retains all the EEG rhythm information, was applied.
2.2 DEAP data
The DEAP dataset was recorded for the analysis of human emotion using EEG signals. It covers 32 healthy participants aged between 19 and 37 years, 16 of whom were female. Each participant was exposed to 40 one-minute music videos, each conveying the same emotion throughout its length. The data comprise 40 channels, of which the 32 EEG channels are investigated in this paper. The data were recorded with Biosemi ActiveTwo devices at a sampling rate of 512 Hz and further down-sampled to 128 Hz to reduce system complexity. The DEAP release provides 32 files, each containing the 40-channel recording of one participant for all 40 one-minute videos.
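As an illustration, the preprocessed DEAP release can be read as Python pickles. The following is a minimal sketch assuming the official preprocessed_python files (named s01.dat through s32.dat), in which, to our understanding, each file stores a 'data' array of shape (40 videos, 40 channels, 8064 samples) and a 'labels' array of shape (40, 4); these layout details are assumptions about the release, not taken from this paper.

import pickle

with open("s01.dat", "rb") as f:                 # one participant file (name assumed)
    subject = pickle.load(f, encoding="latin1")  # files are Python 2 pickles

data = subject["data"]      # (40 videos, 40 channels, 8064 samples at 128 Hz)
labels = subject["labels"]  # (40 videos, 4 self-ratings)
eeg = data[:, :32, :]       # keep the 32 EEG channels investigated in this paper
print(eeg.shape)            # (40, 32, 8064)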
3 Proposed approach
The block diagram of the proposed approach is shown in Fig. 1. All subjects sat on a chair in a resting state and were asked to watch videos portraying different emotions. EEG signals were recorded simultaneously and then pre-processed. Differential entropy (DE) based features are computed in the five EEG rhythms, as explained in the next sub-section. CNN and LSTM models are then employed and combined to obtain a hybrid model, and finally an ensemble model is built on top of these three models.
3.1 Feature extraction
We have employed DE as the feature in the proposed approach. DE extends the idea of Shannon entropy and measures the complexity of a continuous random variable. DE was first introduced as a feature for EEG-based emotion recognition by Duan et al. [8] and has been found better suited to emotion recognition than traditional features: it has a balanced ability to discriminate between EEG patterns of low- and high-frequency energy, and DE features extracted from EEG data provide stable and accurate information for emotion classification [53]. The differential entropy is defined as
h(Y) = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}} e^{-\frac{(y-\mu)^{2}}{2\sigma^{2}}} \log\left( \frac{1}{\sqrt{2\pi\sigma^{2}}} e^{-\frac{(y-\mu)^{2}}{2\sigma^{2}}} \right) dy = \frac{1}{2}\log(2\pi e\sigma^{2})    (1)

where the time series Y obeys the Gaussian distribution N(µ, σ²).
DE was employed to construct features in five frequency bands: delta (1-3 Hz), theta (4-7 Hz), alpha (8-13 Hz), beta (14-30 Hz), and gamma (31-50 Hz). For the SEED dataset, the extracted DE feature vector of a sample EEG signal has 310 dimensions, as there are 62 channels for each frequency band [54]. Similarly, the 32 channels of the DEAP dataset in the five EEG sub-bands lead to a total of 160 DE features.
Fig. 1 Block diagram of the proposed system for EEG-based emotion recognition: the stimuli-evoked EEG data undergo preprocessing and feature extraction, the CNN, LSTM, and hybrid models are trained, and an ensemble classifies the emotion as positive, negative, or neutral.
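As a concrete illustration of Eq. (1), the sketch below band-pass filters each channel into the five rhythms and computes DE from the per-band variance as 0.5·log(2πeσ²). The Butterworth filter order and this scipy-based pipeline are our assumptions for illustration, not the original implementation.

import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha": (8, 13),
         "beta": (14, 30), "gamma": (31, 50)}

def differential_entropy(x):
    # DE of a Gaussian series, Eq. (1): 0.5 * log(2*pi*e*sigma^2)
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x))

def de_features(eeg, fs):
    # eeg: (channels, samples) -> DE vector of length channels * 5
    feats = []
    for low, high in BANDS.values():
        b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
        band = filtfilt(b, a, eeg, axis=-1)           # per-channel band signal
        feats.append([differential_entropy(ch) for ch in band])
    return np.concatenate(feats)  # 310 values for SEED (62 ch), 160 for DEAP (32 ch)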
The individual models, the hybrid model, and the ensemble model are explained in the next section.
4 Models employed for emotion recognition in the proposed system
In the proposed work, the CNN and LSTM based models are first employed for emotion recognition. Thereafter, a hybrid model combining the CNN and LSTM is proposed. Finally, an ensemble of these three models is considered. All the models are explained in this section.
4.1 CNN-based model
The idea behind CNNs resembles that of traditional artificial neural networks (ANNs): they consist of neurons that self-optimize through learning. CNNs perform strongly on large grid-structured data represented by matrices, such as images broken down into their pixel values [46]. A small n×n kernel slides over the entire feature matrix, performing convolutions over the superposed region [12]. The feature map size can be kept consistent across multiple convolutions using zero padding, while operations such as max pooling reduce the amount of computation and still retain the important information [27]. As the feature maps pass through successive convolutional layers, the filters learn to detect increasingly abstract patterns and features.
EEG-based emotion classification using CNNs was also explored in [47]. Cascade and parallel convolutional recurrent neural networks have been used for EEG human-intended movement classification [52]. Additionally, EEG data can be converted to an image representation after feature extraction before applying the CNN [43]. However, the accuracy of emotion recognition using only a CNN is not high.
The details of the CNN architecture employed in the proposed approach are shown in Fig. 2. The CNN model consists of four convolutional (conv) blocks with 64, 128, 256, and 512 filters, respectively; the conv kernels are of size 5×5 and 3×3. All the conv layers use padding and are followed by max pooling layers that operate over 2×2 sub-windows. The network ends with three fully connected dense layers fed to a c-way softmax [25] classification layer. ReLU activation is employed due to its unity gradient, which passes the maximum amount of error during back-propagation [1]. Dropout regularization is used after every layer, improving the performance of the model via a modest regularization effect [30]. Thereafter, the predictions of the CNN model are fed to the proposed ensemble model for emotion recognition.
Fig. 2 CNN-based model architecture employed in the proposed approach for EEG-based emotion recognition (input shape 62×265×5; four conv blocks with max pooling and dropout, followed by a fully connected block and a softmax output).
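For illustration, a minimal Keras sketch of such a network is given below. The filter counts (64, 128, 256, 512), zero padding, 2×2 max pooling, dropout after every block, three dense layers, and the softmax head follow the description above; using a single conv layer per block, the 5×5/3×3 kernel assignment, the dense widths, and the dropout rates are our assumptions, not the authors' exact configuration.

from tensorflow.keras import layers, models

def conv_block(x, filters, kernel, rate=0.25):
    # conv + 2x2 max pooling + dropout, one block of Fig. 2
    x = layers.Conv2D(filters, kernel, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    return layers.Dropout(rate)(x)

inp = layers.Input(shape=(62, 265, 5))   # SEED: 62 channels x 265 frames x 5 bands
x = conv_block(inp, 64, (5, 5))          # -> (31, 132, 64)
x = conv_block(x, 128, (5, 5))           # -> (15, 66, 128)
x = conv_block(x, 256, (3, 3))           # -> (7, 33, 256)
x = conv_block(x, 512, (3, 3))           # -> (3, 16, 512)
x = layers.Flatten()(x)
for units in (512, 256, 64):             # three fully connected layers (widths assumed)
    x = layers.Dense(units, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
out = layers.Dense(3, activation="softmax")(x)  # neutral / positive / negative
cnn_model = models.Model(inp, out)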
4.2 LSTM-based model
LSTM networks are modified recurrent neural networks (RNNs) capable of learning long-term dependencies. An LSTM network is parametrized by weight matrices from the input and the previous state for each of its gates, in addition to the memory cell, which overcomes the vanishing/exploding gradient problem [10].
We use the standard formulation of LSTMs, with the logistic function (σ) [4] on the gates and the hyperbolic tangent [2] on the activations. The input is of shape 1325×62. The model has four LSTM layers with dropout in between, after which the output is passed to a fully connected network; a softmax activation function [25] predicts the final output. The block diagram of the LSTM architecture is shown in Fig. 3.
Fig. 3 LSTM-based architecture employed in the proposed approach for EEG-based emotion recognition (input shape 1325×62; four LSTM layers with dropout, followed by a fully connected block and a softmax output).
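A corresponding Keras sketch is shown below, assuming four stacked LSTM layers with dropout in between and a flattened sequence output feeding the dense block. The 256- and 128-unit widths are read from the output shapes reported in Fig. 3; the remaining widths and the dropout rates are our assumptions.

from tensorflow.keras import layers, models

inp = layers.Input(shape=(1325, 62))        # time steps x EEG channels
x = inp
for units in (256, 128, 128, 256):          # four LSTM layers (middle widths assumed)
    x = layers.LSTM(units, return_sequences=True)(x)
    x = layers.Dropout(0.3)(x)
x = layers.Flatten()(x)                     # (1325 * 256,) = (339200,)
for units in (512, 256, 64):
    x = layers.Dense(units, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
out = layers.Dense(3, activation="softmax")(x)
lstm_model = models.Model(inp, out)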
4.3 Hybrid model
The hybrid model combines more than one base model in series; Fig. 4 shows its structure. The hybrid model improves performance by capturing information that the individual models leave undetected.
The first three blocks of the hybrid model are convolutional (conv) blocks, containing max pool layers and dropout regularization to avoid overfitting [30]. The output shape of the third conv block is 15×66×512, whereas the LSTM block expects an input of shape 66×7680. A reshape layer is therefore employed between the conv and LSTM blocks to resolve this dimensional mismatch: in general, 2D conv blocks operate on inputs in R³, while LSTM inputs lie in R². The LSTM network uses the tanh activation function [2] and batch norm regularization [16]. The output of the LSTM block is passed to a fully connected network that uses softmax [25] to calculate the output probabilities.
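The conv-to-LSTM handover can be sketched as follows. The permute-plus-reshape pair is one way to realize the reshape layer: it turns the (15, 66, 512) conv output into a 66-step sequence of 15 × 512 = 7680-dimensional vectors, matching the shapes reported above; the filter counts and LSTM widths are otherwise simplified assumptions.

from tensorflow.keras import layers, models

inp = layers.Input(shape=(62, 265, 5))
x = layers.Conv2D(128, (5, 5), padding="same", activation="relu")(inp)
x = layers.MaxPooling2D((2, 2))(x)                    # -> (31, 132, 128)
x = layers.Conv2D(256, (3, 3), padding="same", activation="relu")(x)
x = layers.MaxPooling2D((2, 2))(x)                    # -> (15, 66, 256)
x = layers.Conv2D(512, (3, 3), padding="same", activation="relu")(x)  # -> (15, 66, 512)
x = layers.Permute((2, 1, 3))(x)                      # -> (66, 15, 512)
x = layers.Reshape((66, 15 * 512))(x)                 # -> (66, 7680)
x = layers.LSTM(256, return_sequences=True, activation="tanh")(x)
x = layers.BatchNormalization()(x)                    # batch norm regularization
x = layers.LSTM(128, activation="tanh")(x)
out = layers.Dense(3, activation="softmax")(x)
hybrid_model = models.Model(inp, out)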
4.4 Ensemble learning-based model
Ensemble learning is mainly of two types, namely homogeneous and heterogeneous. It combines the predictions of multiple models, integrating the individual strengths of the base models, which results in robustness and improved overall performance [50]. Ensemble learning is homogeneous when the base models are of the same type; in the proposed approach it is heterogeneous, as the base models differ.
Once the models are trained, a statistical method is used to combine their predictions; common choices are bagging, boosting, and stacking. We employ stacking, as it is suitable for heterogeneous ensembles [44]. In stacking, separate models learn in parallel on the dataset, and a small meta-model, usually a feed-forward neural network (FNN), combines the individual predictions into the final output. The meta-model receives the predictions of the base models as its input.
Fig. 4 Hybrid model employed in the proposed approach for EEG-based emotion recognition (input shape 62×265×5; three conv blocks, a reshape to 66×7680, an LSTM block, and a fully connected block with softmax output).
Fig. 5 Ensemble model employed in the proposed approach for EEG-based emotion recognition: the predictions of the CNN, LSTM, and hybrid models are combined by either a max function or a meta prediction model to produce the final prediction.
The meta-model [48] learns to produce the final output prediction from these inputs. In addition to stacking, we have also investigated the max function as a statistical method to combine the predictions. Fig. 5 shows the block diagram of the ensemble model. The meta-model used in the stacking method consists of four fully connected (FC) layers followed by a softmax classifier [25].
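As a sketch of the stacking step, the class probabilities predicted by the three trained base models can be concatenated and fed to a small FC meta-model; the layer widths below are illustrative assumptions, and the max-function alternative simply keeps the most confident base prediction.

import numpy as np
from tensorflow.keras import layers, models

def stacked_features(base_models, inputs):
    # concatenate the 3-class probability outputs of the CNN, LSTM, hybrid models
    preds = [m.predict(x) for m, x in zip(base_models, inputs)]
    return np.concatenate(preds, axis=1)        # shape: (samples, 9)

def build_meta_model(n_models=3, n_classes=3):
    inp = layers.Input(shape=(n_models * n_classes,))
    x = inp
    for units in (32, 16, 16, 8):               # four FC layers (widths assumed)
        x = layers.Dense(units, activation="relu")(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inp, out)

def max_combine(preds):
    # preds: (samples, models, classes); pick the most confident base model
    best = preds.max(axis=2).argmax(axis=1)
    return preds[np.arange(len(preds)), best].argmax(axis=1)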
5 Results & discussion
The proposed approach has been evaluated on two datasets, namely SEED and DEAP; it should be noted that the same features and experimental setup are used for both. The performance of each model is assessed using the k-fold cross-validation test [33] with k = 10. The individual performances of the CNN, LSTM, and hybrid models are obtained, as well as that of the ensemble model. Each model is trained for 60 epochs with a batch size of 64. The learning rate (LR) is not fixed, because a fixed LR leads to saturation of the loss and no further improvement in the performance of the model; to overcome this limitation, we employ an LR annealer that makes the learning rate a variable parameter.
5.1 Experimental results for SEED data
The performance of the individual models is measured using weighted average precision (WAP), weighted average sensitivity (WAS), and weighted average F1 score (WAF1); the F1 score is a good metric for checking the stability of a model. Table 1 tabulates these parameters for the individual models, the hybrid model, and the ensemble model for EEG emotion recognition.
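These weighted averages weight each per-class score by its support (the number of true samples in that class), so WAS is the weighted recall. A minimal scikit-learn sketch, assuming integer class labels:

from sklearn.metrics import (accuracy_score, matthews_corrcoef,
                             precision_recall_fscore_support)

def weighted_report(y_true, y_pred):
    # 'weighted' averages the per-class scores by class support
    wap, was, waf1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0)
    return {"ACC": accuracy_score(y_true, y_pred),
            "MCC": matthews_corrcoef(y_true, y_pred),
            "WAP": wap, "WAS": was, "WAF1": waf1}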
The experimental results show that the CNN and LSTM models individually achieve classification accuracies (ACC) of 89.53% and 89.99%, respectively, while the hybrid model achieves 93.46%. The ensemble model achieves 97.16% ACC with stack-based ensemble learning. The results for the SEED data are tabulated in Table 1, from which it can be noticed that the ensemble-based method outperforms the other models. We attribute this to the base models not being weak: each provides good accuracy by itself.
Fig. 7 plots the loss and the LR against the training epoch. When the LR saturates after some epochs, the loss no longer decreases significantly, which degrades model performance. Conversely, when the LR is decreased as the loss saturates, the loss settles more quickly and system performance improves.
We have also shown box-and-whisker plots of the ACC to shed more light on the results. The box indicates the inter-quartile range (IQR), covering the results from the 25th to the 75th percentile, with the orange line marking the median. The whiskers, solid black lines at the top and bottom of the box, mark the minimum and maximum values within a range of 1.5×IQR, and results outside this range are plotted as circles (outliers).
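Plots of this kind can be reproduced with matplotlib's boxplot, which draws the IQR box, median line, 1.5×IQR whiskers, and outlier circles exactly as described; the variable names below are illustrative.

import matplotlib.pyplot as plt

def plot_fold_accuracies(fold_acc):
    # fold_acc: dict mapping model name -> list of per-fold accuracies (k = 10)
    fig, ax = plt.subplots()
    ax.boxplot(list(fold_acc.values()), labels=list(fold_acc.keys()), whis=1.5)
    ax.set_ylabel("Accuracy (%)")
    plt.show()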
We further compare the experimental results of the proposed approach with past benchmark methodologies for emotion recognition on the SEED dataset; Table 2 tabulates this comparison.
Table 1 Classification performance of the individual and ensemble models for EEG-based emotion recognition on SEED data (MCC: Matthews correlation coefficient).
Model | MCC | WAP (%) | WAS (%) | WAF1 (%) | Avg ACC (%) | STD (%)
CNN-based | 0.819 | 89 | 88 | 88.49 | 89.53 | 1.78
LSTM-based | 0.821 | 89 | 89 | 89.00 | 89.99 | 1.92
Hybrid | 0.867 | 94 | 93 | 93.49 | 93.46 | 1.36
Ensemble | 0.90 | 97 | 97 | 97.00 | 97.16 | 1.08
Fig. 6 Box-and-whisker plots of the ACC achieved by the proposed (top-left) CNN-based model, (top-right) LSTM-based model, (bottom-left) hybrid model, and (bottom-right) ensemble stack model for EEG-based emotion recognition.
It can be observed that the proposed approach outperforms the previous methodologies, and that its standard deviation (STD) is much lower than those of the other approaches tabulated in Table 2. This reflects the repeatability and reproducibility of the proposed approach.
5.2 Experimental results for DEAP data
We have also employed the DEAP dataset to evaluate the performance of the proposed approach for EEG-based emotion analysis, with the same features and experimental setup.
Fig. 7 (a) Loss and (b) learning rate plotted against the training epoch.
The performance of the proposed approach on the DEAP dataset is tabulated in Table 3, from which it can be observed that the ensemble obtains the best performance among the models: the CNN-based, LSTM-based, and hybrid models achieve classification accuracies of 63.50%, 63.89%, and 64.02%, respectively, while the ensemble model performs better than every individual model. Table 4 compares existing works on DEAP data that use the same DE feature; it can be observed that the proposed system attains better performance than these existing methods for EEG-based emotion recognition.
Table 2 Comparison with previous benchmark methodologies for EEG-based emotion recognition on SEED data.
Paper | Model | Feature | ACC (%) | STD (%)
Li et al. [18] | DAN | DE | 83.81 | 8.56
Wang et al. [49] | PGCNN | RASM | 84.35 | 10.28
Song et al. [41] | DGCNN | DE | 90.40 | 8.49
Hwang et al. [14] | CNN | TP-DE (cubic) | 90.41 | 8.71
Liu et al. [26] | BDAE | DE | 91.01 | 8.91
Zheng et al. [53] | GELM | DE | 91.07 | 7.54
Tang et al. [45] | Bimodal-LSTM | PSD+DE | 93.97 | -
Qiu et al. [32] | DCCA | - | 94.58 | 6.16
Liu et al. [23] | CNN+SAE+DNN | PCC | 96.77 | -
Proposed approach | Ensemble (stack) | DE | 97.16 | 1.08
Table 3 Classification performance of the individual and ensemble models for EEG-based emotion recognition on DEAP data.
Classification method | WAP (%) | WAS (%) | WAF1 (%) | Avg ACC (%) | STD (%)
CNN | 64 | 74 | 68 | 63.50 | 3.89
LSTM | 64.3 | 74 | 68 | 63.89 | 3.78
Hybrid | 64 | 75 | 69 | 64.02 | 3.84
Ensemble | 65 | 75 | 70 | 65.00 | 3.57
In future work, we plan to extend this study by proposing new features for effective emotion recognition from EEG signals. The SEED and DEAP datasets will also be evaluated with the new features to further improve the existing performance. We also intend to test the proposed model for other EEG-based neuronal system developments.
Table 4 Comparison with previous benchmark methodologies for EEG-based emotion recognition on DEAP data.
Paper | Classification method | Feature | Avg ACC (%) | STD (%)
Lan et al. [17] | Logistic regression | DE | 48.93 | 15.50
Li et al. [20] | SVM | DE | 57.00 | 0.3
Proposed approach | Ensemble (stack) | DE | 65.00 | 3.57
6 Conclusion
This paper proposes an ensemble learning-based EEG emotion recognition system. First, differential entropy features are extracted from different frequency bands of the EEG signals. These features are fed to CNN and LSTM based models, and a hybrid model is developed by combining sub-blocks of the CNN and LSTM models. An ensemble model is then built from the CNN, LSTM, and hybrid models. The experimental results show that the ensemble model achieves better classification performance than the other models employed in the proposed approach, outperforming the compared methodologies with 97.16% ACC for EEG-based emotion recognition on the SEED dataset. The proposed method is also evaluated on the DEAP dataset and obtains 65% ACC with the same features and model parameters. All the models provide good accuracy individually and show a much lower standard deviation than the compared methods.
BCI is an emerging field that relies on the accurate, repeatable, and efficient classification of brain waves, frequently recorded by EEG. The experimental results suggest that the proposed approach is suitable for this purpose and paves the way for upcoming research areas such as humanoid robots, sophisticated prosthetics, and AI-assisted healthcare and recovery. In the future, a hardware implementation of the proposed model can be developed.
References
1. Agarap, A.F.: Deep learning using rectified linear units (relu) (2019)
2. Anastassiou, G.A.: Multivariate hyperbolic tangent neural network approximation.
Computers & Mathematics with Applications 61(4), 809–821 (2011)
3. Bos, D.O., et al.: Eeg-based emotion recognition. The influence of visual and auditory
stimuli 56(3), 1–17 (2006)
4. Chen, Z., Cao, F., Hu, J.: Approximation by network operators with logistic activation
functions. Applied Mathematics and Computation 256, 565–571 (2015)
5. Cheng, B., Liu, G.: Emotion recognition from surface emg signal using wavelet transform
and neural network. In: Proceedings of the 2nd international conference on bioinfor-
matics and biomedical engineering (ICBBE), pp. 1363–1366 (2008)
6. Cui, H., Liu, A., Zhang, X., Chen, X., Wang, K., Chen, X.: Eeg-based emotion recogni-
tion using an end-to-end regional-asymmetric convolutional neural network. Knowledge-
Based Systems 205, 106243 (2020)
7. Dash, M., Liu, H.: Feature selection for classification. Intelligent data analysis 1(1-4),
131–156 (1997)
8. Duan, R.N., Zhu, J.Y., Lu, B.L.: Differential entropy feature for EEG-based emotion
classification. In: 6th International IEEE/EMBS Conference on Neural Engineering
(NER), pp. 81–84. IEEE (2013)
9. Glowinski, D., Camurri, A., Volpe, G., Dael, N., Scherer, K.: Technique for automatic
emotion recognition by body gesture analysis. In: 2008 IEEE Computer Society Con-
ference on Computer Vision and Pattern Recognition Workshops, pp. 1–6. IEEE (2008)
10. Greff, K., Srivastava, R.K., Koutnik, J., Steunebrink, B.R., Schmidhuber, J.: Lstm: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems 28(10), 2222–2232 (2017)
11. Grozea, C., Voinescu, C.D., Fazli, S.: Bristle-sensors: low-cost flexible passive dry eeg electrodes for neurofeedback and bci applications. Journal of neural engineering 8(2), 025008 (2011)
12. Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X.,
Wang, G., Cai, J., Chen, T.: Recent advances in convolutional neural networks. Pattern
Recognition 77, 354–377 (2018)
13. Huang, Y.J., Wu, C.Y., Wong, A.M.K., Lin, B.S.: Novel active comb-shaped dry elec-
trode for eeg measurement in hairy site. IEEE Transactions on Biomedical Engineering
62(1), 256–263 (2014)
14. Hwang, S., Hong, K., Son, G., Byun, H.: Learning cnn features from de features for
eeg-based emotion recognition. Pattern Analysis and Applications 23(3), 1323–1335
(2020)
15. Kurbalija, V., Ivanović, M., Radovanović, M., Geler, Z., Dai, W., Zhao, W.: Emotion perception and recognition: An exploration of cultural differences and similarities. Cognitive Systems Research 52, 103–116 (2018)
16. van Laarhoven, T.: L2 regularization versus batch and weight normalization (2017)
17. Lan, Z., Sourina, O., Wang, L., Scherer, R., Müller-Putz, G.R.: Domain adaptation techniques for eeg-based emotion recognition: A comparative study on two public datasets. IEEE Transactions on Cognitive and Developmental Systems 11(1), 85–94 (2019)
18. Li, H., Jin, Y.M., Zheng, W.L., Lu, B.L.: Cross-subject emotion recognition using deep
adaptation networks. In: Neural Information Processing, pp. 403–413. Springer Inter-
national Publishing (2018)
19. Li, M., Lu, B.L.: Emotion classification based on gamma-band eeg. In: 2009 Annual
International Conference of the IEEE Engineering in medicine and biology society, pp.
1223–1226. IEEE (2009)
20. Li, P., Liu, H., Si, Y., Li, C., Li, F., Zhu, X., Huang, X., Zeng, Y., Yao, D., Zhang, Y.,
Xu, P.: Eeg based emotion recognition by combining functional connectivity network
and local activations. IEEE Transactions on Biomedical Engineering 66(10), 2869–2881
(2019)
21. Li, X., Song, D., Zhang, P., Zhang, Y., Hou, Y., Hu, B.: Exploring eeg features in
cross-subject emotion recognition. Frontiers in neuroscience 12, 162 (2018)
22. Li, Y., Zheng, W., Zong, Y., Cui, Z., Zhang, T., Zhou, X.: A bi-hemisphere domain
adversarial neural network model for eeg emotion recognition. IEEE Transactions on
Affective Computing (2018)
23. Liu, J., Wu, G., Luo, Y., Qiu, S., Yang, S., Li, W., Bi, Y.: Eeg-based emotion clas-
sification using a deep neural network and sparse autoencoder. Frontiers in Systems
Neuroscience 14, 43 (2020)
24. Liu, N.H., Chiang, C.Y., Hsu, H.M.: Improving driver alertness through music selection
using a mobile eeg to detect brainwaves. Sensors 13(7), 8199–8221 (2013)
25. Liu, W., Wen, Y., Yu, Z., Yang, M.: Large-margin softmax loss for convolutional neural
networks (2017)
26. Liu, W., Zheng, W.L., Lu, B.L.: Multimodal emotion recognition using multimodal deep
learning (2016)
27. Sun, M., Song, Z., Jiang, X., Pan, J., Pang, Y.: Learning pooling for convolutional neural network. Neurocomputing 224 (2016)
28. Mariooryad, S., Busso, C.: Compensating for speaker or lexical variabilities in speech
for emotion recognition. Speech Communication 57, 1–12 (2014)
29. Mathersul, D., Williams, L.M., Hopkinson, P.J., Kemp, A.H.: Investigating models of
affect: relationships among eeg alpha asymmetry, depression, and anxiety. Emotion
8(4), 560 (2008)
30. Park, S., Kwak, N.: Analysis on the dropout effect in convolutional neural networks.
pp. 189–204 (2017)
31. Piana, S., Staglianò, A., Odone, F., Camurri, A.: Adaptive body gesture representation for automatic emotion recognition. ACM Transactions on Interactive Intelligent Systems (TiiS) 6(1), 1–31 (2016)
32. Qiu, J.L., Liu, W., Lu, B.L.: Multi-view emotion recognition using deep canonical corre-
lation analysis. In: Neural Information Processing, pp. 221–231. Springer International
Publishing (2018)
33. Rodriguez, J.D., Perez, A., Lozano, J.A.: Sensitivity analysis of k-fold cross validation
in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine
Intelligence 32(3), 569–575 (2010)
34. Sammler, D., Grigutsch, M., Fritz, T., Koelsch, S.: Music and emotion: electrophysio-
logical correlates of the processing of pleasant and unpleasant music. Psychophysiology
44(2), 293–304 (2007)
35. Sarkar, P., Etemad, A.: Self-supervised ecg representation learning for emotion recog-
nition. IEEE Transactions on Affective Computing (2020)
36. Sauvet, F., Bougard, C., Coroenne, M., Lely, L., Van Beers, P., Elbaz, M., Guillard,
M., Leger, D., Chennaoui, M.: In-flight automatic detection of vigilance states using a
single eeg channel. IEEE Transactions on Biomedical Engineering 61(12), 2840–2847
(2014)
37. Sharma, R., Sahu, S.S., Upadhyay, A., Sharma, R.R., Sahoo, A.K.: Sleep stage clas-
sification using DWT and dispersion entropy applied on EEG signals. In: Computer-
aided Design and Diagnosis Methods for Biomedical Applications, pp. 35–56. CRC Press
(2021)
38. Sharma, R.R., Pachori, R.B.: Time-frequency representation using ievdhm-ht with ap-
plication to classification of epileptic eeg signals. IET Science, Measurement & Tech-
nology 12(1), 72–82 (2017)
39. Sharma, S., Sharma, R.R.: Variational mode decomposition based finger flexion move-
ment detection using ECoG signals. In: Artificial Intelligence-based Brain-Computer
Interface, pp. 101–119. Elsevier (2022)
40. Shu, L., Xie, J., Yang, M., Li, Z., Li, Z., Liao, D., Xu, X., Yang, X.: A review of emotion
recognition using physiological signals. Sensors 18(7), 2074 (2018)
41. Song, T., Zheng, W., Song, P., Cui, Z.: Eeg emotion recognition using dynamical graph
convolutional neural networks. IEEE Transactions on Affective Computing 11(3), 532–
541 (2020)
42. Subasi, A., Gursoy, M.I.: Eeg signal classification using pca, ica, lda and support vector
machines. Expert systems with applications 37(12), 8659–8666 (2010)
43. Tabar, Y.R., Halici, U.: A novel deep learning approach for classification of EEG motor
imagery signals. Journal of Neural Engineering 14(1), 016003 (2016)
44. Tahir, M.A., Kittler, J., Bouridane, A.: Multilabel classification using heterogeneous
ensemble of multi-label classifiers. Pattern Recognition Letters 33(5), 513–523 (2012)
45. Tang, H., Liu, W., Zheng, W.L., Lu, B.L.: Multimodal emotion recognition using deep
neural networks. In: Neural Information Processing, pp. 811–819. Springer International
Publishing (2017)
46. Teuwen, J., Moriakov, N.: Chapter 20 - convolutional neural networks. In: Handbook
of Medical Image Computing and Computer Assisted Intervention, The Elsevier and
MICCAI Society Book Series, pp. 481–501. Academic Press (2020)
47. Tripathi, S., Acharya, S., Sharma, R.D., Mittal, S., Bhattacharya, S.: Using deep and convolutional neural networks for accurate emotion classification on deap dataset. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pp. 4746–4752. AAAI Press (2017)
48. Wang, Y.X., Ramanan, D., Hebert, M.: Meta-learning to detect rare objects. In: Pro-
ceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
49. Wang, Z., Tong, Y., Heng, X.: Phase-locking value based graph convolutional neural
networks for emotion recognition. IEEE Access 7, 93711–93722 (2019)
50. Webb, G.I., Zheng, Z.: Multistrategy ensemble learning: reducing error by combining
ensemble learning techniques. IEEE Transactions on Knowledge and Data Engineering
16(8), 980–991 (2004)
51. Young, A.W., Rowland, D., Calder, A.J., Etcoff, N.L., Seth, A., Perrett, D.I.: Facial
expression megamix: Tests of dimensional and category accounts of emotion recognition.
Cognition 63(3), 271–313 (1997)
52. Zhang, D., Yao, L., Zhang, X., Wang, S., Chen, W., Boots, R., Benatallah, B.: Cascade
and parallel convolutional recurrent neural networks on eeg-based intention recognition
for brain computer interface. In: Proceedings of the Thirty-Second AAAI Conference
on Artificial Intelligence, pp. 1703–1710. AAAI Press (2018)
53. Zheng, W., Zhu, J., Peng, Y., Lu, B.: EEG-based emotion classification using deep belief
networks. In: 2014 IEEE International Conference on Multimedia and Expo (ICME),
pp. 1–6 (2014)
54. Zheng, W.L., Lu, B.L.: Investigating critical frequency bands and channels for EEG-
based emotion recognition with deep neural networks. IEEE Transactions on Au-
tonomous Mental Development 7(3), 162–175 (2015)
55. Zheng, W.L., Zhu, J.Y., Lu, B.L.: Identifying stable patterns over time for emotion
recognition from eeg. IEEE Transactions on Affective Computing 10(3), 417–429 (2017)
56. Zhong, P., Wang, D., Miao, C.: Eeg-based emotion recognition using regularized graph
neural networks. IEEE Transactions on Affective Computing (2020)