CNN and LSTM based Ensemble Learning for
Human Emotion Recognition using EEG Recordings
Abhishek Iyer · Srimit Sritik Das · Reva Teotia · Shishir Maheshwari · Rishi Raj Sharma
Abstract Emotion is a significant parameter in daily life and is considered an
important factor for human interactions. Human-machine interactions and
their advanced stages, such as humanoid robots, essentially require emotional
investigation. This paper proposes a novel method for human emotion recognition
using electroencephalogram (EEG) signals. We have considered three emotions,
namely neutral, positive, and negative. The EEG signals are separated into
five frequency bands according to EEG rhythms, and the differential entropy
is computed over the different frequency band components. A convolutional
neural network (CNN) and long short-term memory (LSTM) based hybrid
model is developed for accurate emotion detection. Further, the extracted
features are fed to all three models (CNN, LSTM, and hybrid) for emotion
recognition. Finally, an ensemble model combines the predictions of all three
models. The proposed approach is validated on two datasets, namely SEED
and DEAP, for EEG-based emotion analysis. The developed method achieved
97.16% accuracy on the SEED dataset for emotion classification. The
experimental results indicate that the proposed approach is effective and yields
better performance than the compared methods for EEG-based emotion analysis.
Keywords Emotion recognition · EEG · Hybrid model · Differential entropy · LSTM
Abhishek Iyer · Srimit Sritik Das · Reva Teotia · Shishir Maheshwari
Department of Electrical and Electronics Engineering, Birla Institute of Technology and
Science, Pilani-333031, India
E-mail: f20181105@pilani.bits-pilani.ac.in, f20180527@pilani.bits-pilani.ac.in,
f20190268@pilani.bits-pilani.ac.in, shishir.maheshwari@pilani.bits-pilani.ac.in
Rishi Raj Sharma
Department of Electronics Engineering, Defence Institute of Advanced Technology, Pune-
411025, India
E-mail: dr.rrsrrs@gmail.com
1 Introduction
Emotion is a pivotal factor in human life as it affects the working ability,
mental state, and judgment of human beings. Numerous experts have worked on
this topic in different disciplines such as psychology, cognitive science,
neuroscience, computer technology, brain-computer interfacing (BCI) [39], and
others. Electroencephalogram (EEG) based emotion recognition has created
substantial scope in these disciplines by capturing the actual affective state
[21]. BCI has numerous applications of EEG-based emotion recognition,
including humanoid robots. Most humanoid robots lack emotional capability, and
this field remains largely unexplored. In effective BCI, emotion recognition is
a major component, and it becomes complex due to the fuzzy nature of emotion.
Human emotion correlates with context, language, time, space, culture, and
other components. Therefore, absolute true labels for different emotions are
not attainable from EEG recordings, which complicates the problem [3].
Many authors have proposed emotion recognition methods based on facial
expression [51], gesture [31], [9], posture, speech [28], and other physical
signals. These types of data are easy to record but can also be easily
controlled, thereby falsifying the true emotion [15]. Controlling or mimicking
nervous-system-related signals is very difficult as they are activated
involuntarily [40], and only subject experts can control them. Therefore, the
true emotion signature can be observed in nervous-system-related recordings.
Several physiological recordings such as EEG, electrocardiogram (ECG) [35],
temperature, electromyogram (EMG) [5], respiration, and galvanic skin response
(GSR) can be used to study human emotion [40]. A detailed investigation of
brain activity under various emotions can assist in building accurate and
computationally efficient emotion recognition models. Recent advances in dry
electrode implementation [11], [13] and wearable devices promote EEG-based
emotion identification in real time for mental state monitoring [24], [36].
EEG-based emotion recognition is one of the key features required in
human-machine interaction (HMI) and humanoid robots. This study focuses on
EEG-based human emotion analysis, in which the electrical activity of the
brain is investigated during different emotions (neutral, positive, and
negative).
Many studies have been performed on EEG-based human emotion analysis,
attempting to establish a definitive relationship between EEG signals and
different emotions [29], [34]. EEG signal analysis is a very challenging task
as the signal is non-stationary [37]. In a real-time scenario, other signals
are added into the EEG recording and the signal-to-noise ratio (SNR) becomes
low. Matrix decomposition-based EEG signal analysis methods have been
proposed, but their high complexity makes real-time implementation difficult
[7], [42], [38]. Emotion-related stable patterns in EEG recordings are
observed in [55], which uses the DEAP dataset. The critical frequencies of
emotion and significant channel selection in EEG recordings are examined in
detail in [54]. This channel selection is useful for finding the electrode
positions for emotion analysis. Time-frequency based and various non-linear
features have been studied for EEG-based emotion recognition, achieving
59.06% accuracy (ACC) on DEAP data [21]. It has been suggested that the gamma
band in EEG recordings is more correlated with emotional function [19].
Many machine learning based architectures have been proposed for EEG-based
emotion examination. A bi-hemisphere based neural network was designed for EEG
emotion detection, and the experiment performed on the SEED dataset achieved
63.50% ACC [22]. Graph neural network based emotion recognition was performed
on the same dataset, reaching 89.23% ACC using the gamma band and 94.24% ACC
with all bands [56]. A regional asymmetric convolutional neural network (CNN)
based study was carried out on DEAP data and acquired 95% ACC for arousal and
valence emotion detection [6]. In most of these methods, existing models are
improved to achieve good classification of human emotion. The proposed
approach employs multiple models and develops a hybrid approach to attain
better ACC than the existing methods. The two models, based on CNN and long
short-term memory (LSTM), are hybridized, and the final prediction is improved
using an ensemble model.
The rest of the article is organized in the following manner. Section 2
presents the datasets. The proposed approach with its features is explained in
Section 3. The proposed hybrid model, along with the CNN and LSTM based
models, is presented in Section 4, which also includes the implementation of
ensemble learning. Results are explained in Section 5. Finally, the article is
concluded in Section 6.
2 Dataset
We have used two datasets for EEG-based emotion recognition. A detailed
explanation of both datasets is given in the following subsections.
2.1 SEED data
The database employed in the proposed approach has been obtained from the
Center for Brain-like Computing and Machine Intelligence (BCMI). We employed
the SJTU emotion EEG dataset (SEED) [54], [8]. The dataset contains EEG data
of 15 subjects (7 males and 8 females) recorded in three separate sessions,
each session having 15 trials. In each trial, the EEG signal is recorded while
the subject watches Chinese film clips eliciting three types of emotions,
namely positive, neutral, and negative. The duration of each film clip is
about 4 minutes, and two film clips targeting the same emotion are not shown
consecutively. The participants reported their emotional reactions to each
film clip by completing a questionnaire immediately after watching it. The EEG
signals are recorded using a 62-channel electrode cap according to the
international 10-20 system. The data is then down-sampled to 200 Hz to make
the system faster, and a 0-75 Hz band-pass filter is applied, which retains
all the EEG rhythm information.
4 Abhishek Iyer et al.
2.2 DEAP data
The DEAP data has been recorded for the analysis of human emotion using EEG
signals. It was recorded from 32 healthy participants aged between 19 and 37
years, of whom 16 were female. Each participant was exposed to 40 music
videos, each of one-minute duration with the same emotion throughout the video
length. The data comprises 40 channels, out of which the 32 EEG channels have
been investigated in this paper. The data is recorded with Biosemi ActiveTwo
devices at a sampling rate of 512 Hz. It is further downsampled to 128 Hz to
reduce the system complexity. The DEAP data provides 32 files, where each file
contains the 40-channel EEG recording of 40 videos of one minute duration
each.
3 Proposed approach
A block diagram of the proposed approach is shown in Fig. 1. All the subjects
were made to sit on a chair in the resting state and asked to watch videos
portraying different emotions. Simultaneously, EEG signals are recorded and
pre-processed. The differential entropy (DE) based features are computed in
five EEG rhythms; these features are explained in the next subsection.
Further, CNN and LSTM models are employed and combined to obtain the hybrid
model. Thereafter, an ensemble model is built on top of these models.
3.1 Features extraction
We have employed DE as a feature in the proposed approach. DE extends the idea
of Shannon entropy and is used to measure the complexity of a continuous
random variable. DE as a feature was first introduced to EEG-based emotion
recognition by Duan et al. [8]. It has been found to be better suited for
emotion recognition than traditional features, as it has a balanced ability to
discriminate EEG patterns between low- and high-frequency energy. DE features
extracted from EEG data provide stable and accurate information for emotion
classification [53]. The differential entropy is defined as:
h(Y) = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(y-\mu)^{2}}{2\sigma^{2}}} \log\left(\frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(y-\mu)^{2}}{2\sigma^{2}}}\right) dy = \frac{1}{2}\log\left(2\pi e \sigma^{2}\right)    (1)

where the time series Y obeys the Gaussian distribution N(µ, σ²).
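Under this Gaussian assumption, Eq. (1) reduces DE to a simple function of the signal variance. The following sketch (illustrative Python/NumPy, not the authors' code) verifies the closed form numerically with a plug-in variance estimate on a synthetic signal:

```python
import numpy as np

def gaussian_de(variance):
    """Closed-form differential entropy of a Gaussian, Eq. (1)."""
    return 0.5 * np.log(2 * np.pi * np.e * variance)

rng = np.random.default_rng(0)
sigma = 2.0
x = rng.normal(0.0, sigma, size=10_000)  # synthetic Gaussian "EEG" segment

print(f"analytic  DE: {gaussian_de(sigma ** 2):.4f}")   # true sigma
print(f"estimated DE: {gaussian_de(np.var(x)):.4f}")    # plug-in sample variance
```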
DE was employed to construct features in five frequency bands: delta (1-3 Hz),
theta (4-7 Hz), alpha (8-13 Hz), beta (14-30 Hz), and gamma (31-50 Hz). For
the SEED dataset, the extracted DE feature vector for a sample EEG signal has
310 dimensions, as there are 62 channels for each of the five frequency bands
[54]. Similarly, 32 channels are considered for the DEAP dataset in five EEG
sub-bands, which leads to a total of 160 DE features.

Fig. 1 Block diagram of the proposed system for EEG-based emotion recognition.
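As an illustration of this feature pipeline, the sketch below (an assumption-laden reconstruction, not the authors' code) band-filters a multi-channel segment into the five rhythms and computes per-channel DE under the Gaussian model of Eq. (1); with 62 channels this yields the 310-dimensional vector described above. The Butterworth filter and its order are assumptions, and the 200 Hz sampling rate follows the SEED description:

```python
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha": (8, 13),
         "beta": (14, 30), "gamma": (31, 50)}

def band_de_features(eeg, fs=200.0):
    """eeg: array of shape (channels, samples).
    Returns a (channels * len(BANDS),) DE feature vector."""
    feats = []
    for lo, hi in BANDS.values():
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        filtered = filtfilt(b, a, eeg, axis=1)              # band-limit each channel
        var = np.var(filtered, axis=1)                      # per-channel variance
        feats.append(0.5 * np.log(2 * np.pi * np.e * var))  # DE, Eq. (1)
    return np.concatenate(feats)

segment = np.random.randn(62, 4 * 200)   # a 4-s, 62-channel SEED-like segment
print(band_de_features(segment).shape)   # -> (310,)
```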
The various models, along with the hybrid model and the ensemble model, are
explained in the next section.
4 Models employed for emotion recognition in the proposed system
In the proposed work, the CNN and LSTM based models are initially employed for
emotion recognition. Thereafter, a hybrid model is proposed, which is a
combination of CNN and LSTM. Finally, an ensemble model of these three
proposed models is taken into consideration. All these models are explained in
this section.
4.1 CNN-based model
The idea behind CNNs bears a resemblance to traditional artificial neural
networks (ANNs), consisting of neurons that self-optimize through learning.
CNNs are powerful performers on large sequential data represented by matrices,
such as images broken down into their pixel values [46]. A small n×n kernel
slides over the entire feature matrix, performing convolutions over the
superposed space [12]. The feature map size can be kept consistent across
multiple convolutions using zero-padding. Functions like max pooling are
employed to reduce the amount of computation while still retaining the
important information [27]. As the feature maps pass through the different
convolutional layers, the filters learn to detect patterns and increasingly
abstract features.
EEG-based emotion classification using CNNs was also explored in [47]. Cascade
and parallel convolutional recurrent neural networks have been used for EEG
human-intended movement classification tasks [52]. Additionally, before
applying the CNN, EEG data can be converted to an image representation after
feature extraction [43]. However, the accuracy of emotion recognition using
only a CNN is not high.
The details of the CNN architecture employed in the proposed approach
are shown in Fig. 2. The CNN model consists of four convolutional (conv)
blocks with 64, 128, 256, and 512 filters, respectively. The conv filters have
kernel sizes of 5×5 and 3×3. All conv layers use padding and are followed by
maximum sub-sampling layers operating over 2×2 sub-windows, known as max
pooling layers. The network ends with three fully connected dense layers fed
to a c-way softmax [25] classification layer. ReLU activation is employed due
to its unity gradient, so that the maximum amount of error is passed during
back-propagation [1]. Dropout regularization is used after every layer, which
improves the performance of the model via a modest regularization effect [30].
Thereafter, the predictions of the CNN model are fed to the proposed ensemble
model for emotion recognition.

Fig. 2 CNN-based model architecture employed in the proposed approach for
EEG-based emotion recognition.
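A minimal Keras sketch of such a four-block CNN is given below. The filter counts, 2×2 pooling, dense-layer widths, and input shape follow the description above; the kernel-size assignment per block, dropout rates, optimizer, and the TensorFlow/Keras framework itself are assumptions:

```python
from tensorflow.keras import layers, models

def build_cnn(input_shape=(62, 265, 5), n_classes=3):
    """Four conv blocks (64/128/256/512 filters) with max pooling and
    dropout, then three dense layers and a softmax head."""
    model = models.Sequential([layers.Input(shape=input_shape)])
    for filters, k in [(64, 5), (128, 5), (256, 3), (512, 3)]:
        model.add(layers.Conv2D(filters, k, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2))   # 2x2 sub-sampling: 62x265 -> 3x16
        model.add(layers.Dropout(0.25))     # dropout rate is an assumption
    model.add(layers.Flatten())
    for units in (512, 256, 64):
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.Dropout(0.25))
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

build_cnn().summary()
```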
4.2 LSTM-based model
LSTM networks are modified recurrent neural networks (RNNs) capable of
learning long-term dependencies. An LSTM network is parametrized by weight
matrices from the input and the previous state for each of the gates, in
addition to the memory cell, which overcomes the vanishing/exploding gradient
issue [10].
We use the standard formulation of LSTMs with the logistic function (σ) [4] on
the gates and the hyperbolic tangent [2] on the activations. The input is of
shape 1325×62. The model has 4 LSTM layers with dropouts in between, and the
output is then passed to a fully connected network. A softmax activation
function [25] is used to predict the final output. The block diagram of the
LSTM architecture is shown in Fig. 3.
Fig. 3 LSTM-based architecture employed in the proposed approach for EEG-based
emotion recognition.
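A corresponding sketch of the stacked-LSTM model is shown below. The input shape (1325, 62) and the depth of 4 LSTM layers with interleaved dropout follow the paper; the layer widths, dropout rates, and framework are assumptions:

```python
from tensorflow.keras import layers, models

def build_lstm(input_shape=(1325, 62), n_classes=3):
    """Four stacked LSTM layers with dropout, flattened into a
    fully connected head ending in softmax."""
    model = models.Sequential([layers.Input(shape=input_shape)])
    for units in (256, 128, 128, 256):        # widths are assumptions
        model.add(layers.LSTM(units, return_sequences=True))
        model.add(layers.Dropout(0.3))
    model.add(layers.Flatten())               # (1325, 256) -> 339200 features
    for units in (512, 256, 64):
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.Dropout(0.3))
    model.add(layers.Dense(n_classes, activation="softmax"))
    return model

build_lstm().summary()
```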
4.3 Hybrid model
The hybrid model combines more than one base model in series. Fig. 4 shows the
structure of the hybrid model employed in the proposed approach. The hybrid
model improves performance by capturing information that the individual models
leave undetected.
The first three blocks of the hybrid model are convolutional (conv) blocks.
Each conv block contains max pooling layers and dropout regularization to
avoid overfitting [30]. The output shape of the third conv block is 15×66×512,
while the input shape expected by the LSTM block is 66×7680. A reshape layer
is employed between the conv and LSTM blocks to resolve this dimensional
mismatch: in general, 2D conv blocks operate on inputs in R^3, while LSTM
inputs lie in R^2. The LSTM network uses the tanh activation function [2] and
batch normalization [16]. The output of the LSTM block is passed to a fully
connected network that uses softmax [25] to calculate the output
probabilities.
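The reshape step can be sketched as follows: treating the 66 width positions of the (15, 66, 512) conv output as timesteps gives 15 × 512 = 7680 features per step, matching the 66×7680 LSTM input quoted above. The Permute/Reshape bridge and the LSTM width are assumptions:

```python
from tensorflow.keras import layers, models

# Bridging the conv and LSTM blocks of the hybrid model (sketch).
conv_out = layers.Input(shape=(15, 66, 512))       # third conv block's output
seq = layers.Permute((2, 1, 3))(conv_out)          # -> (66, 15, 512)
seq = layers.Reshape((66, 15 * 512))(seq)          # -> (66, 7680) sequence
out = layers.LSTM(128)(seq)                        # width 128 is an assumption

bridge = models.Model(conv_out, out)
bridge.summary()
```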
4.4 Ensemble learning-based model
Ensemble learning is mainly of two types, namely homogeneous and
heterogeneous. It combines the predictions from multiple models and integrates
the individual strengths of the base models, resulting in robustness and
improved performance of the overall approach [50]. Ensemble learning is
homogeneous when the base models are of the same type. In the proposed
approach, ensemble learning is heterogeneous as the base models are different.

Once these models are trained, a statistical method is used to combine the
predictions of the different models. Common combination methods include
bagging, boosting, and stacking. We have employed stacking as it is suitable
for heterogeneous ensemble models [44]. Stacking is the process in which
separate models learn in parallel on the dataset, and a small meta-model,
usually a feed-forward neural network (FNN), combines the individual
predictions to produce the final outputs. Stacking introduces a meta-model
that receives the predictions of the base models as its input.
Fig. 4 Hybrid model employed in the proposed approach for EEG-based emotion
recognition.
Fig. 5 Ensemble model employed in the proposed approach for EEG-based emotion
recognition.
The meta-model [48] learns to produce the final prediction from the
base-model outputs. In addition to stacking, we have also investigated the max
function as a statistical method to combine the predictions. Fig. 5 shows the
block diagram of the ensemble model. The meta-model used in the stacking
method consists of 4 fully connected (FC) layers followed by a softmax
classifier [25].
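The sketch below illustrates both combination strategies: a small FC meta-model for stacking (4 FC layers plus softmax, as described; the layer widths are assumptions) and the max-function alternative:

```python
import numpy as np
from tensorflow.keras import layers, models

def build_meta_model(n_base=3, n_classes=3):
    """Stacking meta-model fed with the concatenated class probabilities
    of the CNN, LSTM, and hybrid models."""
    return models.Sequential([
        layers.Input(shape=(n_base * n_classes,)),
        layers.Dense(64, activation="relu"),   # 4 FC layers; widths assumed
        layers.Dense(32, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(8, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

def stack_predict(meta, base_preds):
    """base_preds: list of (n_samples, n_classes) probability arrays."""
    return meta.predict(np.concatenate(base_preds, axis=1)).argmax(axis=1)

def max_combine(base_preds):
    """Max-function alternative: per-class maximum over the base models."""
    return np.stack(base_preds).max(axis=0).argmax(axis=1)
```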
5 Results & discussion
The proposed approach has been evaluated on two datasets, namely SEED and
DEAP. The performance of the various models has been investigated using k-fold
cross-validation [33] with k = 10. The individual performances of the CNN,
LSTM, and hybrid models have been obtained, along with the performance of the
ensemble model. Each model is trained for 60 epochs with a batch size of 64.
The learning rate (LR) is not kept fixed because, with a fixed LR, the loss
saturates and the performance of the model stops improving. To overcome this
limitation, we employ an LR annealer, which makes the learning rate a variable
parameter. It should be noted that we have used the same features and
experimental setup for both datasets.
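A training-loop sketch reflecting this setup is given below (10-fold CV, 60 epochs, batch size 64, LR annealing on loss plateau). `build_cnn` is the earlier CNN sketch; the synthetic data arrays, annealing factor, and patience are placeholders and assumptions:

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import KFold
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Synthetic stand-in data; in practice X holds the DE feature maps and
# y the one-hot emotion labels (shapes follow the SEED-style setup).
X = np.random.randn(50, 62, 265, 5).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, 50), 3)

accs = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True,
                                 random_state=42).split(X):
    model = build_cnn()  # from the CNN sketch above
    annealer = ReduceLROnPlateau(monitor="loss", factor=0.5, patience=3,
                                 min_lr=1e-6)   # lowers LR when loss plateaus
    model.fit(X[train_idx], y[train_idx], epochs=60, batch_size=64,
              callbacks=[annealer], verbose=0)
    accs.append(model.evaluate(X[test_idx], y[test_idx], verbose=0)[1])

print(f"10-fold mean ACC = {np.mean(accs):.4f} +/- {np.std(accs):.4f}")
```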
5.1 Experimental results for SEED data
The performance of the individual models is measured by evaluating parameters
such as weighted average precision (WAP), weighted average sensitivity (WAS),
and weighted average F1 score (WAF1). The F1 score is a good metric to check
the stability of a model. Table 1 tabulates the performance parameters of the
individual models, the hybrid model, and the ensemble model for EEG emotion
recognition.
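These weighted averages correspond to scikit-learn's `average="weighted"` option; a small sketch (with an assumed integer label encoding) is shown below:

```python
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             matthews_corrcoef)

def weighted_metrics(y_true, y_pred):
    """Weighted-average metrics as reported in Table 1; labels are assumed
    integers (0: negative, 1: neutral, 2: positive)."""
    return {
        "MCC": matthews_corrcoef(y_true, y_pred),
        "WAP": precision_score(y_true, y_pred, average="weighted"),
        "WAS": recall_score(y_true, y_pred, average="weighted"),
        "WAF1": f1_score(y_true, y_pred, average="weighted"),
    }

print(weighted_metrics([0, 1, 2, 2, 1, 0], [0, 1, 1, 2, 1, 0]))
```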
The experimental results show that the CNN and LSTM models individually
achieve classification accuracies (ACC) of 89.53% and 89.99%, respectively.
The hybrid model achieves an ACC of 93.46%, while the ensemble model achieves
an ACC of 97.16% with stack-based ensemble learning. The results for SEED data
are tabulated in Table 1, from which it can be noticed that the ensemble-based
method provides improved performance over the other models. We believe this is
because the base models are not weak and provide good accuracy by themselves.
Fig. 7 shows the plots of the loss function and LR with respect to epochs.
With a fixed LR, the loss saturates after some epochs with no significant
further decrease, which results in poor model performance. On the other hand,
when we decrease the LR as the loss saturates, the loss tends to settle more
quickly and the system performance improves.
We have also shown box-and-whisker plots of ACC to shed more light on the
results. The inter-quartile range (IQR) is indicated by the box, with the
orange line showing where the median lies; the box covers all results from the
25th to the 75th percentile. The whiskers, drawn as solid black lines at the
top and bottom of the box, extend to the furthest results within 1.5×IQR. The
outliers, marked by circles, are results that did not fall within the whisker
range.
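For illustration, such a plot can be produced as sketched below; the per-fold accuracies here are synthetic placeholders drawn around the Table 1 means and standard deviations, not the actual experimental values:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
reported = {"CNN": (89.53, 1.78), "LSTM": (89.99, 1.92),
            "Hybrid": (93.46, 1.36), "Ensemble": (97.16, 1.08)}
# Synthetic 10-fold accuracies, for plot illustration only.
fold_accs = {name: rng.normal(mu, sd, 10)
             for name, (mu, sd) in reported.items()}

fig, ax = plt.subplots()
ax.boxplot(list(fold_accs.values()), whis=1.5)  # whiskers at 1.5 x IQR
ax.set_xticks(range(1, len(fold_accs) + 1))
ax.set_xticklabels(fold_accs.keys())
ax.set_ylabel("Accuracy (%)")
plt.show()
```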
We further compare the experimental results of the proposed approach with past
benchmark methodologies for emotion recognition on the SEED dataset. Table 2
tabulates the comparison of the proposed approach with other past benchmark
results.
Table 1 Classification performance of individual and ensemble model for EEG-based emo-
tion recognition on SEED data.
Model MCC WAP(%) WAS(%) WAF1(%) Avg ACC (%) STD (%)
CNN-based 0.819 89 88 88.49 89.53 1.78
LSTM-based 0.821 89 89 89.00 89.99 1.92
Hybrid 0.867 94 93 93.49 93.46 1.36
Ensemble 0.90 97 97 97.00 97.16 1.08
Fig. 6 Box-and-whisker plots of ACC achieved by the proposed (top-left)
CNN-based model, (top-right) LSTM-based model, (bottom-left) hybrid model, and
(bottom-right) ensemble stack model for EEG-based emotion recognition.
It can be observed that the proposed approach outperforms the previous
methodologies. It can also be noticed that the standard deviation (STD) of the
proposed approach is much lower than that of the other approaches tabulated in
Table 2, which reflects the repeatability and reproducibility of the proposed
approach.
5.2 Experimental results for DEAP data
We have also employed the DEAP dataset to evaluate the performance of the
proposed approach for EEG-based emotion analysis with the same features and
experimental setup. The performance of the proposed approach on the DEAP
dataset has been tabulated in Table 3.
Fig. 7 (a) Loss and (b) learning rate versus training epochs.
It can be observed from Table 3 that the ensemble model attains the best
performance compared to the individual models. The CNN-based, LSTM-based, and
hybrid models achieve classification accuracies of 63.50%, 63.89%, and 64.02%,
respectively. The tables demonstrate that the ensemble model achieves better
performance than the individual models. The performance of other existing
works on DEAP data with the same DE feature is compared in Table 4. It can be
observed from Table 4 that the proposed system attains better performance than
the existing methods for EEG-based emotion recognition.
Table 2 Comparison of previous benchmark methodologies for EEG-based emotion recog-
nition on SEED data.
Paper Model Feature ACC (%) STD (%)
He Li et al. [18] DAN DE 83.81 8.56
Wang et al. [49] PGCNN RASM 84.35 10.28
T. Song et al. [41] DGCNN DE 90.40 8.49
Hwang et al. [14] CNN TP-DE (cubic) 90.41 8.71
Wei Liu et al. [26] BDAE DE 91.01 8.91
W. Zheng et al. [53] GELM DE 91.07 7.54
Hao Tang et al. [45] Bimodal-LSTM PSD+DE 93.97 -
J.L. Qiu et al. [32] DCCA - 94.58 6.16
Liu J. et al. [23] CNN+SAE+DNN PCC 96.77 -
Proposed approach Ensemble (stack) DE 97.16 1.08
Table 3 Classification performance of individual and ensemble model for EEG-based emo-
tion recognition on DEAP data.
Classification Methods WAP(%) WAS(%) WAF1(%) Avg ACC (%) STD (%)
CNN 64 74 68 63.50 3.89
LSTM 64.3 74 68 63.89 3.78
Hybrid 64 75 69 64.02 3.84
Ensemble 65 75 70 65.00 3.57
In future work, we plan to extend this study by proposing new features for
effective emotion recognition from EEG signals. The SEED and DEAP datasets
will also be evaluated with the new features to further improve the existing
performance. We also intend to test the proposed model for other EEG-based
neuronal system development.
Table 4 Comparison of previous benchmark methodologies for EEG-based emotion recog-
nition on DEAP data.
Paper Classification Methods Feature Avg ACC (%) STD (%)
Lan et al. [17] Logistic regression DE 48.93 15.50
Li et al. [20] SVM DE 57.00 0.3
Proposed approach Ensemble(stack) DE 65.00 3.57
6 Conclusion
This paper proposes an ensemble learning based EEG emotion recognition system.
First, differential entropy features were extracted from different frequency
bands of EEG signals. Thereafter, these features were fed to the CNN and LSTM
based models. The hybrid model was developed by combining the sub-blocks of
the CNN and LSTM models, and the ensemble model was built on the CNN, LSTM,
and hybrid models. The experimental results suggest that the ensemble model
achieves better classification performance than the other models employed in
the proposed approach. The proposed ensemble model outperforms the compared
methodologies with 97.16% ACC for EEG-based emotion recognition on the SEED
dataset. The proposed method was also evaluated on the DEAP dataset and
obtains 65% ACC using the same features and model parameters. All the models
provided good accuracy individually and showed a much lower standard deviation
than the compared methods.
BCI is an emerging field that is highly reliant on the accurate, repeatable,
and efficient classification of brain waves, frequently recorded by EEG
methods. The experimental results suggest that the proposed approach is
suitable for this purpose and paves the way for upcoming research fields such
as humanoid robots, sophisticated prosthetics, and AI-assisted healthcare and
recovery. In the future, a hardware implementation of the proposed model can
be developed.
References
1. Agarap, A.F.: Deep learning using rectified linear units (relu) (2019)
2. Anastassiou, G.A.: Multivariate hyperbolic tangent neural network approximation.
Computers & Mathematics with Applications 61(4), 809–821 (2011)
3. Bos, D.O., et al.: EEG-based emotion recognition: The influence of visual and auditory
stimuli 56(3), 1–17 (2006)
4. Chen, Z., Cao, F., Hu, J.: Approximation by network operators with logistic activation
functions. Applied Mathematics and Computation 256, 565–571 (2015)
5. Cheng, B., Liu, G.: Emotion recognition from surface emg signal using wavelet transform
and neural network. In: Proceedings of the 2nd international conference on bioinfor-
matics and biomedical engineering (ICBBE), pp. 1363–1366 (2008)
6. Cui, H., Liu, A., Zhang, X., Chen, X., Wang, K., Chen, X.: Eeg-based emotion recogni-
tion using an end-to-end regional-asymmetric convolutional neural network. Knowledge-
Based Systems 205, 106243 (2020)
7. Dash, M., Liu, H.: Feature selection for classification. Intelligent data analysis 1(1-4),
131–156 (1997)
8. Duan, R.N., Zhu, J.Y., Lu, B.L.: Differential entropy feature for EEG-based emotion
classification. In: 6th International IEEE/EMBS Conference on Neural Engineering
(NER), pp. 81–84. IEEE (2013)
9. Glowinski, D., Camurri, A., Volpe, G., Dael, N., Scherer, K.: Technique for automatic
emotion recognition by body gesture analysis. In: 2008 IEEE Computer Society Con-
ference on Computer Vision and Pattern Recognition Workshops, pp. 1–6. IEEE (2008)
10. Greff, K., Srivastava, R.K., Koutnik, J., Steunebrink, B.R., Schmidhuber, J.: LSTM: A
search space odyssey. IEEE Transactions on Neural Networks and Learning Systems
28(10), 2222–2232 (2017)
11. Grozea, C., Voinescu, C.D., Fazli, S.: Bristle-sensors: low-cost flexible passive dry EEG
electrodes for neurofeedback and BCI applications. Journal of Neural Engineering 8(2),
025008 (2011)
12. Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X.,
Wang, G., Cai, J., Chen, T.: Recent advances in convolutional neural networks. Pattern
Recognition 77, 354–377 (2018)
13. Huang, Y.J., Wu, C.Y., Wong, A.M.K., Lin, B.S.: Novel active comb-shaped dry elec-
trode for eeg measurement in hairy site. IEEE Transactions on Biomedical Engineering
62(1), 256–263 (2014)
14. Hwang, S., Hong, K., Son, G., Byun, H.: Learning cnn features from de features for
eeg-based emotion recognition. Pattern Analysis and Applications 23(3), 1323–1335
(2020)
15. Kurbalija, V., Ivanović, M., Radovanović, M., Geler, Z., Dai, W., Zhao, W.: Emotion
perception and recognition: An exploration of cultural differences and similarities. Cog-
nitive Systems Research 52, 103–116 (2018)
16. van Laarhoven, T.: L2 regularization versus batch and weight normalization (2017)
17. Lan, Z., Sourina, O., Wang, L., Scherer, R., Müller-Putz, G.R.: Domain adaptation tech-
niques for eeg-based emotion recognition: A comparative study on two public datasets.
IEEE Transactions on Cognitive and Developmental Systems 11(1), 85–94 (2019)
18. Li, H., Jin, Y.M., Zheng, W.L., Lu, B.L.: Cross-subject emotion recognition using deep
adaptation networks. In: Neural Information Processing, pp. 403–413. Springer Inter-
national Publishing (2018)
19. Li, M., Lu, B.L.: Emotion classification based on gamma-band eeg. In: 2009 Annual
International Conference of the IEEE Engineering in medicine and biology society, pp.
1223–1226. IEEE (2009)
20. Li, P., Liu, H., Si, Y., Li, C., Li, F., Zhu, X., Huang, X., Zeng, Y., Yao, D., Zhang, Y.,
Xu, P.: Eeg based emotion recognition by combining functional connectivity network
and local activations. IEEE Transactions on Biomedical Engineering 66(10), 2869–2881
(2019)
21. Li, X., Song, D., Zhang, P., Zhang, Y., Hou, Y., Hu, B.: Exploring eeg features in
cross-subject emotion recognition. Frontiers in neuroscience 12, 162 (2018)
22. Li, Y., Zheng, W., Zong, Y., Cui, Z., Zhang, T., Zhou, X.: A bi-hemisphere domain
adversarial neural network model for eeg emotion recognition. IEEE Transactions on
Affective Computing (2018)
23. Liu, J., Wu, G., Luo, Y., Qiu, S., Yang, S., Li, W., Bi, Y.: Eeg-based emotion clas-
sification using a deep neural network and sparse autoencoder. Frontiers in Systems
Neuroscience 14, 43 (2020)
24. Liu, N.H., Chiang, C.Y., Hsu, H.M.: Improving driver alertness through music selection
using a mobile eeg to detect brainwaves. Sensors 13(7), 8199–8221 (2013)
25. Liu, W., Wen, Y., Yu, Z., Yang, M.: Large-margin softmax loss for convolutional neural
networks (2017)
26. Liu, W., Zheng, W.L., Lu, B.L.: Multimodal emotion recognition using multimodal deep
learning (2016)
27. Sun, M., Song, Z., Jiang, X., Pan, J., Pang, Y.: Learning pooling for convolutional
neural network. Neurocomputing 224, 96–104 (2017)
28. Mariooryad, S., Busso, C.: Compensating for speaker or lexical variabilities in speech
for emotion recognition. Speech Communication 57, 1–12 (2014)
29. Mathersul, D., Williams, L.M., Hopkinson, P.J., Kemp, A.H.: Investigating models of
affect: relationships among eeg alpha asymmetry, depression, and anxiety. Emotion
8(4), 560 (2008)
30. Park, S., Kwak, N.: Analysis on the dropout effect in convolutional neural networks.
pp. 189–204 (2017)
31. Piana, S., Staglianò, A., Odone, F., Camurri, A.: Adaptive body gesture representation
for automatic emotion recognition. ACM Transactions on Interactive Intelligent Systems
(TiiS) 6(1), 1–31 (2016)
32. Qiu, J.L., Liu, W., Lu, B.L.: Multi-view emotion recognition using deep canonical corre-
lation analysis. In: Neural Information Processing, pp. 221–231. Springer International
Publishing (2018)
33. Rodriguez, J.D., Perez, A., Lozano, J.A.: Sensitivity analysis of k-fold cross validation
in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine
Intelligence 32(3), 569–575 (2010)
34. Sammler, D., Grigutsch, M., Fritz, T., Koelsch, S.: Music and emotion: electrophysio-
logical correlates of the processing of pleasant and unpleasant music. Psychophysiology
44(2), 293–304 (2007)
35. Sarkar, P., Etemad, A.: Self-supervised ecg representation learning for emotion recog-
nition. IEEE Transactions on Affective Computing (2020)
36. Sauvet, F., Bougard, C., Coroenne, M., Lely, L., Van Beers, P., Elbaz, M., Guillard,
M., Leger, D., Chennaoui, M.: In-flight automatic detection of vigilance states using a
single eeg channel. IEEE Transactions on Biomedical Engineering 61(12), 2840–2847
(2014)
37. Sharma, R., Sahu, S.S., Upadhyay, A., Sharma, R.R., Sahoo, A.K.: Sleep stage clas-
sification using DWT and dispersion entropy applied on EEG signals. In: Computer-
aided Design and Diagnosis Methods for Biomedical Applications, pp. 35–56. CRC Press
(2021)
38. Sharma, R.R., Pachori, R.B.: Time-frequency representation using ievdhm-ht with ap-
plication to classification of epileptic eeg signals. IET Science, Measurement & Tech-
nology 12(1), 72–82 (2017)
39. Sharma, S., Sharma, R.R.: Variational mode decomposition based finger flexion move-
ment detection using ECoG signals. In: Artificial Intelligence-based Brain-Computer
Interface, pp. 101–119. Elsevier (2022)
40. Shu, L., Xie, J., Yang, M., Li, Z., Li, Z., Liao, D., Xu, X., Yang, X.: A review of emotion
recognition using physiological signals. Sensors 18(7), 2074 (2018)
41. Song, T., Zheng, W., Song, P., Cui, Z.: Eeg emotion recognition using dynamical graph
convolutional neural networks. IEEE Transactions on Affective Computing 11(3), 532–
541 (2020)
42. Subasi, A., Gursoy, M.I.: Eeg signal classification using pca, ica, lda and support vector
machines. Expert systems with applications 37(12), 8659–8666 (2010)
43. Tabar, Y.R., Halici, U.: A novel deep learning approach for classification of EEG motor
imagery signals. Journal of Neural Engineering 14(1), 016003 (2016)
44. Tahir, M.A., Kittler, J., Bouridane, A.: Multilabel classification using heterogeneous
ensemble of multi-label classifiers. Pattern Recognition Letters 33(5), 513–523 (2012)
45. Tang, H., Liu, W., Zheng, W.L., Lu, B.L.: Multimodal emotion recognition using deep
neural networks. In: Neural Information Processing, pp. 811–819. Springer International
Publishing (2017)
46. Teuwen, J., Moriakov, N.: Chapter 20 - convolutional neural networks. In: Handbook
of Medical Image Computing and Computer Assisted Intervention, The Elsevier and
MICCAI Society Book Series, pp. 481–501. Academic Press (2020)
47. Tripathi, S., Acharya, S., Sharma, R.D., Mittal, S., Bhattacharya, S.: Using deep and
convolutional neural networks for accurate emotion classification on deap dataset. In:
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pp. 4746–4752.
AAAI Press (2017)
48. Wang, Y.X., Ramanan, D., Hebert, M.: Meta-learning to detect rare objects. In: Pro-
ceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
49. Wang, Z., Tong, Y., Heng, X.: Phase-locking value based graph convolutional neural
networks for emotion recognition. IEEE Access 7, 93711–93722 (2019)
50. Webb, G.I., Zheng, Z.: Multistrategy ensemble learning: reducing error by combining
ensemble learning techniques. IEEE Transactions on Knowledge and Data Engineering
16(8), 980–991 (2004)
51. Young, A.W., Rowland, D., Calder, A.J., Etcoff, N.L., Seth, A., Perrett, D.I.: Facial
expression megamix: Tests of dimensional and category accounts of emotion recognition.
Cognition 63(3), 271–313 (1997)
52. Zhang, D., Yao, L., Zhang, X., Wang, S., Chen, W., Boots, R., Benatallah, B.: Cascade
and parallel convolutional recurrent neural networks on eeg-based intention recognition
for brain computer interface. In: Proceedings of the Thirty-Second AAAI Conference
on Artificial Intelligence, pp. 1703–1710. AAAI Press (2018)
53. Zheng, W., Zhu, J., Peng, Y., Lu, B.: EEG-based emotion classification using deep belief
networks. In: 2014 IEEE International Conference on Multimedia and Expo (ICME),
pp. 1–6 (2014)
54. Zheng, W.L., Lu, B.L.: Investigating critical frequency bands and channels for EEG-
based emotion recognition with deep neural networks. IEEE Transactions on Au-
tonomous Mental Development 7(3), 162–175 (2015)
55. Zheng, W.L., Zhu, J.Y., Lu, B.L.: Identifying stable patterns over time for emotion
recognition from eeg. IEEE Transactions on Affective Computing 10(3), 417–429 (2017)
56. Zhong, P., Wang, D., Miao, C.: Eeg-based emotion recognition using regularized graph
neural networks. IEEE Transactions on Affective Computing (2020)
... It has also been found that analyzing functional connectivity is essential for the advancement of emotion recognition. Based on this, graph neural networks (GNNs) [17] and convolutional neural networks (CNNs) [18] have been proposed to extract spatial embedding of DE features among different EEG channels. Furthermore, long short-term memory (LSTM) [19] and attention mechanisms have been utilized to learn emotion-related EEG representations [20]. ...
... where 2 is the variance of the signal. Differential entropy features of each segment were extracted separately in the (0.1-4 Hz), (4-8 Hz), (8)(9)(10)(11)(12)(13), (13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31), and (31-50 Hz) frequency bands. In one experiment of a subject, the DE features trained from continuous samples across time are concatenated and smoothed with a linear dynamic system (LDS) model [12]. ...
Preprint
Recent advances in non-invasive EEG technology have broadened its application in emotion recognition, yielding a multitude of related datasets. Yet, deep learning models struggle to generalize across these datasets due to variations in acquisition equipment and emotional stimulus materials. To address the pressing need for a universal model that fluidly accommodates diverse EEG dataset formats and bridges the gap between laboratory and real-world data, we introduce a novel deep learning framework: the Contrastive Learning based Diagonal Transformer Autoencoder (CLDTA), tailored for EEG-based emotion recognition. The CLDTA employs a diagonal masking strategy within its encoder to extracts full-channel EEG data's brain network knowledge, facilitating transferability to the datasets with fewer channels. And an information separation mechanism improves model interpretability by enabling straightforward visualization of brain networks. The CLDTA framework employs contrastive learning to distill subject-independent emotional representations and uses a calibration prediction process to enable rapid adaptation of the model to new subjects with minimal samples, achieving accurate emotion recognition. Our analysis across the SEED, SEED-IV, SEED-V, and DEAP datasets highlights CLDTA's consistent performance and proficiency in detecting both task-specific and general features of EEG signals related to emotions, underscoring its potential to revolutionize emotion recognition research.
... Although recent advancements have introduced deep learn-ing methods for emotion recognition using EEG, these studies have often overlooked the complex nonlinear dynamics inherent in EEG data. Most existing approaches rely on linear input representation techniques, which may not fully capture the intricacies of the nonlinear characteristics of EEG signals [17]- [21]. Thus, exploration of effective input formulation strategies for EEG-based emotion recognition remains a pressing need, as underscored by Prabowo et al. [4]. ...
... Initial efforts focused on the inherent time-series nature of EEG data. Iyer et al. [17] explored this approach by segmenting EEG signals into frequency bands, computing differential entropy, and employing a combination of CNN and Long Short-Term Memory (LSTM) networks for emotion detection. This method highlighted the significance of frequency-specific brain activities in emotion detection. ...
Article
Full-text available
Time-series classification (TSC) has been widely utilized across various domains, including brain-computer interfaces (BCI) for emotion recognition through electroencephalogram (EEG) signals. However, traditional methods often struggle to capture the complex emotional patterns present in EEG data. Recent advancements in encoding techniques have provided promising avenues for improving emotion recognition. This study introduces asymmetric windowing recurrence plots (AWRP) as a novel encoding technique to efficiently encapsulate the dynamic characteristics of EEG signals into texture-rich image representations. This study systematically compares the impact of conventional thresholded and unthresholded recurrence plots (RP) versus the proposed AWRP in emotion recognition tasks. Empirical validations conducted across benchmark datasets, such as DEAP and SEED, demonstrate that the AWRP method achieves classification accuracies of 99.84% and 99.69%, respectively, outperforming existing state- of-the-art methodologies. This study emphasizes the significance of input formulation, highlighting that richer input textures, as provided by AWRP, significantly enhance emotion recognition performance while ensuring computational memory usage efficiency. These findings have significant implications in the domain of EEG-based emotion recognition and offer a novel perspective that can guide future research.
... The proposed study concluded that EEG should be considered an essential category in emotion identification. Abhishek et al. [70] used the differential entropy feature of five different bands from the EEG signals and fed into the ensemble of CNN and LSTM. The authors concluded that EEG is a potent and popular tool for recognizing distinct emotions, and their changes are readily and easily observable. ...
... Iyer et al. [70] developed an Ensemble-hybrid model for emotion recognition using CNN, LSTM, a hybrid of CNN-LSTM, and then a Stacking ensemble. Out of all, stacking achieves the best classification results. ...
Article
Full-text available
Automated Emotion Recognition Systems (ERS) with physiological signals help improve health and decision-making in everyday life. It uses traditional Machine Learning (ML) methods, requiring high-quality learning models for physiological data (sensitive information). However, automated ERS enables data attacks and leaks, significantly losing user privacy and integrity. This privacy problem can be solved using a novel Federated Learning (FL) approach, which enables distributed machine learning model training. This review examines 192 papers focusing on emotion recognition via physiological signals and FL. It is the first review article concerning the privacy of sensitive physiological data for an ERS. The paper reviews the different emotions, benchmark datasets, machine learning, and federated learning approaches for classifying emotions. It proposes a novel multi-modal Federated Learning for Physiological signals based on Emotion Recognition Systems (Fed-PhyERS) architecture, experimenting with the AMIGOS dataset and its applications for a next-generation automated ERS. Based on critical analysis, this paper provides the key takeaways, identifies the limitations, and proposes future research directions to address gaps in previous studies. Moreover, it reviews ethical considerations related to implementing the proposed architecture. This review paper aims to provide readers with a comprehensive insight into the current trends, architectures, and techniques utilized within the field.
... In [1], EEG source localization is combined with a graph neural network model in order to investigate subject (in)dependent classification performance. A hybrid CNN and long short-term memory network (LSTM) classifier is performed on SEED data, the accuracy of 98% and 97.16% are obtained [3,14]. Deep neural network (DNN), CNN, LSTM, and CNN-LSTM architectures are also compared in [45]. ...
Article
Full-text available
Multi-channel Electroencephalogram (EEG) based emotion recognition is focused on several analysis of frequency bands of the acquired signals. In this paper, spectral properties appeared on five EEG bands (\(\delta \), \(\theta \), \(\alpha \), \(\beta \), \(\gamma \)) and gated transformer network (GTN) based emotion recognition using EEG signal are proposed. Spectral energies and differential entropies of 62-channel signals are converted to 3D (sequence-channel-trial) form to feed the GTN. The GTN with enhanced gated two tower based transformer architecture is fed by 3D sequences extracted from SEED and SEED-IV emotional datasets. 15 participants’ states in session 1–3 are evaluated using the proposed GTN based sequence classification, and the results are repeated by \(3\small \times \) shuffling. Totally, 135 times training and testing are performed on each dataset, and the results are presented. The proposed GTN model achieves mean accuracy rates of 98.82% on the SEED dataset and 96.77% on the SEED-IV dataset for three and four emotional state recognition tasks, respectively. The proposed emotion recognition model can be employed as a promising approach for EEG emotion recognition.
... However, in tasks of emotion recognition using EEG, although numerous existing approaches have shown empirical success in identifying emotions from EEG signals [15], [16], there is still room for improvement. There are shortcomings in the research on the fusion of temporal, spatial, and frequency domain. ...
Article
Full-text available
In emotion recognition tasks, electroencephalography (EEG) has gained significant favor among researchers as a powerful biological signal tool. However, existing studies often fail to fully utilize the high temporal resolution provided by EEG when combining spatiotemporal and frequency features for emotion recognition, and do not meet the needs of effective feature fusion. Therefore, this paper proposes a multilevel multidomain feature fusion network model called MMF-Net, aiming to obtain a more comprehensive representation of spatiotemporal-frequency features and achieve higher accuracy in emotion classification. The model takes the original EEG two-dimensional feature map as input, simultaneously extracting spatiotemporal and spatial-frequency domain features at different levels to effectively utilize temporal resolution. Next, at each level, a specially designed fusion network layer is employed to combine the captured temporal, spatial, and frequency domain features. In addition, the fusion network layer contributes positively to the convergence of the model and the enhancement of feature detectors. In subject-dependent experiments, MMF-Net achieved average accuracy rates of 99.50% and 99.59% for valence and arousal dimensions on the DEAP dataset, respectively. In subject-independent experiments, the average accuracy rates for these two dimensions reached 97.46% and 97.54%, respectively.
... This part can be successfully evaluated by measuring the amount of playing and the length of click learning. However, for a student participating in learning, their attention is not sufficient and cannot be reflected in more detail [9,10]. In our study, we analyzed data on students' comments on the current curriculum based on their individual comments. ...
Article
Full-text available
Online education review data have strong statistical and predictive power but lack efficient and accurate analysis methods. In this paper, we propose a multi-modal emotion analysis method to analyze the online education of college students based on educational data. Specifically, we design a multi-modal emotion analysis method that combines text and emoji data, using pre-training emotional prompt learning to enhance the sentiment polarity. We also analyze whether this fusion model reflects the true emotional polarity. The conducted experiments show that our multi-modal emotion analysis method achieves good performance on several datasets, and multi-modal emotional prompt methods can more accurately reflect emotional expressions in online education data.
... Another study proposed a two-layer LSTM and four-layer improved NN deep learning algorithms to improve the performance in EEG classification [17]. These advancements in AI provide robust and adaptable methods for EEG data analysis, overcoming the challenges posed by traditional methods [16,18]. ...
Preprint
Full-text available
Recent advancements in cognitive neuroscience, particularly in Electroencephalogram (EEG) signal processing, image generation, and brain-computer interfaces (BCI), have opened up new avenues for research. This study introduces a novel framework, Bridging Artificial Intelligence and Neurological Signals (BRAINS), which leverages the power of AI to extract meaningful information from EEG signals and generate images. The BRAINS framework addresses the limitations of traditional EEG analysis techniques, which struggle with nonstationary signals, spectral estimation, and noise sensitivity. Instead, BRAINS employs Long Short-Term Memory (LSTM) networks and contrastive learning, which effectively handle time-series EEG data and recognize intrinsic connections and patterns. The study utilizes the MNIST dataset of handwritten digits as stimuli in EEG experiments, allowing for diverse yet controlled stimuli. The data collected is then processed through an LSTM-based network employing contrastive learning and extracting complex features from EEG data. These features are fed into an image generator model, producing images as close to the original stimuli as possible. This study demonstrates the potential of integrating AI and EEG technology, offering promising implications for the future of brain-computer interfaces.
Article
Electroencephalogram (EEG) signals used for emotion classification are vital in the Human–Computer Interface (HCI), which has gained a lot of focus. However, the irregular and non-stationary characteristics of the EEG signals manifest barriers and limit state-of-the-art techniques from accurately assessing different emotions from the EEG data, leading to minimal emotion recognition performance. Moreover, cross-subject emotion recognition (CSER) has always been challenging due to the weak generality of features from EEG signals among subjects. Thus, this study employed a novel algorithm, Complete Ensemble Empirical Mode Decomposition with adaptive noise (CEEMDAN), which decomposes EEG into intrinsic mode functions (IMFs) to comprehend the associated EEG’s stochastic characteristics. Further IMFs are characterized by Normal Inverse Gaussian (NIG) probability density function (PDF) parameters. These NIG features are fed into an optimized Extreme Gradient Boosting (XGboost) classifier developed using a cross-validation technique. The uniqueness of this research is in the use of NIG modeling of CEEMDAN domain IMFs to extract specific emotions from EEG signals. Qualitative, visual, and statistical assessments are used to illustrate the importance of the NIG parameters. Extensive experiments are carried out with the online available data sources SJTU Emotion EEG Dataset (SEED), SEED-IV, and Database for Emotion Analysis of Physiological Signals (DEAP) to evaluate the potency of the proposed approach. The suggested system for recognizing emotions performed better than cutting-edge techniques, attaining the highest accuracy of 98.9%, 97.8%, and 96.7% with the tenfold cross-validation (CV) protocol and 96.84%, 95.38%, and 91.39% for cross-subject validation (CSV) approach using SEED, SEED-IV, and DEAP databases, respectively.
Article
Emotion recognition utilizing EEG signals has emerged as a pivotal component of human–computer interaction. In recent years, with the relentless advancement of deep learning techniques, using deep learning for analyzing EEG signals has assumed a prominent role in emotion recognition. Applying deep learning in the context of EEG-based emotion recognition carries profound practical implications. Although many model approaches and some review articles have scrutinized this domain, they have yet to undergo a comprehensive and precise classification and summarization process. The existing classifications are somewhat coarse, with insufficient attention given to the potential applications within this domain. Therefore, this article systematically classifies recent developments in EEG-based emotion recognition, providing researchers with a lucid understanding of this field’s various trajectories and methodologies. Additionally, it elucidates why distinct directions necessitate distinct modeling approaches. In conclusion, this article synthesizes and dissects the practical significance of EEG signals in emotion recognition, emphasizing its promising avenues for future application.
Article
Full-text available
Emotion classification based on brain–computer interface (BCI) systems is an appealing research topic. Recently, deep learning has been employed for the emotion classifications of BCI systems and compared to traditional classification methods improved results have been obtained. In this paper, a novel deep neural network is proposed for emotion classification using EEG systems, which combines the Convolutional Neural Network (CNN), Sparse Autoencoder (SAE), and Deep Neural Network (DNN) together. In the proposed network, the features extracted by the CNN are first sent to SAE for encoding and decoding. Then the data with reduced redundancy are used as the input features of a DNN for classification task. The public datasets of DEAP and SEED are used for testing. Experimental results show that the proposed network is more effective than conventional CNN methods on the emotion recognitions. For the DEAP dataset, the highest recognition accuracies of 89.49% and 92.86% are achieved for valence and arousal, respectively. For the SEED dataset, however, the best recognition accuracy reaches 96.77%. By combining the CNN, SAE, and DNN and training them separately, the proposed network is shown as an efficient method with a faster convergence than the conventional CNN.
Article
Full-text available
Electroencephalography (EEG) measures the neuronal activities in different brain regions via electrodes. Many existing studies on EEG-based emotion recognition do not fully exploit the topology of EEG channels. In this paper, we propose a regularized graph neural network (RGNN) for EEG-based emotion recognition. RGNN considers the biological topology among different brain regions to capture both local and global relations among different EEG channels. Specifically, we model the inter-channel relations in EEG signals via an adjacency matrix in a graph neural network where the connection and sparseness of the adjacency matrix are inspired by neuroscience theories of human brain organization. In addition, we propose two regularizers, namely node-wise domain adversarial training (NodeDAT) and emotion-aware distribution learning (EmotionDL), to better handle cross-subject EEG variations and noisy labels, respectively. Extensive experiments on two public datasets, SEED and SEED-IV, demonstrate the superior performance of our model than state-of-the-art models in most experimental settings. Moreover, ablation studies show that the proposed adjacency matrix and two regularizers contribute consistent and significant gain to the performance of our RGNN model. Finally, investigations on the neuronal activities reveal important brain regions and inter-channel relations for EEG-based emotion recognition.
Article
Emotion recognition is an important field of research in brain-computer interactions. As technology and the understanding of emotions advance, there are growing opportunities for automatic emotion recognition systems. Neural networks are a family of statistical learning models inspired by biological neural networks and are used to estimate functions that can depend on a large number of generally unknown inputs. In this paper, we leverage this effectiveness of neural networks to classify user emotions using EEG signals from the DEAP dataset (Koelstra et al., 2012), which represents the benchmark for emotion classification research. We explore two different neural models, a simple deep neural network and a convolutional neural network, for classification. Our model provides state-of-the-art classification accuracy, obtaining 4.51 and 4.96 percentage-point improvements over Rozgic et al. (2013) for classification of valence and arousal into two classes (high and low), and 13.39 and 6.58 percentage-point improvements over Chung and Yoon (2012) for classification of valence and arousal into three classes (high, normal, and low). Moreover, our research is a testament that neural networks can be robust classifiers for brain signals, even outperforming traditional learning techniques.
Article
Brain-Computer Interface (BCI) is a system empowering humans to communicate with or control the outside world using brain intentions alone. Electroencephalography (EEG) based BCIs are promising solutions due to their convenient and portable instruments. Despite the extensive research on EEG in recent years, it is still challenging to interpret EEG signals effectively due to the massive noise in EEG signals (e.g., low signal-to-noise ratio and incomplete EEG signals) and the difficulty of capturing the inconspicuous relationships between EEG signals and certain brain activities. Most existing works either treat EEG as chain-like sequences, neglecting complex dependencies between adjacent signals, or require pre-processing such as transforming EEG waves into images. In this paper, we introduce both cascade and parallel convolutional recurrent neural network models for precisely identifying human intended movements and instructions, effectively learning the compositional spatio-temporal representations of raw EEG streams. Extensive experiments on a large-scale movement-intention EEG dataset (108 subjects, 3,145,160 EEG records) have demonstrated that both models achieve high accuracy near 98.3% and outperform a set of baseline methods and most recent deep learning based EEG recognition models, yielding a significant accuracy increase of 18% in the cross-subject validation scenario. The developed models are further evaluated with a real-world BCI and achieve a recognition accuracy of 93% over five instruction intentions. This suggests the proposed models are able to generalize over different kinds of intentions and BCI systems.
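The cascade variant of such a convolutional recurrent model can be sketched as a frame-wise CNN encoder followed by an LSTM over the frame sequence. All shapes and layer sizes below are assumptions for illustration, not the paper's configuration.

```python
# Minimal sketch of a cascade convolutional-recurrent model: a CNN encodes
# each short EEG frame, an LSTM models the sequence of frame encodings.
import torch
import torch.nn as nn

class CascadeConvRNN(nn.Module):
    def __init__(self, n_channels=64, n_classes=5):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))           # one 32-d vector per frame
        self.rnn = nn.LSTM(32, 64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                      # x: (batch, frames, channels, time)
        b, f, c, t = x.shape
        z = self.cnn(x.reshape(b * f, c, t)).squeeze(-1)   # (b*f, 32)
        out, _ = self.rnn(z.reshape(b, f, 32))
        return self.head(out[:, -1])           # classify from last hidden state

model = CascadeConvRNN()
logits = model(torch.randn(4, 10, 64, 100))    # 4 trials, 10 frames each
```

The parallel variant described in the paper would instead run the convolutional and recurrent branches side by side and merge their representations before classification.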
Chapter
Predicting finger flexion movements is a challenging problem in brain-computer interfacing. This chapter focuses on decoding finger flexion movements using electrocorticogram (ECoG) signals. The variational mode decomposition (VMD) is applied to obtain the sub-components of each channel's ECoG recording. Various correlation-based and other parameters, such as correntropy, cross information potential, and Kozachenko-Leonenko entropy estimation, are evaluated to categorize the flexion movement. Computing these parameters over multiple channels is a complicated process; this complication is reduced by applying correlation-based thresholding across all channels to select the significant channels. All the computed features are given to a cubic support vector machine (C-SVM) classifier. The complete model is investigated on the BCI Competition IV (2008) dataset, which holds brain signals of three subjects performing different finger flexion movements. On this dataset, we achieved a correlation of 0.43 and 50.1% accuracy for five-finger flexion classification, and for index-finger versus little-finger flexion classification, 80.1% accuracy with 83% sensitivity.
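The decode-then-classify stage of such a pipeline can be sketched with scikit-learn. The VMD decomposition is assumed to happen upstream (e.g., via a package such as vmdpy), and the per-channel features below are simple stand-ins for the paper's correntropy/entropy parameters; the data are random placeholders.

```python
# Hedged sketch: simple per-channel features fed to a cubic-kernel SVM (C-SVM).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 120, 8, 1000
ecog = rng.standard_normal((n_trials, n_channels, n_samples))  # placeholder ECoG
labels = rng.integers(0, 5, size=n_trials)                     # five finger classes

def channel_features(trial):
    """Per-channel variance and lag-1 autocorrelation as toy stand-ins
    for the chapter's correntropy / entropy parameters."""
    feats = []
    for ch in trial:
        feats.append(ch.var())
        feats.append(np.corrcoef(ch[:-1], ch[1:])[0, 1])
    return np.array(feats)

X = np.stack([channel_features(t) for t in ecog])
clf = SVC(kernel="poly", degree=3)             # cubic SVM, as in "C-SVM"
print(cross_val_score(clf, X, labels, cv=5).mean())
```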
Chapter
In this chapter, the electroencephalogram (EEG) signals obtained from polysomnography (PSG) recordings are analyzed using the discrete wavelet transform (DWT) and dispersion entropy. The PSG recordings are taken from the PhysioNet Sleep European Data Format (EDF) Database. We investigate the performance of dispersion entropy (DEn) and one of its variants, fluctuation-based dispersion entropy (FDEn), computed from the wavelet subbands of the sleep EEG recordings. A random forest is employed for classification, using the computed entropies as features. The performance of the algorithm is further compared using DEn and FDEn separately as well as in combination. Measures such as sensitivity, specificity, and accuracy are used to compare the performance on the multi-class sleep stage classification problem. The results show the suitability of DEn and FDEn for automatic scoring of sleep stages. FDEn is found to be more suitable for classification than DEn; however, using both DEn and FDEn together as features results in even better sleep stage classification accuracy.
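A compact version of this DWT-plus-entropy pipeline is sketched below: each epoch is decomposed into wavelet subbands with PyWavelets and a dispersion entropy is computed per subband as a feature for a random forest. The DEn implementation here is an illustrative, simplified version (normal-CDF mapping, pattern counting), not the chapter's exact code, and the data are random placeholders.

```python
# Hedged sketch: DWT subbands -> dispersion entropy features -> random forest.
import numpy as np
import pywt
from scipy.stats import norm
from sklearn.ensemble import RandomForestClassifier

def dispersion_entropy(x, m=2, c=6):
    """Map samples to c classes via the normal CDF, count length-m embedding
    patterns, and return the normalized Shannon entropy of the patterns."""
    z = norm.cdf((x - x.mean()) / (x.std() + 1e-12))
    classes = np.clip(np.round(c * z + 0.5).astype(int), 1, c)
    patterns = [tuple(classes[i:i + m]) for i in range(len(classes) - m + 1)]
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log(p)).sum() / np.log(c ** m)

def epoch_features(epoch, wavelet="db4", level=4):
    coeffs = pywt.wavedec(epoch, wavelet, level=level)  # subband coefficients
    return np.array([dispersion_entropy(c) for c in coeffs])

rng = np.random.default_rng(1)
epochs = rng.standard_normal((200, 3000))               # toy 30-s EEG epochs
stages = rng.integers(0, 5, size=200)                   # toy sleep-stage labels
X = np.stack([epoch_features(e) for e in epochs])
clf = RandomForestClassifier(n_estimators=200).fit(X, stages)
```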
Article
We exploit a self-supervised deep multi-task learning framework for electrocardiogram (ECG)-based emotion recognition. The proposed solution consists of two learning stages: (a) learning ECG representations and (b) learning to classify emotions. ECG representations are learned by a signal transformation recognition network, which learns high-level abstract representations from unlabeled ECG data. Six different signal transformations are applied to the ECG signals, and transformation recognition is performed as a pretext task. Training the model on pretext tasks helps the network learn spatiotemporal representations that generalize well across different datasets and different emotion categories. We transfer the weights of the self-supervised network to an emotion recognition network, where the convolutional layers are kept frozen and the dense layers are trained with labeled ECG data. We show that the proposed solution considerably improves the performance compared to a network trained using fully supervised learning. New state-of-the-art results are set in the classification of arousal, valence, affective states, and stress for the four utilized datasets. Extensive experiments are performed, providing interesting insights into the impact of using a multi-task self-supervised structure instead of a single-task model, as well as the optimum level of difficulty required for the pretext self-supervised tasks.
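The pretext-task setup can be illustrated as follows: apply one of several signal transformations to unlabeled ECG segments and generate labels identifying which transformation was applied. The particular transformation set below is an assumption mirroring the general idea, not necessarily the paper's six transformations.

```python
# Illustrative pretext-task data generation for self-supervised ECG learning.
import numpy as np

def transform(sig, kind, rng):
    if kind == 0: return sig                                    # original
    if kind == 1: return sig + 0.05 * rng.standard_normal(sig.shape)  # noise
    if kind == 2: return 1.5 * sig                              # scaling
    if kind == 3: return -sig                                   # negation
    if kind == 4: return sig[::-1].copy()                       # time reversal
    pieces = np.array_split(sig, 4)                             # permutation
    rng.shuffle(pieces)
    return np.concatenate(pieces)

rng = np.random.default_rng(2)
ecg = rng.standard_normal((256, 1000))            # unlabeled ECG segments
kinds = rng.integers(0, 6, size=len(ecg))         # pretext labels (free!)
pretext_x = np.stack([transform(s, k, rng) for s, k in zip(ecg, kinds)])
# A CNN is then trained to predict `kinds`; its convolutional layers are
# frozen and reused, with new dense layers trained on labeled emotion data.
```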
Article
Emotion recognition based on electroencephalography (EEG) is of great importance in the field of Human-Computer Interaction (HCI) and has received extensive attention in recent years. Most traditional methods focus on extracting features in the time and frequency domains; the spatial information from adjacent channels and symmetric channels is often ignored. To better learn spatial representations, in this paper, we propose an end-to-end Regional-Asymmetric Convolutional Neural Network (RACNN) for emotion recognition, which consists of temporal, regional, and asymmetric feature extractors. Specifically, continuous 1D convolution layers are employed in the temporal feature extractor to learn time-frequency representations. The regional feature extractor then consists of two 2D convolution layers to capture regional information among physically adjacent channels. Meanwhile, we propose an Asymmetric Differential Layer (ADL) in the asymmetric feature extractor by taking the asymmetry property of emotion responses into account, which can capture the discriminative information between the left and right hemispheres of the brain. To evaluate our model, we conduct extensive experiments on two publicly available datasets, i.e., DEAP and DREAMER. The proposed model obtains recognition accuracies over 95% for valence and arousal classification tasks on both datasets, significantly outperforming state-of-the-art methods.
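The asymmetric differential operation at the heart of the ADL reduces to subtracting the features of right-hemisphere channels from their mirrored left-hemisphere counterparts. The sketch below shows that single step; the channel pairing is a made-up example, not the RACNN electrode mapping.

```python
# Hedged sketch of an asymmetric differential operation over hemisphere pairs.
import torch

pairs = [(0, 1), (2, 3), (4, 5)]             # assumed (left, right) mirrored channels
feat = torch.randn(8, 6, 32)                 # (batch, channels, feature dim)
left  = feat[:, [l for l, _ in pairs], :]
right = feat[:, [r for _, r in pairs], :]
asym = left - right                          # discriminative L/R differences
print(asym.shape)                            # torch.Size([8, 3, 32])
```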