All content in this area was uploaded by Harco Leslie Hendric Spits Warnars on Jul 07, 2023
The Use of Deep and Machine Learning for Face
Expression Recognition: A Literature Review
Gusti Pangestu*
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Computer Science Department,
School of Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
gusti.pangestu@binus.ac.id
Harco Leslie Hendric Spits Warnars
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
Spits.hendric@binus.ac.id
Ford Lumban Gaol
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
fgaol@binus.edu
Benfano Soewito
Computer Science Department, BINUS
Graduate Program - Doctor of
Computer Science
Bina Nusantara University
Jakarta, Indonesia 11480
bsoewito@binus.edu
Abstract— In the current era, Artificial Intelligence (AI) and
Computer Vision play a major role in people's daily
activities. Face expression has become an interesting topic to
explore. Face expression recognition or detection can be
applied in many areas, such as student focus detection,
intruder detection systems, lie detection, and many more. Given
this usefulness, much research has focused on methods to
detect or recognize face expressions. Several approaches are
used, including Deep Learning, Machine Learning, and
statistical methods. This paper uses a Systematic Literature
Review (SLR) process to identify the best approach or method
for recognizing face expressions. From hundreds of retrieved
papers with similar interest, 6 papers in total fulfil our
requirements based on title, result, and methods. Each selected
paper was then reviewed and compared with the others to find
the best approaches proposed and the datasets used.
I. INTRODUCTION
Advances in technology, especially in computational
power and the availability of sensing techniques and
engineering, are leading us to a new era. These advances have
also made computers more intelligent [1]. As the capabilities
of computers increase, the interaction between humans and
computers keeps improving. Computers have become one of
the most widely used tools in human-related tasks. Facial
expressions are one of the most frequently chosen research
focuses, and a considerable body of research has focused on
reading human facial expressions.
Many products make use of human facial expressions, in
fields ranging from education [2] and health [3] to social
aspects [4][5]. These three fields make the most use of the
results of facial expression analysis.
There are quite a number of methods for detecting facial
expressions. Many researchers use an image processing
approach to detect and recognize facial expressions. One
family of methods is Neural Networks, including
Convolutional Neural Networks (CNN) and conventional
Neural Networks. Because there are many models and choices
of Neural Network architecture, further study is needed on
how successful and accurate each Neural Network model is.
In this paper, we discuss the methods and approaches
used for facial expression recognition, especially those using
Neural Networks. The steps carried out are divided into
several stages, which are discussed in detail in the Method
section. This research also examines the results of several
studies that use Neural Networks to recognize facial
expressions. The final result is a concluding analysis that is
expected to help other researchers determine the most
appropriate architecture for their case study, especially in
detecting facial expressions.
II. METHOD
There are three main steps for reaching the objective:
planning the review, conducting and managing the review,
and reporting and analyzing the review [6].
These three main steps are expanded into the stages
shown in Fig. 1. The stages were adopted from the literature
review procedure proposed by Huda et al. [7].
In this section, each sub-section explains one of the
steps shown in Fig. 1. There are 10 steps in total, starting
from defining the Research Question [8], [9] and concluding
with the result of the selected papers [10]. The process is
arranged sequentially, but in practice it is not strictly
sequential, because specific parts can be repeated to obtain
the best results and conclusion.
A. Defining Research Questions
Defining the Research Question is the most important
step in the research because it guides the methodology,
methods, data, and instruments of the research [11].
In this research, there are 3 Research Questions (RQ)
that represent the needs and goals of this research. The RQs
were defined using the FINERMAPS approach introduced by
[12]. The Research Questions are shown in Fig. 2.
B. Convert RQ to Keywords
After the Research Questions (RQ) were defined, the
next step is to convert the RQs into keywords. The keywords
are needed to find related research in manuscript databases
such as Scopus, Web of Science, etc. In this research,
according to the problem defined in the previous section, the
keywords obtained from the translation of the RQs are shown
in Fig. 3.
C. Define the Inclusion and Exclusion Criteria
The keywords produce a large number of results, not all
of which should be included in this research. Therefore, we
need a mechanism to filter the results down to the data we
need. In this stage, inclusion and exclusion criteria were
applied to obtain deeper knowledge [13]. The inclusion and
exclusion criteria are shown in Table I.
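The criteria in Table I can be read as a predicate over bibliographic records. The following is a minimal sketch of that idea, not the procedure the authors actually ran: the `Paper` record layout, the `TOPIC_TERMS` list, and the title-substring topic check are all assumptions made for illustration, and the sample records are invented.

```python
from dataclasses import dataclass


# Hypothetical record layout; real database exports use different field names.
@dataclass(frozen=True)
class Paper:
    title: str
    year: int
    doc_type: str    # e.g. "journal", "conference", "preprint"
    language: str    # e.g. "English"
    published: bool  # False while the paper is still in review


# Assumed topic terms; the paper only states that the study must match the keywords.
TOPIC_TERMS = ("face expression", "facial expression", "face emotion", "facial emotion")


def is_included(paper: Paper) -> bool:
    """Inclusion criteria from Table I: period, reviewed venue, topic match."""
    in_period = 2018 <= paper.year <= 2022
    reviewed_venue = paper.doc_type in {"journal", "conference"}
    on_topic = any(term in paper.title.lower() for term in TOPIC_TERMS)
    return in_period and reviewed_venue and on_topic


def is_excluded(paper: Paper) -> bool:
    """Exclusion criteria from Table I: not in English, or not yet published."""
    return paper.language.lower() != "english" or not paper.published


def screen(papers):
    """Keep papers that satisfy every inclusion criterion and no exclusion criterion."""
    return [p for p in papers if is_included(p) and not is_excluded(p)]


# Toy records, invented purely for illustration.
candidates = [
    Paper("Facial expression recognition with a CNN", 2021, "journal", "English", True),
    Paper("Face emotion detection on mobile devices", 2016, "conference", "English", True),
    Paper("Facial expression recognition survey", 2022, "preprint", "English", False),
    Paper("Hyperspectral image classification", 2020, "journal", "English", True),
]
kept = screen(candidates)
print([p.title for p in kept])
```

Keeping inclusion and exclusion as two separate functions mirrors the two halves of Table I, so each criterion can be adjusted without touching the other.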
D. Define the Consulted Database
There are many research paper databases, such as
Scopus, Web of Science, IEEE, Springer, Elsevier, etc. These
databases contain a huge number of indexed manuscripts.
Among these databases, this research uses two: Scopus
and Google Scholar. The choice of these databases is based
on the access that we have. Furthermore, the Scopus database
contains many high-quality, curated research papers [14].
However, the inclusion and exclusion criteria stated above
require papers from 2018-2022, and Scopus does not index
new papers as quickly as we expected. Therefore, we use the
Google Scholar database to obtain the latest research papers,
especially those published in 2022.
E. Retrieve the Paper
Retrieving the papers relies on the inclusion and
exclusion criteria. The first retrieval, without inclusion and
exclusion criteria, produces a large number of papers because
the query has no limitation.
The second retrieval result is obtained after applying the
inclusion and exclusion criteria. This result is smaller
because of the limitations. From the second result, we again
choose several papers, called “selected papers”. Several
criteria are taken into consideration, including year, topic,
and result. The selected papers then enter the last step, which
produces the output and answers the Research Questions.
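The retrieval stages described here form a simple funnel. As a rough sketch, the stage counts reported in Section III of this paper can be tallied as follows. The stage names and the `retention` helper are illustrative choices, and the percentages are only indicative because the first stage mixes Scopus and Google Scholar hits.

```python
# Counts as reported in Section III of this paper.
scopus_hits = 1_848
scholar_hits = 2_420_000

stages = [
    ("first retrieval (no criteria)", scopus_hits + scholar_hits),
    ("after inclusion and exclusion criteria", 1_420),
    ("after fast review of title, abstract, conclusion", 63),
    ("selected papers", 6),
]


def retention(stages):
    """Yield (name, count, share kept relative to the previous stage)."""
    previous = None
    for name, count in stages:
        share = None if previous is None else count / previous
        yield name, count, share
        previous = count


for name, count, share in retention(stages):
    kept = "" if share is None else f"  ({share:.2%} of previous stage)"
    print(f"{count:>9,}  {name}{kept}")
```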
F. Produce the Result
In this step, the selected papers are fully reviewed, from
the dataset used and the model, methods, or approaches to the
results. Furthermore, from the reviewed selected papers, the
conclusion and the answers to the Research Questions (RQ)
are obtained. The conclusion determines the best model and
scenario of Deep Learning as an approach for face expression
recognition or detection.

Fig. 1. Steps of Literature Review Process

RQ 1: What is the problem?
RQ 2: What kind of model is being used?
RQ 3: What kind of data is being used?
Fig. 2. The Research Question (RQ)

String 1: Face Expression Recognition Deep Learning
String 2: Face Expression Detection Deep Learning
String 3: Face Emotion Detection Deep Learning
String 4: Face Emotion Recognition Deep Learning
Fig. 3. Keywords generated from RQ

TABLE I. INCLUSION AND EXCLUSION CRITERIA
Inclusion Criteria | Justification
Papers published in 2018-2022 | Using the most recent published papers only
Journal and conference papers only | Avoiding unreviewed research
The study must match the keywords (face expression recognition) | Avoiding research that does not meet the criteria
Exclusion Criteria | Justification
Paper not in English | Papers must be in English, which is the standard international language
Paper is in review or not yet published | To focus on primary research only
III. RESULT AND DISCUSSION
This section discusses the results of retrieving data from
the databases. As mentioned before, we use two databases,
Scopus and Google Scholar.
The first retrieval process produced around 1,848
documents from the Scopus database and around 2,420,000
documents from the Google Scholar database. Therefore, the
total of the first retrieval process is 2,421,848 documents. For
better-quality results, we use the data retrieved from Scopus
for the projection shown in Fig. 4.
According to Fig. 4, research on face expression
detection using Deep Learning has increased significantly
since 2015. The Scopus database also shows that the highest
number of publications on the topic was in 2021, bearing in
mind that 2022 was still in progress.
The next process is to apply the inclusion and exclusion
criteria to the retrieved data. After applying the inclusion
and exclusion criteria, about 1,420 papers remained, with
China being the country that published the most research
papers on face expression detection.
A fast review was also carried out on the 1,420 selected
papers. In this fast review, we screened the papers by reading
the title, abstract, and conclusion. As a result, 63 papers were
found to be quite relevant to our topic. The query used for the
search process in the database was “(
TITLE-ABS-KEY ( face AND expression AND recognition AND deep AND learning )
OR TITLE-ABS-KEY ( face AND expression AND detection AND deep AND learning )
OR TITLE-ABS-KEY ( face AND emotion AND detection AND deep AND learning )
OR TITLE-ABS-KEY ( face AND emotion AND recognition AND deep AND learning ) )
AND ( LIMIT-TO ( PUBYEAR , 2022 ) OR LIMIT-TO ( PUBYEAR , 2021 )
OR LIMIT-TO ( PUBYEAR , 2020 ) OR LIMIT-TO ( PUBYEAR , 2019 )
OR LIMIT-TO ( PUBYEAR , 2018 ) )
AND ( LIMIT-TO ( DOCTYPE , "cp" ) OR LIMIT-TO ( DOCTYPE , "ar" ) )
AND ( LIMIT-TO ( SUBJAREA , "COMP" ) OR LIMIT-TO ( SUBJAREA , "ENGI" ) )”.
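The query quoted above follows a regular pattern: one TITLE-ABS-KEY clause per keyword string from Fig. 3, followed by LIMIT-TO filters for year, document type, and subject area. A small sketch of how such a string could be assembled is given below. The helper names (`title_abs_key`, `limit_to`, `build_query`) are hypothetical, and the output only mirrors the syntax shown in this paper; it has not been checked against the Scopus search documentation.

```python
# Keyword strings as listed in Fig. 3.
KEYWORD_STRINGS = [
    "Face Expression Recognition Deep Learning",
    "Face Expression Detection Deep Learning",
    "Face Emotion Detection Deep Learning",
    "Face Emotion Recognition Deep Learning",
]


def title_abs_key(phrase):
    """Turn one keyword string into a TITLE-ABS-KEY clause with AND-joined terms."""
    terms = " AND ".join(word.lower() for word in phrase.split())
    return f"TITLE-ABS-KEY ( {terms} )"


def limit_to(field, values, quote=False):
    """Build an OR-joined group of LIMIT-TO filters for a single field."""
    parts = []
    for value in values:
        rendered = f'"{value}"' if quote else str(value)
        parts.append(f"LIMIT-TO ( {field} , {rendered} )")
    return "( " + " OR ".join(parts) + " )"


def build_query(phrases, years, doctypes, subject_areas):
    """Combine the keyword clauses and the filters into one query string."""
    keyword_clause = "( " + " OR ".join(title_abs_key(p) for p in phrases) + " )"
    clauses = [
        keyword_clause,
        limit_to("PUBYEAR", years),
        limit_to("DOCTYPE", doctypes, quote=True),
        limit_to("SUBJAREA", subject_areas, quote=True),
    ]
    return " AND ".join(clauses)


query = build_query(KEYWORD_STRINGS, range(2022, 2017, -1), ["cp", "ar"], ["COMP", "ENGI"])
print(query)
```

Generating the string from the keyword list keeps the query and Fig. 3 in step if a keyword string is later added or changed.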
To answer the Research Questions (RQ), we first
applied the inclusion and exclusion criteria to the 63 papers
obtained previously and selected the most closely related
papers (selected papers).
From the 63 papers, 6 papers meet our criteria and
present high-quality research. For these 6 papers, we
analyzed the methods used, the accuracy, and the datasets.
The results of this analysis are used to answer the following
RQs:
A. RQ 1, What is the problem?
This research paper was written to find the best
approach for detecting face expressions. Most of the 6
selected papers focus on how to obtain the best accuracy with
the model or approach they use. Four papers propose a deep
learning approach to detect face expressions, one paper
proposes a Recurrent Neural Network (RNN) approach, and
one paper proposes the use of Machine Learning, as shown in
Table II and Fig. 5.
TABLE III. METHODS AND MODEL USED BY SELECTED PAPER
Methods or Model | Selected Paper
Proposed Model 1 | [18]
Proposed Model 2 | [18]
Efficient Net | [22]
VGG-F | [23]
VGG Face | [23]
ResNet-50 | [23]
RNN + landmark | [17]
CNN-LCDRC | [19]
Gaussian NB | [24]
DT | [24]
KNN | [24]
SVM | [24]
MLP | [24]
QDA | [24]
RF | [24]
LR | [24]
TABLE II. APPROACHES FOR FACE EXPRESSION DETECTION IN THE SELECTED PAPER
Approach | Selected Paper
Deep Learning / CNN | [18], [22], [23], [19]
Recurrent Neural Network (RNN) + Facial landmark | [17]
Machine Learning | [24]
Fig. 4. Timeline of the Face Expression research
Fig. 5. Comparison of Face Expression Detection Methods (pie chart titled “Face Expression Detection Approaches”; legend: CNN, RNN, Machine Learning; slice labels: 67%, 16%, 17%)
Based on a review of the results and approach of each
paper, the answer to RQ 1 (what is the problem?) is that the
problem is to determine the best approach among those used
previously by other researchers.
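The shares plotted in Fig. 5 follow from the paper counts in Table II. As a small worked check, the sketch below recomputes them; the dictionary is transcribed from Table II, while the function name `approach_shares` is an illustrative choice. The exact shares are 66.7%, 16.7%, and 16.7%, so the 67%, 16%, and 17% labels in Fig. 5 are presumably rounded so that they sum to 100%.

```python
from collections import Counter

# Approach per selected paper, transcribed from Table II.
APPROACH_BY_PAPER = {
    "[18]": "Deep Learning / CNN",
    "[22]": "Deep Learning / CNN",
    "[23]": "Deep Learning / CNN",
    "[19]": "Deep Learning / CNN",
    "[17]": "RNN + facial landmark",
    "[24]": "Machine Learning",
}


def approach_shares(approach_by_paper):
    """Return the percentage of selected papers that use each approach."""
    counts = Counter(approach_by_paper.values())
    total = sum(counts.values())
    return {approach: 100 * n / total for approach, n in counts.items()}


shares = approach_shares(APPROACH_BY_PAPER)
for approach, pct in shares.items():
    print(f"{approach}: {pct:.1f}%")
```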
B. RQ 2, What kind of model is being used?
According to Table II, the 6 selected papers use 16
models and methods in total (shown in Table III). Each
method has its own accuracy measurement. The
limitation of this research is only
Table III shows the methods or approaches used by the
selected papers. As shown in Table III, Deep Learning is the
most used approach for face expression detection. However,
many different Deep Learning or CNN models are used.
Machine Learning methods such as SVM also produce
significant results [15][16].
Based on an in-depth review of Table II and Table III,
each method or approach has its own result for face
expression recognition. The accuracy differs from one
dataset to another. Hence, the accuracies were averaged per
method to obtain the overall results shown in Fig. 6.
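Averaging per method, as described above, amounts to grouping the reported accuracies by method and taking the mean of each group. The sketch below illustrates the computation only: the method names, dataset names, and accuracy values are placeholders invented for the example and are not results from the selected papers.

```python
from collections import defaultdict
from statistics import mean

# Placeholder (method, dataset, accuracy) triples, invented for illustration.
results = [
    ("method_a", "dataset_1", 0.60),
    ("method_a", "dataset_2", 0.70),
    ("method_b", "dataset_1", 0.90),
    ("method_b", "dataset_3", 0.80),
    ("method_c", "dataset_2", 0.75),
]


def average_accuracy_per_method(rows):
    """Group accuracies by method and return the mean accuracy of each method."""
    by_method = defaultdict(list)
    for method, _dataset, accuracy in rows:
        by_method[method].append(accuracy)
    return {method: mean(values) for method, values in by_method.items()}


averages = average_accuracy_per_method(results)
for method, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{method}: {avg:.2%}")
```

An unweighted mean of this kind treats every dataset as equally difficult, so a method evaluated only on easier datasets can rank higher than one evaluated on harder ones.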
According to Fig. 6, the highest accuracy was obtained
by the RNN (Recurrent Neural Network) method [17]. In
contrast, the CNN approach proposed by Rusia and Singh
[18] produced a low result, with an average accuracy of
around 63.51%. The highest accuracy obtained by a CNN
approach in the 6 research papers was 99%, using the
customized CNN model proposed by Hosgurmath et al. [19].
C. RQ 3, What kind of data is being used?
Machine Learning and its developments are mostly used
for classification [20]. Classification is a challenging step,
especially when little data is available. The dataset is a
significant aspect of the classification process [21]. Face
expression recognition uses classification to determine the
expression on the face. Therefore, the dataset plays a
significant role in face expression detection.
Across the 6 studies we reviewed, a total of 11 datasets
were used in the experiments. Each dataset also produces its
own result and accuracy. The details of the datasets and their
accuracies are shown in Fig. 7.
Referring to Fig. 7, the highest accuracy of face
expression recognition was obtained on the YALE and ORL
datasets, followed by the CK+ dataset.
Fig. 6. Average accuracy comparison for each method (bar chart titled “Methods Accuracy Comparison”; vertical axis from 0.00% to 100.00%; methods: CNN (customized), SVM (customized), RNN (customized), MLP (customized), KNN (customized))
Fig. 7. Average accuracy produced using each dataset (chart titled “Accuracies on Every Dataset”; data labels: 66.93%, 63.55%, 64.74%, 65.10%, 55.43%, 95.75%, 91.40%, 85.50%, 99%, 99%, 91%)
IV. CONCLUSION
Based on an in-depth review of six research papers on
face expression recognition, several conclusions can be
drawn. First, based on Fig. 4, face expression recognition has
been growing in popularity since 2015, driven by the rise of
Artificial Intelligence (AI). Second, face expression
recognition achieves its highest accuracy with several
approaches, especially Support Vector Machine (SVM) and
Deep Learning. The accuracy is also influenced by the
dataset used. Of the datasets commonly used in the research,
the highest accuracy of face expression recognition or
detection is obtained on YALE and ORL, followed by CK+.
Indeed, the combination of method and dataset plays a
major role. Therefore, in the future, it will be very
challenging to determine the best approaches and methods
across combinations of datasets, considering that results
differ from one dataset to another.
ACKNOWLEDGMENT
This research paper was supported by the Doctor of
Computer Science program, Bina Nusantara University,
Indonesia. This research was carried out to fulfill the
requirements of the Software Matrix course.
REFERENCES
[1] D. G. Ganakwar, “A Case Study of various Face
Detection Methods,” International Journal for
Research in Applied Science and Engineering
Technology, vol. 7, no. 11, pp. 496–500, 2019, doi:
10.22214/ijraset.2019.11080.
[2] A. R. Dores, F. Barbosa, C. Queirós, I. P. Carvalho,
and M. D. Griffiths, “Recognizing emotions through
facial expressions: A largescale experimental study,”
International Journal of Environmental Research
and Public Health, vol. 17, no. 20, pp. 1–13, Oct.
2020, doi: 10.3390/ijerph17207420.
[3] Y. S. Lee and W. H. Park, “Diagnosis of Depressive
Disorder Model on Facial Expression Based on Fast
R-CNN,” Diagnostics, vol. 12, no. 2, Feb. 2022, doi:
10.3390/diagnostics12020317.
[4] “Supplemental Material for Facial Expression
Predictions as Drivers of Social Perception,” Journal
of Personality and Social Psychology, 2018, doi:
10.1037/pspa0000108.supp.
[5] S. Jia, S. Wang, C. Hu, P. J. Webster, and X. Li,
“Detection of Genuine and Posed Facial Expressions
of Emotion: Databases and Methods,” Frontiers in
Psychology, vol. 11. Frontiers Media S.A., Jan. 15,
2021. doi: 10.3389/fpsyg.2020.580287.
[6] B. Kitchenham, “Procedures for Performing
Systematic Reviews,” 2004.
[7] C. Huda, A. Ramadhan, A. Trisetyarso, E.
Abdurachman, and Y. Heryadi, “Smart Tourism
Recommendation Model: A Systematic Literature
Review.” [Online]. Available: www.ijacsa.thesai.org
[8] J. Agee, “Developing qualitative research questions:
A reflective process,” International Journal of
Qualitative Studies in Education, vol. 22, no. 4, pp.
431–447, Jul. 2009, doi:
10.1080/09518390902736512.
[9] “CONTINUING MEDICAL EDUCATION
FORMATION MÉDICALE CONTINUE
PRACTICAL TIPS FOR SURGICAL
RESEARCH,” 2010.
[10] H. Alkharusi, “Literature review on achievement
goals and classroom goal structure: implications for
future research,” Electronic Journal of Research in
Education Psychology, vol. 8, no. 22, pp. 1363–1386,
Nov. 2017, doi: 10.25115/ejrep.v8i22.1425.
[11] O. Doody and M. E. Bailey, “Setting a research
question, aim and objective,” 2016.
[12] S. K. Ratan, T. Anand, and J. Ratan, “Formulation of
Research Question - Stepwise Approach,” J Indian
Assoc Pediatr Surg, vol. 24, no. 1, pp. 15–20, 2019,
doi: 10.4103/jiaps.JIAPS_76_18.
[13] C. M. Patino and J. C. Ferreira, “Inclusion and
exclusion criteria in research studies: Definitions and
why they matter,” Jornal Brasileiro de Pneumologia,
vol. 44, no. 2. Sociedade Brasileira de Pneumologia
e Tisiologia, p. 84, Mar. 01, 2018. doi:
10.1590/s1806-37562018000000088.
[14] R. Pranckutė, “Web of science (Wos) and scopus:
The titans of bibliographic information in today’s
academic world,” Publications, vol. 9, no. 1. MDPI
AG, Mar. 01, 2021. doi:
10.3390/publications9010012.
[15] H. Hasan, H. Z. M. Shafri, and M. Habshi, “A
Comparison between Support Vector Machine
(SVM) and Convolutional Neural Network (CNN)
Models for Hyperspectral Image Classification,” in
IOP Conference Series: Earth and Environmental
Science, Nov. 2019, vol. 357, no. 1. doi:
10.1088/1755-1315/357/1/012035.
[16] S. Chen and C. Liu, “Eye detection using
discriminatory Haar features and a new efficient
SVM,” Image and Vision Computing, vol. 33, pp.
68–77, 2015, doi: 10.1016/j.imavis.2014.10.007.
[17] S. A. Rizwan, Y. Y. Ghadi, A. Jalal, and K. Kim,
“Automated Facial Expression Recognition and Age
Estimation Using Deep Learning,” Computers,
Materials and Continua, vol. 71, no. 2, pp. 5235–5252,
2022, doi: 10.32604/cmc.2022.023328.
[18] M. K. Rusia and D. K. Singh, “An efficient CNN
approach for facial expression recognition with some
measures of overfitting,” International Journal of
Information Technology (Singapore), vol. 13, no. 6,
pp. 2419–2430, Dec. 2021, doi:
10.1007/s41870-021-00803-x.
[19] S. Hosgurmath, V. V. Mallappa, N. B. Patil, and V.
Petli, “A face recognition system using convolutional
feature extraction with linear collaborative
discriminant regression classification,” International
Journal of Electrical and Computer Engineering,
vol. 12, no. 2, pp. 1468–1476, Apr. 2022, doi:
10.11591/ijece.v12i2.pp1468-1476.
[20] S. B. Kotsiantis, “Supervised Machine Learning: A
Review of Classification Techniques,” 2007.
[21] A. Althnian et al., “Impact of dataset size on
classification performance: An empirical evaluation
in the medical domain,” Applied Sciences
(Switzerland), vol. 11, no. 2, pp. 1–18, Jan. 2021, doi:
10.3390/app11020796.
[22] N. Kumari and R. Bhatia, “Efficient facial emotion
recognition model using deep convolutional neural
network and modified joint trilateral filter,” Soft
Computing, 2022, doi: 10.1007/s00500-022-06804-7.
[23] M. I. Georgescu, G. E. Duţǎ, and R. T. Ionescu,
“Teacher–student training and triplet loss to reduce
the effect of drastic face occlusion: Application to
emotion recognition, gender identification and age
estimation,” in Machine Vision and Applications,
Jan. 2022, vol. 33, no. 1. doi:
10.1007/s00138-021-01270-x.
[24] A. I. Siam, N. F. Soliman, A. D. Algarni, F. E. Abd
El-Samie, and A. Sedik, “Deploying Machine
Learning Techniques for Human Emotion
Detection,” Computational Intelligence and
Neuroscience, vol. 2022, 2022, doi:
10.1155/2022/8032673.